facet.selection.LearnerSelector

class facet.selection.LearnerSelector(searcher_type, parameter_space, cv=None, scoring=None, n_jobs=None, shared_memory=None, pre_dispatch=None, verbose=None, **searcher_params)

Select the best model obtained by fitting an estimator using different choices of hyperparameters from one or more ParameterSpace objects.

Bases

ParallelizableMixin, FittableMixin[Sample]

Generic types

~T_EstimatorDF(bound= EstimatorDF), ~T_SearchCV(bound= BaseSearchCV)

Metaclasses

ABCMeta

Parameters
  • searcher_type (Callable[..., TypeVar(T_SearchCV, bound= BaseSearchCV)]) – a cross-validation searcher class, or any other callable that instantiates a cross-validation searcher

  • parameter_space (Union[ParameterSpace[TypeVar(T_EstimatorDF, bound= EstimatorDF)], MultiEstimatorParameterSpace[TypeVar(T_EstimatorDF, bound= EstimatorDF)], Iterable[ParameterSpace[TypeVar(T_EstimatorDF, bound= EstimatorDF)]]]) – one or more parameter spaces to search; when passing multiple parameter spaces as an iterable, they are combined into a MultiEstimatorParameterSpace

  • cv (Optional[BaseCrossValidator]) – the cross-validator to be used by the searcher (e.g., RepeatedKFold)

  • scoring (Union[str, Callable[[EstimatorDF, Series, Series], float], None]) – a scoring function (by name, or as a callable) to be used by the searcher (optional; if not specified, the learner’s default scorer is used). If a callable is passed, "score" is used as the name of the scoring function unless the callable defines a __name__ attribute

  • n_jobs (Optional[int]) – number of jobs to use in parallel; if None, use joblib default (default: None)

  • shared_memory (Optional[bool]) – if True, use threads in the parallel runs; if False or None, use multiprocessing (default: None)

  • pre_dispatch (Union[int, str, None]) – number of batches to pre-dispatch; if None, use joblib default (default: None)

  • verbose (Optional[int]) – verbosity level used in the parallel computation; if None, use joblib default (default: None)

  • searcher_params (Any) – additional parameters to be passed on to the searcher; must not include the first two positional arguments of the searcher constructor, which pass the estimator and the search space, since these are populated from the argument parameter_space
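
The naming rule for callable scorers described under scoring can be pictured with plain Python. The following is only an illustrative sketch, not facet's implementation: the helper callable_scorer_name and the scorer mean_abs_error_score are hypothetical names introduced for this example.

```python
from functools import partial


def callable_scorer_name(scoring):
    """Return the name a callable scorer would be reported under (illustrative only)."""
    # use __name__ if the callable defines it, otherwise the generic "score"
    return getattr(scoring, "__name__", "score")


def mean_abs_error_score(estimator, X, y):
    """Hypothetical scorer following the (estimator, X, y) -> float signature."""
    predictions = estimator.predict(X)
    return -sum(abs(p - t) for p, t in zip(predictions, y)) / len(y)


# functools.partial objects do not define __name__, so they fall back to "score"
anonymous_scorer = partial(mean_abs_error_score)

named_label = callable_scorer_name(mean_abs_error_score)
fallback_label = callable_scorer_name(anonymous_scorer)
```

Here named_label is the function's own name, while fallback_label is the generic "score", because the partial object carries no __name__ attribute.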

Method summary

fit

Search this learner selector's parameter space to identify the model with the best-performing hyperparameter combination, using the given sample to fit and score the candidate estimators.

summary_report

Create a summary table of the scores achieved by all learners in the grid search, sorted by default by rank_test_score in ascending order, so that the best-ranked learner comes first.

Attribute summary

best_estimator_

The model which obtained the best ranking score, fitted on the entire sample.

is_fitted

True if this object is fitted, False otherwise.

n_jobs

Number of jobs to use in parallel; if None, use joblib default.

shared_memory

If True, use threads in the parallel runs; if False or None, use multiprocessing.

pre_dispatch

Number of batches to pre-dispatch; if None, use joblib default.

verbose

Verbosity level used in the parallel computation; if None, use joblib default.

searcher_type

A cross-validation searcher class, or any other callable that instantiates a cross-validation searcher, wrapped in a tuple to avoid confusion with methods

parameter_space

The parameter space to search.

cv

The cross-validator to be used by the searcher.

scoring

The scoring function (by name, or as a callable) to be used by the searcher (optional; if not specified, the learner's default scorer is used).

searcher_params

Additional parameters to be passed on to the searcher.

searcher_

The searcher used to fit this LearnerSelector; None if not fitted.

Definitions

fit(sample, groups=None, **fit_params)

Search this learner selector’s parameter space to identify the model with the best-performing hyperparameter combination, using the given sample to fit and score the candidate estimators.

Parameters
  • sample (Sample) – the sample used to fit and score the estimators

  • groups (Union[Series, ndarray[Any, dtype[Any]], Sequence[Any], None]) – group labels for the samples, used while splitting the dataset into train/test sets; passed on to the fit method of the searcher

  • fit_params (Any) – parameters to pass on to the estimator’s fit method

Return type

LearnerSelector

Returns

self
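
Because fit returns self, construction and fitting can be chained in a single expression. The sketch below mimics only that contract with a hypothetical stand-in class, ToySelector; it does not use facet, and the assumption that the fitted state is derived from searcher_ being set is made for illustration only.

```python
from typing import Any, Optional


class ToySelector:
    """Hypothetical stand-in that mimics only the fit / is_fitted contract."""

    def __init__(self) -> None:
        self.searcher_: Optional[Any] = None  # None until fit() has run

    @property
    def is_fitted(self) -> bool:
        # assumption: the fitted state is derived from searcher_ being set
        return self.searcher_ is not None

    def fit(self, sample: Any, groups: Any = None, **fit_params: Any) -> "ToySelector":
        # a real selector would run the cross-validation search here
        self.searcher_ = {"sample": sample, "groups": groups, "fit_params": fit_params}
        return self  # returning self enables method chaining


unfitted = ToySelector()
fitted = ToySelector().fit(sample=[1, 2, 3])
```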

summary_report(*, sort_by=None)

Create a summary table of the scores achieved by all learners in the grid search, sorted by default by rank_test_score in ascending order, so that the best-ranked learner comes first.

Parameters

sort_by (Optional[str]) – name of the column to sort the report by, in ascending order, if the column is present (default: "rank_test_score")

Return type

DataFrame

Returns

the summary report of the grid search as a data frame
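
The sorting rule for sort_by (ascending order, applied only if the column is present) can be sketched without pandas on a list of row dictionaries. This is an illustration under stated assumptions rather than the library's code: the real method returns a DataFrame, and the helper sort_report as well as the "candidate" and "mean_test_score" columns are hypothetical.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def sort_report(rows: List[Row], sort_by: Optional[str] = None) -> List[Row]:
    """Sort rows ascending by the given column; leave the order as is if it is absent."""
    column = "rank_test_score" if sort_by is None else sort_by
    if rows and all(column in row for row in rows):
        return sorted(rows, key=lambda row: row[column])
    return list(rows)  # column not present: keep the original order


results = [
    {"candidate": "B", "rank_test_score": 2, "mean_test_score": 0.81},
    {"candidate": "A", "rank_test_score": 1, "mean_test_score": 0.86},
    {"candidate": "C", "rank_test_score": 3, "mean_test_score": 0.74},
]

by_rank = sort_report(results)  # default column: rank 1 comes first
by_mean = sort_report(results, sort_by="mean_test_score")  # lowest mean first
untouched = sort_report(results, sort_by="no_such_column")  # order unchanged
```

Sorting ranks in ascending order puts the best candidate first, whereas sorting a raw score column in ascending order puts the weakest candidate first.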

property best_estimator_: T_EstimatorDF

The model which obtained the best ranking score, fitted on the entire sample.

Return type

TypeVar(T_EstimatorDF, bound= EstimatorDF)

cv: Optional[sklearn.model_selection.BaseCrossValidator]

The cross-validator to be used by the searcher.

property is_fitted: bool

True if this object is fitted, False otherwise.

Return type

bool

parameter_space: base.BaseParameterSpace[T_EstimatorDF]

The parameter space to search.

scoring: Optional[Union[str, Callable[[sklearndf.EstimatorDF, pandas.Series, pandas.Series], float]]]

The scoring function (by name, or as a callable) to be used by the searcher (optional; if not specified, the learner’s default scorer is used).

searcher_: Optional[T_SearchCV]

The searcher used to fit this LearnerSelector; None if not fitted.

searcher_params: Dict[str, Any]

Additional parameters to be passed on to the searcher.

searcher_type: Tuple[Callable[..., T_SearchCV]]

A cross-validation searcher class, or any other callable that instantiates a cross-validation searcher, wrapped in a tuple to avoid confusion with methods
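
One plausible reading of "to avoid confusion with methods" is Python's descriptor behaviour: a plain function stored on a class is bound as a method when accessed through an instance, whereas a function held inside a tuple is returned unchanged. The sketch below demonstrates that general Python behaviour only; the classes Unwrapped and Wrapped and the factory make_searcher are hypothetical and are not part of facet.

```python
def make_searcher(estimator, space):
    """Hypothetical searcher factory taking an estimator and a search space."""
    return ("searcher", estimator, space)


class Unwrapped:
    # a plain function stored on the class becomes a bound method on instances
    searcher_type = make_searcher


class Wrapped:
    # a tuple is not a descriptor, so the callable inside is returned unchanged
    searcher_type = (make_searcher,)


# instance access binds the instance as first argument, shifting the others
shifted = Unwrapped().searcher_type("space-only")

# unpacking the one-element tuple recovers the original callable
(factory,) = Wrapped().searcher_type
intact = factory("estimator", "space")
```

In the unwrapped case the instance itself is silently passed in place of the estimator; the tuple keeps the callable's signature intact.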