sklearndf.transformation.extra.LeshyDF#

class sklearndf.transformation.extra.LeshyDF(estimator, n_estimators=1000, perc=90, alpha=0.05, importance='shap', two_step=True, max_iter=100, random_state=None, verbose=0, keep_weak=False)#

This is an improved version of BorutaPy, which is itself an improved Python implementation of the Boruta R package. Boruta is an all-relevant feature selection method, whereas most other methods are minimal-optimal: it tries to find all features that carry information usable for prediction, rather than a possibly compact subset of features on which some estimator attains minimal error. Why bother with all-relevant feature selection? When you want to understand the phenomenon that generated your data, you should care about all the factors that contribute to it, not just the most obvious signs of it in the context of your methodology (a minimal-optimal set of features by definition depends on your choice of estimator).
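The idea behind Boruta-style selection can be sketched with shadow features: shuffled copies of each column that keep its distribution but lose any link to the target. The following is a deliberately simplified, standard-library-only illustration and not the Leshy implementation. It uses absolute Pearson correlation as a stand-in for model-based importances, runs a single round instead of repeated rounds with a statistical test on the hit counts, and all names (`pearson`, `shadow_round`, the toy columns) are hypothetical.

```python
import random


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def shadow_round(features, target, rng):
    """Run one simplified Boruta-style round and return a hit flag per feature.

    Importance is |correlation with the target|, an assumption made for
    brevity; real implementations use importances from a fitted model.
    """
    # Shadow features: shuffled copies with no real relation to the target.
    shadows = []
    for column in features.values():
        shuffled = list(column)
        rng.shuffle(shuffled)
        shadows.append(shuffled)

    # The best-scoring shadow sets the bar that real features must beat.
    threshold = max(abs(pearson(shadow, target)) for shadow in shadows)

    return {
        name: abs(pearson(column, target)) > threshold
        for name, column in features.items()
    }


# Toy data: "signal" drives the target strongly, "weak" only slightly,
# and "noise" not at all.
rng = random.Random(0)
n = 200
signal = [rng.gauss(0, 1) for _ in range(n)]
weak = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
target = [2.0 * s + 0.5 * w + rng.gauss(0, 0.1) for s, w in zip(signal, weak)]

features = {"signal": signal, "weak": weak, "noise": noise}
hits = shadow_round(features, target, rng)
print(hits)
```

An all-relevant method aims to retain every feature that consistently beats its shadows, including weakly informative ones, whereas a minimal-optimal method may be satisfied with the strongest predictor alone.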

Note

This class is a wrapper around class arfs.feature_selection.allrelevant.Leshy. It provides enhanced support for pandas data frames, and otherwise delegates all attribute access and method calls to an associated Leshy instance.

Bases:: ARFSWrapperDF[Leshy]
Metaclasses:: EstimatorWrapperDFMeta, ABCMeta

Method summary

`clone`	Make an unfitted clone of this estimator.
`fit`	Fit this estimator using the given inputs.
`fit_transform`	Fit this transformer using the given inputs, then transform the inputs.
`from_fitted`	Make a new wrapped DF estimator, delegating to a given native estimator that has already been fitted.
`get_metadata_routing`	See `sklearn.utils.get_metadata_routing()`
`get_params`	Get the parameters for this estimator.
`inverse_transform`	Inverse-transform the given inputs.
`plot_importance`	See `arfs.feature_selection.allrelevant.Leshy.plot_importance()`
`select_features`	See `arfs.feature_selection.allrelevant.Leshy.select_features()`
`set_output`	See `sklearn.utils.set_output()`
`set_params`	Set the parameters of this estimator.
`to_expression`	Render this object as an expression.
`transform`	Transform the given inputs.

Attribute summary

`COL_FEATURE`	Name assigned to an `Index` or a `Series` with the names of the features used to fit an `EstimatorDF`.
`COL_FEATURE_ORIGINAL`	Name assigned to a `Series` with the original feature names before transformation.
`feature_names_in_`	The pandas column index with the names of the features used to fit this estimator.
`feature_names_original_`	A pandas series, mapping the output features resulting from the transformation to the original input features.
`feature_names_out_`	A pandas column index with the names of the features produced by this transformer.
`is_fitted`	`True` if this object is fitted, `False` otherwise.
`n_features_in_`	The number of features used to fit this estimator.
`n_outputs_`	The number of outputs used to fit this estimator.
`native_estimator`	The native estimator that this wrapper delegates to.
`output_names_`	The name(s) of the output(s) this estimator was fitted to, or `None` if this estimator was not fitted to any outputs.

Definitions

clone()#

Make an unfitted clone of this estimator.

Return type:: LeshyDF
Returns:: the unfitted clone

fit(X, y=None, **fit_params)#

Fit this estimator using the given inputs.

Parameters:

X (Union[DataFrame, Series]) – input data frame with observations as rows and features as columns
y (Union[Series, DataFrame, None]) – an optional series or data frame with one or more outputs
fit_params (Any) – additional keyword parameters as required by specific estimator implementations

Return type:

LeshyDF

Returns:

self

fit_transform(X, y=None, **fit_params)#

Fit this transformer using the given inputs, then transform the inputs.

Parameters:

X (Union[Series, DataFrame]) – input data frame with observations as rows and features as columns
y (Optional[Series]) – an optional series or data frame with one or more outputs
fit_params (Any) – additional keyword parameters as required by specific transformer implementations

Return type:

DataFrame

Returns:

the transformed inputs

classmethod from_fitted(estimator, features_in, n_outputs)#

Make a new wrapped DF estimator, delegating to a given native estimator that has already been fitted.

Parameters:

estimator (Leshy) – the fitted native estimator to use as the delegate
features_in (Index) – the column names of X used for fitting the estimator
n_outputs (int) – the number of outputs in y used for fitting the estimator

Return type:

LeshyDF

Returns:

the wrapped data frame estimator

get_metadata_routing()#: See sklearn.utils.get_metadata_routing()

get_params(deep=True)#

Get the parameters for this estimator.

Parameters:: deep (bool) – if True, return the parameters for this estimator, and for any sub-estimators contained in this estimator
Return type:: Mapping[str, Any]
Returns:: a mapping of parameter names to their values

inverse_transform(X)#

Inverse-transform the given inputs.

The inputs must have the same features as the inputs used to fit this transformer. The features can be provided in any order since they are identified by their column names.

Parameters:: X (Union[Series, DataFrame]) – input data frame with observations as rows and features as columns
Return type:: DataFrame
Returns:: the reverse-transformed inputs

plot_importance(n_feat_per_inch=5)#: See arfs.feature_selection.allrelevant.Leshy.plot_importance()

select_features(X, y, sample_weight=None)#: See arfs.feature_selection.allrelevant.Leshy.select_features()

set_output(*, transform=None)#: See sklearn.utils.set_output()

set_params(**params)#

Set the parameters of this estimator.

Valid parameter keys can be obtained by calling get_params().

Parameters:: params (Any) – the estimator parameters to set
Return type:: LeshyDF
Returns:: self

to_expression()#

Render this object as an expression.

Return type:: Expression
Returns:: the expression representing this object

transform(X)#

Transform the given inputs.

The inputs must have the same features as the inputs used to fit this transformer. The features can be provided in any order since they are identified by their column names.

Parameters:: X (Union[Series, DataFrame]) – input data frame with observations as rows and features as columns
Return type:: DataFrame
Returns:: the transformed inputs
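Name-based feature matching, as described for `transform`, can be pictured with plain pandas. This is only a sketch of the idea, not sklearndf's internal code: `fitted_features` stands in for the fitted `feature_names_in_`, and `align_by_name` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical feature names recorded at fit time
# (a stand-in for feature_names_in_).
fitted_features = pd.Index(["age", "income", "score"], name="feature")

# New data with the same features, but in a different column order.
X_new = pd.DataFrame(
    {
        "score": [0.1, 0.9],
        "age": [31, 47],
        "income": [52_000, 61_000],
    }
)


def align_by_name(X, features):
    """Reorder the columns of X to the fitted order; fail on missing features."""
    missing = features.difference(X.columns)
    if len(missing) > 0:
        raise ValueError(f"missing features: {list(missing)}")
    return X.loc[:, features]


X_aligned = align_by_name(X_new, fitted_features)
print(list(X_aligned.columns))
```

Because columns are looked up by label, the order in which the caller supplies them does not affect the result.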

COL_FEATURE = 'feature'#

Name assigned to an Index or a Series with the names of the features used to fit an EstimatorDF.

See feature_names_in_ and feature_names_original_.

COL_FEATURE_ORIGINAL = 'feature_original'#

Name assigned to a Series with the original feature names before transformation.

See feature_names_original_.

property feature_names_in_: Index#

The pandas column index with the names of the features used to fit this estimator.

Raises:: AttributeError – this estimator is not fitted

property feature_names_original_: Series#

A pandas series, mapping the output features resulting from the transformation to the original input features.

The index of the resulting series consists of the names of the output features; the corresponding values are the names of the original input features.

Raises:: AttributeError – this transformer is not fitted
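To make the shape of this mapping concrete, the sketch below builds a comparable series by hand with pandas. It assumes that a feature selector keeps the names of the columns it retains, so each output feature maps to the input feature of the same name, and that the index and the series carry the names given by `COL_FEATURE` and `COL_FEATURE_ORIGINAL`. The feature names and the selection are hypothetical.

```python
import pandas as pd

# Hypothetical input features and a hypothetical selection result.
features_in = ["age", "income", "score", "noise"]
selected = ["age", "score"]

# Assumed layout: index = output feature names, values = original feature names.
feature_names_original = pd.Series(
    data=selected,
    index=pd.Index(selected, name="feature"),  # COL_FEATURE
    name="feature_original",  # COL_FEATURE_ORIGINAL
)

# Look up the original input feature behind an output feature.
origin_of_score = feature_names_original["score"]

# List the input features that were dropped by the selection.
dropped = [f for f in features_in if f not in set(feature_names_original)]

print(feature_names_original)
print(dropped)
```

For transformers that derive several output columns from one input column, the same structure would hold several index entries pointing at the same original name.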

property feature_names_out_: Index#

A pandas column index with the names of the features produced by this transformer.

Raises:: AttributeError – this transformer is not fitted

property is_fitted: bool#: True if this object is fitted, False otherwise.

property n_features_in_: int#

The number of features used to fit this estimator.

Raises:: AttributeError – this estimator is not fitted

property n_outputs_: int#

The number of outputs used to fit this estimator.

Raises:: AttributeError – this estimator is not fitted

property native_estimator: Leshy#: The native estimator that this wrapper delegates to.

property output_names_: list[str] | None#

The name(s) of the output(s) this estimator was fitted to, or None if this estimator was not fitted to any outputs.

Raises:: AttributeError – this estimator is not fitted
