sklearndf.transformation.extra.LeshyDF
- class sklearndf.transformation.extra.LeshyDF(estimator, n_estimators=1000, perc=90, alpha=0.05, importance='shap', two_step=True, max_iter=100, random_state=None, verbose=0, keep_weak=False)
An improved version of BorutaPy, which is itself an improved Python implementation of the Boruta R package. Boruta is an all-relevant feature selection method, whereas most other methods are minimal-optimal: it tries to find all features that carry information usable for prediction, rather than a possibly compact subset of features on which some estimator has minimal error. Why bother with all-relevant feature selection? When you try to understand the phenomenon that generated your data, you should care about all factors that contribute to it, not just the most obvious signs of it in the context of your methodology (a minimal-optimal set of features depends, by definition, on your choice of estimator).
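The all-relevant idea can be illustrated with a minimal sketch of Boruta-style shadow features. This is not the Leshy implementation: it assumes absolute Pearson correlation as a stand-in importance score (Leshy uses importances such as SHAP values from the wrapped estimator), it compares against the maximum shadow score rather than a percentile, and it omits the statistical test. The names `pearson_abs` and `count_hits` are hypothetical helpers introduced only for this illustration.

```python
import random


def pearson_abs(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return abs(cov) / (var_x * var_y) ** 0.5


def count_hits(features, target, n_iter=20, seed=0):
    """Count how often each real feature beats the best shadow feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        # Shadow features are shuffled copies of the real columns, so any
        # association with the target is destroyed.
        shadow_scores = []
        for values in features.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_scores.append(pearson_abs(shadow, target))
        threshold = max(shadow_scores)
        # A real feature scores a "hit" when it outperforms every shadow.
        for name, values in features.items():
            if pearson_abs(values, target) > threshold:
                hits[name] += 1
    return hits


rng = random.Random(42)
n = 200
signal = [rng.gauss(0, 1) for _ in range(n)]
weak = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
# The target depends strongly on `signal`, weakly on `weak`, and not on `noise`.
target = [s + 0.3 * w + rng.gauss(0, 0.1) for s, w in zip(signal, weak)]

hits = count_hits({"signal": signal, "weak": weak, "noise": noise}, target)
print(hits)
```

An all-relevant method aims to keep both `signal` and `weak`, since both carry information about the target, whereas a minimal-optimal method might settle for `signal` alone. Features with no real association, such as `noise`, should only beat the shadows by chance.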
Note
This class is a wrapper around class arfs.feature_selection.allrelevant.Leshy. It provides enhanced support for pandas data frames, and otherwise delegates all attribute access and method calls to an associated Leshy instance.
- Bases
ARFSWrapperDF[Leshy]
- Metaclasses
Method summary
- clone: Make an unfitted clone of this estimator.
- fit: Fit this estimator using the given inputs.
- fit_transform: Fit this transformer using the given inputs, then transform the inputs.
- from_fitted: Make a new wrapped DF estimator, delegating to a given native estimator that has already been fitted.
- get_params: Get the parameters for this estimator.
- inverse_transform: Inverse-transform the given inputs.
- plot_importance: See arfs.feature_selection.allrelevant.Leshy.plot_importance()
- select_features: See arfs.feature_selection.allrelevant.Leshy.select_features()
- set_output: See sklearn.utils.set_output()
- set_params: Set the parameters of this estimator.
- to_expression: Render this object as an expression.
- transform: Transform the given inputs.
Attribute summary
- COL_FEATURE: Name assigned to an Index or a Series with the names of the features used to fit an EstimatorDF.
- COL_FEATURE_ORIGINAL: Name assigned to a Series with the original feature names before transformation.
- feature_names_in_: The pandas column index with the names of the features used to fit this estimator.
- feature_names_original_: A pandas series, mapping the output features resulting from the transformation to the original input features.
- feature_names_out_: A pandas column index with the names of the features produced by this transformer.
- is_fitted: True if this object is fitted, False otherwise.
- n_features_in_: The number of features used to fit this estimator.
- n_outputs_: The number of outputs used to fit this estimator.
- native_estimator: The native estimator that this wrapper delegates to.
- output_names_: The name(s) of the output(s) this estimator was fitted to, or None if this estimator was not fitted to any outputs.
Definitions
- fit(X, y=None, **fit_params)
Fit this estimator using the given inputs.
- Parameters
- Return type
- Returns
self
- fit_transform(X, y=None, **fit_params)
Fit this transformer using the given inputs, then transform the inputs.
- Parameters
- Return type
- Returns
the transformed inputs
- classmethod from_fitted(estimator, features_in, n_outputs)
Make a new wrapped DF estimator, delegating to a given native estimator that has already been fitted.
- get_params(deep=True)
Get the parameters for this estimator.
- inverse_transform(X)
Inverse-transform the given inputs.
The inputs must have the same features as the inputs used to fit this transformer. The features can be provided in any order since they are identified by their column names.
- plot_importance(n_feat_per_inch=5)
See arfs.feature_selection.allrelevant.Leshy.plot_importance()
- select_features(X, y, sample_weight=None)
See arfs.feature_selection.allrelevant.Leshy.select_features()
- set_output(*, transform=None)
See sklearn.utils.set_output()
- set_params(**params)
Set the parameters of this estimator.
Valid parameter keys can be obtained by calling get_params().
- to_expression()
Render this object as an expression.
- Return type
- Returns
the expression representing this object
- transform(X)
Transform the given inputs.
The inputs must have the same features as the inputs used to fit this transformer. The features can be provided in any order since they are identified by their column names.
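Identifying features by column name rather than by position can be sketched with plain pandas. This is only an illustration of the idea, not the code sklearndf runs: `align_features` is a hypothetical helper, the column names are made up, and raising a `ValueError` on mismatched columns is an assumption about how such a check might be reported.

```python
import pandas as pd

# Hypothetical feature names recorded at fit time (stands in for feature_names_in_).
feature_names_in = pd.Index(["age", "income", "tenure"])


def align_features(X: pd.DataFrame, expected: pd.Index) -> pd.DataFrame:
    """Return X with its columns in the fitted order, matched by name."""
    missing = expected.difference(X.columns)
    extra = X.columns.difference(expected)
    if len(missing) or len(extra):
        raise ValueError(
            f"feature mismatch: missing={list(missing)}, unexpected={list(extra)}"
        )
    # Label-based selection reorders the columns without touching the values.
    return X.loc[:, expected]


# The same features as at fit time, but supplied in a different column order.
X_shuffled = pd.DataFrame(
    {"tenure": [1, 2], "age": [30, 40], "income": [10.0, 20.0]}
)
X_aligned = align_features(X_shuffled, feature_names_in)
print(list(X_aligned.columns))
```

Because columns are looked up by label, the caller does not need to remember the column order used during fitting; only the set of feature names has to match.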
- property feature_names_in_: pandas.Index
The pandas column index with the names of the features used to fit this estimator.
- Raises
AttributeError – this estimator is not fitted
- Return type
pandas.Index
- property feature_names_original_: pandas.Series
A pandas series, mapping the output features resulting from the transformation to the original input features.
The index of the resulting series consists of the names of the output features; the corresponding values are the names of the original input features.
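The shape of such a mapping can be sketched with plain pandas. The feature names below are hypothetical, and the series is built by hand rather than read from a fitted LeshyDF; the sketch assumes that a pure feature selector maps every retained output feature to itself.

```python
import pandas as pd

# Hypothetical example: a selector kept two of four input columns.
features_in = pd.Index(["age", "income", "tenure", "noise"])
features_out = pd.Index(["age", "tenure"])

# Index: output feature names; values: the original input feature of each output.
# For a pure feature selector, every output maps to itself.
feature_names_original = pd.Series(index=features_out, data=features_out.values)

# Look up the original input feature behind a given output feature.
origin_of_tenure = feature_names_original["tenure"]

# Input features that do not appear among the values were dropped by the selector.
dropped = features_in.difference(feature_names_original.values)
print(list(dropped))
```

For transformers that create several output columns from one input column, the same structure would have several index entries sharing one value, which is why the mapping runs from outputs to inputs.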
- Raises
AttributeError – this transformer is not fitted
- Return type
pandas.Series
- property feature_names_out_: pandas.Index
A pandas column index with the names of the features produced by this transformer.
- Raises
AttributeError – this transformer is not fitted
- Return type
pandas.Index
- property n_features_in_: int
The number of features used to fit this estimator.
- Raises
AttributeError – this estimator is not fitted
- Return type
int
- property n_outputs_: int
The number of outputs used to fit this estimator.
- Raises
AttributeError – this estimator is not fitted
- Return type
int
- property native_estimator: sklearndf.wrapper.T_NativeEstimator
The native estimator that this wrapper delegates to.
- Return type
TypeVar(T_NativeEstimator, bound=BaseEstimator)