facet.inspection.shap.ShapCalculator#

class facet.inspection.shap.ShapCalculator(model, *, explainer_factory, interaction_values, n_jobs=None, shared_memory=None, pre_dispatch=None, verbose=None)[source]#

Base class for all SHAP calculators.

A SHAP calculator uses the shap package to calculate SHAP tensors for all observations in a given sample of feature values, then consolidates and aggregates results in a data frame.

Bases:

ParallelizableMixin, FittableMixin[DataFrame]

Generic types:

~T_Model

Metaclasses:

ABCMeta

Parameters:
  • model (TypeVar(T_Model)) – the model for which to calculate SHAP values

  • explainer_factory (ExplainerFactory[TypeVar(T_Model)]) – the explainer factory used to create the SHAP explainer for this calculator

  • interaction_values (bool) – if True, calculate SHAP interaction values, otherwise calculate SHAP values

  • n_jobs (Optional[int]) – number of jobs to use in parallel; if None, use joblib default (default: None)

  • shared_memory (Optional[bool]) – if True, use threads in the parallel runs; if False or None, use multiprocessing (default: None)

  • pre_dispatch (Union[str, int, None]) – number of batches to pre-dispatch; if None, use joblib default (default: None)

  • verbose (Optional[int]) – verbosity level used in the parallel computation; if None, use joblib default (default: None)
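
The four parallelization parameters all fall back to the joblib default when left as None. A minimal sketch of that convention follows; it is not the library's implementation. The helper name parallel_kwargs is hypothetical, and mapping shared_memory=True to joblib's require="sharedmem" constraint is an assumption.

```python
from typing import Any, Dict, Optional, Union


def parallel_kwargs(
    n_jobs: Optional[int] = None,
    shared_memory: Optional[bool] = None,
    pre_dispatch: Union[str, int, None] = None,
    verbose: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for joblib.Parallel.

    Arguments left as None are omitted, so joblib applies its own defaults.
    """
    kwargs: Dict[str, Any] = {}
    if n_jobs is not None:
        kwargs["n_jobs"] = n_jobs
    if shared_memory:
        # Assumption: thread-based runs are requested through joblib's
        # `require="sharedmem"` constraint.
        kwargs["require"] = "sharedmem"
    if pre_dispatch is not None:
        kwargs["pre_dispatch"] = pre_dispatch
    if verbose is not None:
        kwargs["verbose"] = verbose
    return kwargs


default_kwargs = parallel_kwargs()
threaded_kwargs = parallel_kwargs(n_jobs=4, shared_memory=True, verbose=10)
```

With every argument left as None the helper returns an empty dictionary, which leaves all settings to joblib.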

Method summary

fit

Calculate the SHAP values.

validate_features

Check that the given feature matrix is valid for this calculator.

Attribute summary

IDX_FEATURE

Name for the feature index (= column index) of the resulting SHAP data frame.

MULTI_OUTPUT_INDEX_NAME

Name of the index that is used to identify multiple outputs for which SHAP values are calculated.

input_names

The names of the inputs explained by this SHAP calculator, or None if no names are defined.

is_fitted

[see superclass]

main_effects

The main effects per observation and feature (i.e., the diagonals of the interaction matrices), with shape \((n_\mathrm{observations}, n_\mathrm{outputs} \cdot n_\mathrm{features})\).

output_names

The names of the outputs explained by this SHAP calculator.

shap_interaction_values

The SHAP interaction values per observation and feature pair, with shape \((n_\mathrm{observations} \cdot n_\mathrm{features}, n_\mathrm{outputs} \cdot n_\mathrm{features})\).

shap_values

The SHAP values per observation and feature, with shape \((n_\mathrm{observations}, n_\mathrm{outputs} \cdot n_\mathrm{features})\).

n_jobs

Number of jobs to use in parallel; if None, use joblib default.

shared_memory

If True, use threads in the parallel runs; if False or None, use multiprocessing.

pre_dispatch

Number of batches to pre-dispatch; if None, use joblib default.

verbose

Verbosity level used in the parallel computation; if None, use joblib default.

model

The model for which to calculate SHAP values.

explainer_factory

The explainer factory used to create the SHAP explainer for this calculator.

shap_

The SHAP values for all observations this calculator has been fitted to.

feature_index_

The names of the features for which SHAP values were calculated.

Definitions

fit(__X, **fit_params)[source]#

Calculate the SHAP values.

Parameters:
  • __X (DataFrame) – the observations for which to calculate SHAP values

  • fit_params (Any) – additional fit parameters (unused)

Return type:

ShapCalculator

Returns:

self

Raises:

ValueError – if the observations are not a valid feature matrix for this calculator

validate_features(features)[source]#

Check that the given feature matrix is valid for this calculator.

Parameters:

features (DataFrame) – the feature matrix to validate

Raises:

ValueError – if the feature matrix is not compatible with this calculator

Return type:

None
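
The contract described here (return None on success, raise ValueError on an incompatible feature matrix) can be illustrated with a stand-in check. This page does not state the actual validation rules, so the function check_features and its column-name comparison are hypothetical; pandas is assumed to be installed.

```python
from typing import List

import pandas as pd


def check_features(features: pd.DataFrame, expected_names: List[str]) -> None:
    """Hypothetical check: raise ValueError if the columns do not match."""
    actual = list(features.columns)
    if actual != list(expected_names):
        raise ValueError(
            f"expected feature columns {list(expected_names)}, got {actual}"
        )


X = pd.DataFrame({"age": [31, 45], "income": [52.0, 61.5]})

# A matching feature matrix passes silently.
check_features(X, ["age", "income"])

# A feature matrix with a missing column is rejected.
try:
    check_features(X[["income"]], ["age", "income"])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```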

IDX_FEATURE = 'feature'#

Name for the feature index (= column index) of the resulting SHAP data frame.

MULTI_OUTPUT_INDEX_NAME = 'output'#

Name of the index that is used to identify multiple outputs for which SHAP values are calculated. To be overridden by subclasses.

explainer_factory: ExplainerFactory[TypeVar(T_Model)]#

The explainer factory used to create the SHAP explainer for this calculator.

feature_index_: Optional[Index]#

The names of the features for which SHAP values were calculated.

abstract property input_names: list[str] | None#

The names of the inputs explained by this SHAP calculator, or None if no names are defined.

property is_fitted: bool#

[see superclass]

property main_effects: DataFrame#

The main effects per observation and feature (i.e., the diagonals of the interaction matrices), with shape \((n_\mathrm{observations}, n_\mathrm{outputs} \cdot n_\mathrm{features})\).

Raises:

AttributeError – this SHAP calculator does not support interaction values
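
To make "the diagonals of the interaction matrices" concrete, the sketch below takes the diagonal of each per-observation interaction matrix with NumPy. It assumes a single output and a randomly generated tensor; the array names are illustrative and do not reflect how the property is implemented.

```python
import numpy as np

n_observations, n_features = 3, 2
rng = np.random.default_rng(0)

# Hypothetical interaction tensor for one output: one square
# (n_features x n_features) matrix per observation.
interactions = rng.normal(size=(n_observations, n_features, n_features))

# The main effect of feature i for an observation is entry (i, i) of that
# observation's interaction matrix, so take the diagonal over the last two axes.
main_effects = np.diagonal(interactions, axis1=1, axis2=2)
```

The result has one row per observation and one column per feature. With several outputs, the per-output blocks would sit side by side to give the documented width of \(n_\mathrm{outputs} \cdot n_\mathrm{features}\).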

model: TypeVar(T_Model)#

The model for which to calculate SHAP values.

n_jobs: Optional[int]#

Number of jobs to use in parallel; if None, use joblib default.

abstract property output_names: list[str]#

The names of the outputs explained by this SHAP calculator.

pre_dispatch: Union[str, int, None]#

Number of batches to pre-dispatch; if None, use joblib default.

shap_: Optional[DataFrame]#

The SHAP values for all observations this calculator has been fitted to.

property shap_interaction_values: DataFrame#

The SHAP interaction values per observation and feature pair, with shape \((n_\mathrm{observations} \cdot n_\mathrm{features}, n_\mathrm{outputs} \cdot n_\mathrm{features})\).

Raises:

AttributeError – this SHAP calculator does not support interaction values
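
The documented shape can be reproduced by flattening a four-dimensional interaction tensor into a two-dimensional frame. The sketch below assumes a tensor ordered as (output, observation, feature, feature). The row index name "observation" and the exact index layout are assumptions; "output" and "feature" are the values of MULTI_OUTPUT_INDEX_NAME and IDX_FEATURE given on this page.

```python
import numpy as np
import pandas as pd

n_outputs, n_observations, n_features = 2, 3, 2
features = ["f0", "f1"]
outputs = ["y0", "y1"]

# Hypothetical tensor with axes (output, observation, feature, feature).
tensor = np.arange(
    n_outputs * n_observations * n_features * n_features, dtype=float
).reshape(n_outputs, n_observations, n_features, n_features)

# Reorder to (observation, feature, output, feature), then flatten the first
# two axes into rows and the last two axes into columns.
matrix = tensor.transpose(1, 2, 0, 3).reshape(
    n_observations * n_features, n_outputs * n_features
)

rows = pd.MultiIndex.from_product(
    [range(n_observations), features], names=["observation", "feature"]
)
columns = pd.MultiIndex.from_product(
    [outputs, features], names=["output", "feature"]
)
interaction_frame = pd.DataFrame(matrix, index=rows, columns=columns)
```

Each row then identifies an (observation, feature) pair and each column an (output, feature) pair, so a single cell holds the interaction between two features for one observation and one output.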

property shap_values: DataFrame#

The SHAP values per observation and feature, with shape \((n_\mathrm{observations}, n_\mathrm{outputs} \cdot n_\mathrm{features})\).
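
As an illustration of this shape, the sketch below places per-output SHAP matrices side by side in one data frame. The two-level column index is an assumption about the layout, not a confirmed detail of the library; the level names "output" and "feature" are the values of MULTI_OUTPUT_INDEX_NAME and IDX_FEATURE, and the data is synthetic.

```python
import numpy as np
import pandas as pd

n_observations = 4
features = ["f0", "f1", "f2"]
outputs = ["y0", "y1"]

# One synthetic (n_observations x n_features) matrix per output; the fill
# value encodes the output position so the blocks are easy to tell apart.
per_output = [
    np.full((n_observations, len(features)), float(i))
    for i in range(len(outputs))
]

# Stack the per-output blocks horizontally: n_outputs * n_features columns.
values = np.hstack(per_output)
columns = pd.MultiIndex.from_product(
    [outputs, features], names=["output", "feature"]
)
shap_frame = pd.DataFrame(values, columns=columns)

# Selecting an output on the first column level returns that output's block.
second_output = shap_frame["y1"]
```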

shared_memory: Optional[bool]#

If True, use threads in the parallel runs; if False or None, use multiprocessing.

verbose: Optional[int]#

Verbosity level used in the parallel computation; if None, use joblib default.