facet.data.Sample
- class facet.data.Sample(observations, *, target_name, feature_names=None, weight_name=None)
A collection of observations, comprising features, one or more target variables and optional sample weights.
A Sample object serves to keep features, targets and weights aligned, ensuring a more readable and robust ML workflow. It provides basic methods for accessing features, targets and weights, and for selecting subsets of features and observations.
The underlying data structure is a DataFrame.
Supports len(), returning the number of observations in this sample.
- Parameters
observations (DataFrame) – a table of observational data; each row represents one observation; the names of all used columns must be strings
target_name (Union[str, Iterable[str]]) – the name of the column representing the target variable, or an iterable of names representing multiple targets
feature_names (Optional[Iterable[str]]) – optional iterable of strings naming the columns that represent features; if omitted, all non-target and non-weight columns are considered features
weight_name (Optional[str]) – optional name of a column representing the weight of each observation
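The default for feature_names described above (all non-target and non-weight columns) can be illustrated with plain pandas. This is a minimal sketch of the documented rule, not facet's implementation; the helper infer_feature_names and the column names are hypothetical.

```python
from typing import Iterable, List, Optional, Union

import pandas as pd


def infer_feature_names(
    observations: pd.DataFrame,
    target_name: Union[str, Iterable[str]],
    weight_name: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: list all columns that are neither target nor weight."""
    # normalise a single target name to a one-element list
    targets = [target_name] if isinstance(target_name, str) else list(target_name)
    excluded = set(targets)
    if weight_name is not None:
        excluded.add(weight_name)
    # preserve the original column order
    return [column for column in observations.columns if column not in excluded]


# made-up observations: two features, one target, one weight column
observations = pd.DataFrame(
    {
        "age": [25, 32, 47],
        "income": [40.0, 52.5, 61.0],
        "churn": [0, 1, 0],
        "weight": [1.0, 0.5, 2.0],
    }
)

feature_names = infer_feature_names(
    observations, target_name="churn", weight_name="weight"
)
print(feature_names)  # ['age', 'income']
```

Passing an iterable such as ["churn", "income"] as target_name would exclude both columns from the inferred features.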
Method summary
drop(*, feature_names) – Return a copy of this sample, dropping the features with the given names.
keep(*, feature_names) – Return a new sample which only includes the features with the given names.
subsample(*, loc=None, iloc=None) – Return a new sample with a subset of this sample's observations.
Attribute summary
IDX_FEATURE – Default name for the feature index (= column index) used when returning a features table.
IDX_OBSERVATION – Default name for the observations index (= row index) of the underlying data frame.
IDX_TARGET – Default name for the target series or target index (= column index) used when returning the targets.
feature_names – The column names of all features in this sample.
features – The features for all observations.
index – Row index of all observations in this sample.
target – The target variable(s) for all observations.
target_name – The column name of the target in this sample, or a list of column names if this sample has multiple targets.
weight – A series indicating the weight for each observation; None if no weights are defined.
weight_name – The column name of weights in this sample; None if no weights are defined.
Definitions
- drop(*, feature_names)
Return a copy of this sample, dropping the features with the given names.
- Parameters
feature_names (Union[str, Collection[str]]) – name(s) of the features to be dropped
- Return type
Sample
- Returns
copy of this sample, excluding the features with the given names
- keep(*, feature_names)
Return a new sample which only includes the features with the given names.
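The effect of drop and keep on the features table can be pictured with plain pandas column selection. The sketch below only mirrors the documented behaviour on a bare DataFrame; it does not call facet, and the column names are made up.

```python
import pandas as pd

# made-up features table with three feature columns
features = pd.DataFrame(
    {"age": [25, 32], "income": [40.0, 52.5], "tenure": [3, 7]}
)

# analogue of keep(feature_names=["age", "income"]): retain only the named columns
kept = features.loc[:, ["age", "income"]]

# analogue of drop(feature_names="tenure"): remove the named column
dropped = features.drop(columns=["tenure"])

# keeping two of three features is equivalent to dropping the third
print(kept.equals(dropped))  # True
```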
- subsample(*, loc=None, iloc=None)
Return a new sample with a subset of this sample’s observations.
Select observations either by index (loc) or by integer index (iloc). Exactly one of the two arguments must be provided when calling this method.
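The "exactly one of loc or iloc" rule can be sketched on a plain DataFrame as follows. This is an illustration only, assuming loc and iloc follow the pandas conventions of label-based and position-based selection; the helper select_observations is hypothetical and is not facet's code.

```python
from typing import Any, Optional

import pandas as pd


def select_observations(
    observations: pd.DataFrame,
    *,
    loc: Optional[Any] = None,
    iloc: Optional[Any] = None,
) -> pd.DataFrame:
    """Hypothetical helper: select rows by label (loc) or by position (iloc)."""
    # exactly one of the two selectors must be given
    if (loc is None) == (iloc is None):
        raise ValueError("exactly one of loc or iloc must be provided")
    if loc is not None:
        return observations.loc[loc]
    return observations.iloc[iloc]


# made-up observations with a labelled row index
observations = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "y": [0, 1, 0]},
    index=pd.Index(["a", "b", "c"], name="observation"),
)

by_label = select_observations(observations, loc=["a", "c"])
by_position = select_observations(observations, iloc=[0, 2])
print(by_label.equals(by_position))  # True
```

Calling the helper with both arguments, or with neither, raises a ValueError.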
- IDX_FEATURE = 'feature'
Default name for the feature index (= column index) used when returning a features table.
- IDX_OBSERVATION = 'observation'
Default name for the observations index (= row index) of the underlying data frame.
- IDX_TARGET = 'target'
Default name for the target series or target index (= column index) used when returning the targets.
- property features: pandas.DataFrame
The features for all observations.
- property index: pandas.Index
Row index of all observations in this sample.
- property target: Union[pandas.Series, pandas.DataFrame]
The target variable(s) for all observations.
Represented as a series if there is only a single target, or as a data frame if there are multiple targets.
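The series-versus-data-frame distinction matches how pandas behaves when selecting one column name versus a list of names. The following sketch uses a bare DataFrame with made-up column names and does not call facet.

```python
import pandas as pd

# made-up observations: one feature and two candidate target columns
observations = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "y1": [0, 1, 0], "y2": [5.0, 6.0, 7.0]}
)

# a single target name selects a Series
single_target = observations.loc[:, "y1"]

# a list of target names selects a DataFrame
multiple_targets = observations.loc[:, ["y1", "y2"]]

print(type(single_target).__name__)     # Series
print(type(multiple_targets).__name__)  # DataFrame
```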
- property target_name: Union[str, List[str]]
The column name of the target in this sample, or a list of column names if this sample has multiple targets.
- property weight: Optional[pandas.Series]
A series indicating the weight for each observation; None if no weights are defined.