facet.data.partition.ContinuousRangePartitioner

class facet.data.partition.ContinuousRangePartitioner(max_partitions=None)

Partition numerical values into adjacent intervals of the same length.

The range of the intervals and the interval size are computed from max_partitions, lower_bound, and upper_bound (the latter two are arguments to fit).

Partition boundaries and interval sizes are chosen with interpretability in mind: the interval size is always a power of 10, or 2 or 5 times a power of 10, e.g. 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, and so on.

The intervals also satisfy the following conditions:

lower_bound is within the first interval
upper_bound is within the last interval

For example, with max_partitions = 10, lower_bound = 3.3, and upper_bound = 4.7, the resulting partitioning would be: [3.2, 3.4), [3.4, 3.6), [3.6, 3.8), [3.8, 4.0), [4.0, 4.2), [4.2, 4.4), [4.4, 4.6), [4.6, 4.8]
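As a rough illustration of how such boundaries can be derived, the following sketch tries widths of the form 1, 2, or 5 times a power of 10 in ascending order and returns the first one that covers both bounds with at most max_partitions aligned intervals. The helper name nice_partition_bounds and the search strategy are assumptions made for illustration; they are not taken from the library's implementation.

```python
import math
from fractions import Fraction


def nice_partition_bounds(lower, upper, max_partitions):
    """Hypothetical helper: (lo, hi) pairs for equal-width 'nice' intervals."""
    # Exact rational arithmetic avoids floating-point surprises such as 3.3 / 0.1
    lo, hi = Fraction(str(lower)), Fraction(str(upper))
    if not lo < hi:
        raise ValueError("lower must be less than upper")
    # Start one decade below the naive width and walk upwards
    k = math.floor(math.log10(float(hi - lo) / max_partitions)) - 1
    while True:
        for m in (1, 2, 5):
            width = m * Fraction(10) ** k
            first = math.floor(lo / width)  # index of the interval containing lower
            last = math.ceil(hi / width)  # index just past the interval containing upper
            if last - first <= max_partitions:
                return [
                    (float(i * width), float((i + 1) * width))
                    for i in range(first, last)
                ]
        k += 1


bounds = nice_partition_bounds(3.3, 4.7, max_partitions=10)
print(bounds)
```

For the bounds 3.3 and 4.7 with max_partitions=10, the sketch yields eight intervals of width 0.2 running from 3.2 to 4.8; a width of 0.1 would need 14 intervals and is therefore rejected.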

Bases: RangePartitioner[float64, float]
Metaclasses: ABCMeta
Parameters: max_partitions (Optional[int]) – the maximum number of partitions to generate; must be at least 2 (default: 20)

Method summary

`fit`	Calculate the partitioning for the given observed values.

Attribute summary

`DEFAULT_MAX_PARTITIONS`
`frequencies_`	The count of values allocated to each partition.
`is_categorical`	`False`
`is_fitted`	`True` if this object is fitted, `False` otherwise.
`max_partitions`	The maximum number of partitions to be generated by this partitioner.
`partition_bounds_`	Return the endpoints of the intervals that delineate each partition.
`partition_width_`	The width of each partition.
`partitions_`	The values representing the partitions.

Definitions

fit(values, *, lower_bound=None, upper_bound=None, **fit_params)

Calculate the partitioning for the given observed values.

The lower and upper bounds of the range to be partitioned can be provided as optional arguments. If no bounds are provided, the partitioner automatically chooses the lower and upper outlier thresholds based on the Tukey test, i.e., \([q_1 - 1.5 \cdot \mathit{iqr}, q_3 + 1.5 \cdot \mathit{iqr}]\), where \(q_1\) and \(q_3\) are the first and third quartiles of the observed values and \(\mathit{iqr} = q_3 - q_1\) is the inter-quartile range.

Parameters

values (ndarray[Any, dtype[float64]]) – a sequence of observed values as the empirical basis for calculating the partitions
lower_bound (Union[float64, float, int, None]) – the inclusive lower bound of the elements to partition
upper_bound (Union[float64, float, int, None]) – the inclusive upper bound of the elements to partition
fit_params (Any) – optional fitting parameters

Return type

ContinuousRangePartitioner

Returns

self
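The default bounds mentioned in the description of fit can be illustrated with the standard Tukey fences, computed here with the standard library only. This is a minimal sketch: the helper name tukey_fences is hypothetical, and the quartile interpolation method ("inclusive") is an assumption that may differ from what the partitioner uses internally.

```python
import statistics


def tukey_fences(values):
    """Hypothetical helper: lower and upper Tukey outlier thresholds."""
    # First and third quartiles; "inclusive" interpolates linearly between data points
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # inter-quartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
lower, upper = tukey_fences(values)
print(lower, upper)  # the outlier 100.0 lies above the upper fence
```

Values outside these fences, such as 100.0 in the sample data, would not widen the partitioned range under this scheme.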

property frequencies_: numpy.ndarray[Any, numpy.dtype[numpy.int64]]

The count of values allocated to each partition.

Return type: ndarray[Any, dtype[int64]]

property is_categorical: bool

False

Return type: bool

property is_fitted: bool

True if this object is fitted, False otherwise.

Return type: bool

property max_partitions: int

The maximum number of partitions to be generated by this partitioner.

Return type: int

property partition_bounds_: Sequence[Tuple[T_Values_Scalar, T_Values_Scalar]]

Return the endpoints of the intervals that delineate each partition.

Return type: Sequence[Tuple[TypeVar(T_Values_Scalar, int, float), TypeVar(T_Values_Scalar, int, float)]]
Returns: sequence of tuples (x, y) for every partition, where x is the inclusive lower bound of a partition range, and y is the exclusive upper bound of a partition range
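To make the bound semantics concrete, the sketch below counts values per interval, treating each lower bound as inclusive and each upper bound as exclusive. The helper count_per_partition is hypothetical. Two details are assumptions rather than documented behavior: the upper bound of the final interval is treated as inclusive, and values outside the overall range are ignored.

```python
from bisect import bisect_right


def count_per_partition(values, bounds):
    """Hypothetical helper: count values falling into each (lower, upper) pair."""
    # Collapse adjacent (lower, upper) pairs into one sorted list of edges
    edges = [lo for lo, _ in bounds] + [bounds[-1][1]]
    counts = [0] * len(bounds)
    for v in values:
        if edges[0] <= v <= edges[-1]:
            # bisect_right makes lower bounds inclusive and upper bounds exclusive;
            # min() folds the overall upper bound into the last interval (assumption)
            i = min(bisect_right(edges, v) - 1, len(counts) - 1)
            counts[i] += 1
    return counts


bounds = [(3.2, 3.4), (3.4, 3.6), (3.6, 3.8), (3.8, 4.0),
          (4.0, 4.2), (4.2, 4.4), (4.4, 4.6), (4.6, 4.8)]
counts = count_per_partition([3.3, 3.4, 3.95, 4.8, 5.0], bounds)
print(counts)
```

Here 3.4 is counted in the second interval rather than the first, and 5.0 is not counted at all.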

property partition_width_: T_Values_Scalar

The width of each partition.

Return type: TypeVar(T_Values_Scalar, int, float)

property partitions_: Sequence[T_Values]

The values representing the partitions.

Return type: Sequence[TypeVar(T_Values, bound= generic)]
