facet.data.partition.RangePartitioner

class facet.data.partition.RangePartitioner(max_partitions=None)

Abstract base class of partitioners for numerical ranges.

Bases

Partitioner[~T_Values_Numeric]

Generic types

~T_Values_Numeric(int64, float64), ~T_Values_Scalar(int, float)

Metaclasses

ABCMeta

Parameters

max_partitions (Optional[int]) – the maximum number of partitions to generate; must be at least 2 (default: 20)

Method summary

fit

Calculate the partitioning for the given observed values.

Attribute summary

DEFAULT_MAX_PARTITIONS

frequencies_

The count of values allocated to each partition.

is_categorical

Always False: a range partitioner handles numerical values, not categorical ones.

is_fitted

True if this object is fitted, False otherwise.

max_partitions

The maximum number of partitions to be generated by this partitioner.

partition_bounds_

The endpoints of the intervals that delineate each partition.

partition_width_

The width of each partition.

partitions_

The values representing the partitions.

Definitions

fit(values, *, lower_bound=None, upper_bound=None, **fit_params)

Calculate the partitioning for the given observed values.

The lower and upper bounds of the range to be partitioned can be provided as optional arguments. If no bounds are provided, the partitioner automatically chooses the lower and upper outlier thresholds based on the Tukey test, i.e., \([Q_1 - 1.5 \cdot \mathit{iqr}, Q_3 + 1.5 \cdot \mathit{iqr}]\), where \(Q_1\) and \(Q_3\) are the first and third quartiles of the observed values and \(\mathit{iqr} = Q_3 - Q_1\) is the inter-quartile range.
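As an illustration of the Tukey thresholds mentioned above, the following minimal sketch computes the standard fences, first quartile minus 1.5 times the inter-quartile range and third quartile plus 1.5 times the inter-quartile range, using only the Python standard library. The helper name `tukey_fences` and the choice of quantile interpolation are assumptions made for this example; this is not the partitioner's actual implementation, which may compute quartiles differently.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey outlier thresholds for the given values.

    Hypothetical helper for illustration; not part of the facet API.
    """
    # method="inclusive" interpolates linearly between the observed values
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # inter-quartile range
    return q1 - k * iqr, q3 + k * iqr


observed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
lower, upper = tukey_fences(observed)

# keep only the values inside the thresholds; the outlier 100 is dropped
inliers = [v for v in observed if lower <= v <= upper]
```

Under these assumptions, the single extreme value does not stretch the range to be partitioned, which is the reason for deriving default bounds from outlier thresholds rather than from the minimum and maximum.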

Parameters
  • values (ndarray[Any, dtype[TypeVar(T_Values_Numeric, int64, float64)]]) – a sequence of observed values as the empirical basis for calculating the partitions

  • lower_bound (Union[TypeVar(T_Values_Numeric, int64, float64), float, int, None]) – the inclusive lower bound of the elements to partition

  • upper_bound (Union[TypeVar(T_Values_Numeric, int64, float64), float, int, None]) – the inclusive upper bound of the elements to partition

  • fit_params (Any) – optional fitting parameters

Return type

RangePartitioner

Returns

self

property frequencies_: numpy.ndarray[Any, numpy.dtype[numpy.int64]]

The count of values allocated to each partition.

Return type

ndarray[Any, dtype[int64]]

property is_categorical: bool

Always False: a range partitioner handles numerical values, not categorical ones.

Return type

bool

property is_fitted: bool

True if this object is fitted, False otherwise.

Return type

bool

property max_partitions: int

The maximum number of partitions to be generated by this partitioner.

Return type

int

property partition_bounds_: Sequence[Tuple[T_Values_Scalar, T_Values_Scalar]]

The endpoints of the intervals that delineate each partition.

Return type

Sequence[Tuple[TypeVar(T_Values_Scalar, int, float), TypeVar(T_Values_Scalar, int, float)]]

Returns

sequence of tuples (x, y) for every partition, where x is the inclusive lower bound of a partition range, and y is the exclusive upper bound of a partition range
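The half-open convention described above (inclusive lower bound, exclusive upper bound) can be sketched as follows. The helpers `equal_width_bounds` and `count_per_partition` are hypothetical names introduced for this example and are not part of the facet API; the sketch assumes equal-width partitions and does not reproduce how the partitioner itself derives bounds or frequencies.

```python
def equal_width_bounds(lower, width, n_partitions):
    """Return [(x, y), ...] where x is inclusive and y is exclusive.

    Hypothetical helper for illustration; not part of the facet API.
    """
    return [
        (lower + i * width, lower + (i + 1) * width)
        for i in range(n_partitions)
    ]


def count_per_partition(values, bounds):
    """Count the values that fall into each half-open interval [x, y)."""
    return [sum(1 for v in values if x <= v < y) for x, y in bounds]


bounds = equal_width_bounds(lower=0, width=5, n_partitions=3)

# a value equal to a shared endpoint (here 5 or 10) belongs to the upper
# partition; 15 lies on the last exclusive bound and is counted nowhere
counts = count_per_partition([0, 4, 5, 5, 9, 10, 14, 15], bounds)
```

With half-open intervals, adjacent partitions share an endpoint without overlapping, so each value is counted at most once.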

property partition_width_: T_Values_Scalar

The width of each partition.

Return type

TypeVar(T_Values_Scalar, int, float)

property partitions_: Sequence[T_Values]

The values representing the partitions.

Return type

Sequence[TypeVar(T_Values, bound=generic)]