Meta
- class sklego.meta.ConfusionBalancer(estimator, alpha: float = 0.5, cfm_smooth=0)[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin, sklearn.base.MetaEstimatorMixin
The ConfusionBalancer attempts to give its child estimator a more balanced output by learning from the confusion matrix during training. The idea is that the confusion matrix calculates P(C_i | M_i), where C_i is the actual class and M_i is the class predicted by the underlying model. We use these probabilities to attempt a more balanced prediction by averaging the correction from the confusion matrix with the original probabilities.
\[p(\text{class}_j) = \alpha \, p(\text{model}_j) + (1 - \alpha) \, p(\text{class}_j \mid \text{model}_j) \, p(\text{model}_j)\]
- Parameters
estimator – a scikit-learn compatible classification model that has predict_proba
alpha – a hyperparameter between 0 and 1 that determines how much smoothing to apply
cfm_smooth – a smoothing parameter for the confusion matrix to ensure no zero entries exist
- fit(X, y)[source]
Fit the data.
- Parameters
X – array-like, shape=(n_samples, n_columns) training data.
y – array-like, shape=(n_samples,) target values.
- Returns
Returns an instance of self.
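A minimal usage sketch (not from the original docs; the imbalanced dataset via make_classification is an assumption for illustration):
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklego.meta import ConfusionBalancer
>>> X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
>>> # alpha=0.5 mixes the raw probabilities and the confusion-matrix correction equally
>>> clf = ConfusionBalancer(LogisticRegression(), alpha=0.5, cfm_smooth=1).fit(X, y)
>>> probas = clf.predict_proba(X)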
- class sklego.meta.DecayEstimator(model, decay: float = 0.999, decay_func='exponential')[source]
Bases: sklearn.base.BaseEstimator
Morphs an estimator such that the training weights can be adapted to ensure that points that are far away have less weight. Note that it is up to the user to sort the dataset appropriately. This meta estimator will only work for estimators that have a "sample_weight" argument in their .fit() method.
By default the DecayEstimator uses exponential decay to weight the samples:
\[w_{t-1} = \text{decay} \cdot w_{t}\]
- fit(X, y)[source]
Fit the data after adapting the sample weights.
- Parameters
X – array-like, shape=(n_samples, n_columns) training data.
y – array-like, shape=(n_samples,) target values.
- Returns
Returns an instance of self.
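A minimal sketch, assuming the rows of X and y are already sorted chronologically (oldest first):
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklego.meta import DecayEstimator
>>> X = np.arange(100, dtype=float).reshape(-1, 1)
>>> y = 2 * X.ravel() + np.random.RandomState(0).normal(size=100)
>>> # with decay=0.99, a sample's weight roughly halves every 69 rows back in time
>>> model = DecayEstimator(LinearRegression(), decay=0.99).fit(X, y)
>>> preds = model.predict(X)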
- class sklego.meta.EstimatorTransformer(estimator, predict_func='predict')[source]
Bases: sklearn.base.TransformerMixin, sklearn.base.MetaEstimatorMixin, sklearn.base.BaseEstimator
Allows using an estimator, such as a model, as a transformer in an earlier step of a pipeline.
- Parameters
estimator – An instance of the estimator that should be used for the transformation
predict_func – The function called on the estimator when transforming, e.g. predict or predict_proba
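A minimal sketch (the toy dataset is an assumption) turning a classifier's probabilities into features for a later pipeline step:
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklego.meta import EstimatorTransformer
>>> X, y = make_classification(random_state=0)
>>> tfm = EstimatorTransformer(LogisticRegression(), predict_func="predict_proba")
>>> features = tfm.fit(X, y).transform(X)  # shape (n_samples, n_classes)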
- class sklego.meta.GroupedPredictor(estimator, groups, shrinkage=None, use_global_model=True, check_X=True, **shrinkage_kwargs)[source]
Bases: sklearn.base.BaseEstimator
Construct an estimator per data group. Splits data by the values of one or more grouping columns and fits one estimator per group.
- Parameters
estimator – the model/pipeline to be applied per group
groups – the column(s) of the matrix/dataframe to select as a grouping parameter set
shrinkage –
How to perform shrinkage. Can be None, one of {"constant", "min_n_obs", "relative"}, or a callable:
- None: no shrinkage (default)
- constant: the shrunk prediction for a level is the weighted average of its prediction and its parent's prediction
- min_n_obs: the shrunk prediction is the prediction of the smallest group with at least n observations in it
- relative: each group level is weighted according to its size
- callable: a function that takes a list of group lengths and returns an array of the same size with the weights for each group
use_global_model – With shrinkage: whether to fit a model over the entire input as the first group. Without shrinkage: whether or not to fall back to a general model in case the group parameter is not found during .predict().
check_X – Whether to validate X to be non-empty 2D array of finite values and attempt to cast X to float. If disabled, the model/pipeline is expected to handle e.g. missing, non-numeric, or non-finite values.
**shrinkage_kwargs –
keyword arguments to the shrinkage function
- decision_function(X)[source]
Evaluate the decision function for the samples in X.
- Parameters
X – array-like, shape=(n_samples, n_columns) the samples to evaluate.
- Returns
the decision function of each sample for each class in the model.
- fit(X, y=None)[source]
Fit the model using X, y as training data. Will also learn the groups that exist within the dataset.
- Parameters
X – array-like, shape=(n_samples, n_columns) training data.
y – array-like, shape=(n_samples,) target values.
- Returns
Returns an instance of self.
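A minimal sketch with a hypothetical two-group dataframe; one LinearRegression is fitted per value of the "group" column:
>>> import pandas as pd
>>> from sklearn.linear_model import LinearRegression
>>> from sklego.meta import GroupedPredictor
>>> df = pd.DataFrame({"group": ["a"] * 50 + ["b"] * 50, "x": list(range(50)) * 2})
>>> y = pd.Series([2.0 * v for v in range(50)] + [3.0 * v for v in range(50)])
>>> mod = GroupedPredictor(LinearRegression(), groups=["group"])
>>> preds = mod.fit(df, y).predict(df)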
- class sklego.meta.GroupedTransformer(transformer, groups, use_global_model=True)[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Construct a transformer per data group. Splits data by groups from one or more columns and transforms the remaining columns using the transformers corresponding to the groups.
- Parameters
transformer – the transformer to be applied per group
groups – the column(s) of the matrix/dataframe to select as a grouping parameter set. If None, the transformer will be applied to the entire input without grouping
use_global_model – Whether or not to fall back to a general transformation in case a group is not found during .transform()
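A minimal sketch, assuming a hypothetical dataframe in which "x" should be scaled separately within each group:
>>> import pandas as pd
>>> from sklearn.preprocessing import StandardScaler
>>> from sklego.meta import GroupedTransformer
>>> df = pd.DataFrame({"group": ["a"] * 3 + ["b"] * 3,
...                    "x": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]})
>>> # each group gets its own StandardScaler, i.e. its own mean and standard deviation
>>> tfm = GroupedTransformer(StandardScaler(), groups=["group"])
>>> out = tfm.fit_transform(df)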
- class sklego.meta.OutlierClassifier(model)[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin
Morphs an outlier detection estimator into a classifier: it outputs 1 when an outlier is detected and 0 otherwise. This way you can use familiar classification metrics again and treat an outlier model as, for example, a fraud detector.
- fit(X, y=None)[source]
Fit the underlying outlier detection model using X, y as training data.
- Parameters
X – array-like, shape=(n_samples, n_columns) training data.
y – array-like, shape=(n_samples,) target values.
- Returns
Returns an instance of self.
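A minimal sketch wrapping IsolationForest (choosing IsolationForest is an assumption; any sklearn-compatible outlier detector should work):
>>> import numpy as np
>>> from sklearn.ensemble import IsolationForest
>>> from sklego.meta import OutlierClassifier
>>> X = np.random.RandomState(0).normal(size=(200, 2))
>>> clf = OutlierClassifier(IsolationForest(random_state=0)).fit(X)
>>> labels = clf.predict(X)  # 1 for detected outliers, 0 otherwise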
- class sklego.meta.OutlierRemover(*args, **kwargs)[source]
Bases: sklego.common.TrainOnlyTransformerMixin, sklearn.base.BaseEstimator
Removes outliers (train-time only) using the supplied outlier detection model.
- Parameters
outlier_detector – must implement fit and predict methods
refit – If True, fits the estimator during pipeline.fit().
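A minimal sketch; since the removal is train-time only, fit_transform drops the flagged rows (IsolationForest as the detector is an assumption):
>>> import numpy as np
>>> from sklearn.ensemble import IsolationForest
>>> from sklego.meta import OutlierRemover
>>> X = np.random.RandomState(0).normal(size=(200, 2))
>>> remover = OutlierRemover(outlier_detector=IsolationForest(random_state=0), refit=True)
>>> X_clean = remover.fit_transform(X)  # rows flagged as outliers are dropped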
- class sklego.meta.RegressionOutlierDetector(model, column, lower=2, upper=2, method='sd')[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.OutlierMixin
Morphs a regression model into one that can detect outliers. We will try to predict the specified column in X from the remaining columns.
- fit(X, y=None)[source]
Fit the underlying regression model, using the remaining columns of X to predict the specified column.
- Parameters
X – array-like, shape=(n_samples, n_columns) training data.
y – array-like, shape=(n_samples,) ignored; the target is taken from the specified column of X.
- Returns
Returns an instance of self.
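A minimal sketch, assuming column 1 of X should be predictable from column 0; with method='sd', points whose residual exceeds the lower/upper standard-deviation bounds are flagged:
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklego.meta import RegressionOutlierDetector
>>> rng = np.random.RandomState(0)
>>> x0 = np.linspace(0, 10, 100)
>>> X = np.c_[x0, 2 * x0 + rng.normal(size=100)]
>>> det = RegressionOutlierDetector(LinearRegression(), column=1)
>>> flags = det.fit(X).predict(X)  # -1 flags outliers, 1 inliers (OutlierMixin convention)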
- class sklego.meta.SubjectiveClassifier(estimator, prior, evidence='both')[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin, sklearn.base.MetaEstimatorMixin
Corrects predictions of the inner classifier by taking into account a (subjective) prior distribution of the classes.
This can be useful when there is a difference in class distribution between the training data set and the real world. Using the confusion matrix of the inner classifier and the prior, the posterior probability for a class, given the prediction of the inner classifier, can be computed. The background for this posterior estimation is given in this article: https://lucdemortier.github.io/articles/16/PerformanceMetrics.
Based on the evidence attribute, this meta estimator's predictions are based on a simple weighting of the inner estimator's predict_proba() results, on the posterior probabilities derived from the confusion matrix, or on a combination of the two approaches.
- Parameters
estimator – An sklearn-compatible classifier estimator
prior – A dict of class->frequency representing the prior (a.k.a. subjective real-world) class distribution. The class frequencies should sum to 1.
evidence – A string indicating which evidence should be used to correct the inner estimator's predictions. Should be one of 'predict_proba', 'confusion_matrix', or 'both' (default). If predict_proba, the inner estimator's predict_proba() results are multiplied by the prior distribution. In case of confusion_matrix, the inner estimator's discrete predictions are converted to posterior probabilities using the prior and the inner estimator's confusion matrix (obtained from the train data used in fit()). In case of both (default), the inner estimator's predict_proba() results are multiplied by the posterior probabilities.
- property classes_
- fit(X, y)[source]
Fits the inner estimator based on the data.
Raises a ValueError if the y vector contains classes that are not specified in the prior, or if the prior is not a valid probability distribution (i.e. does not sum to 1).
- Parameters
X – array-like, shape=(n_samples, n_columns) training data.
y – array-like, shape=(n_samples,) target values.
- Returns
Returns an instance of self.
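A minimal sketch with a hypothetical prior stating that class 1 occurs only 10% of the time in the real world:
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklego.meta import SubjectiveClassifier
>>> X, y = make_classification(random_state=0)  # binary task: classes 0 and 1
>>> clf = SubjectiveClassifier(LogisticRegression(), prior={0: 0.9, 1: 0.1})
>>> posterior = clf.fit(X, y).predict_proba(X)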
- class sklego.meta.Thresholder(model, threshold: float, refit=False)[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin
Takes a two-class estimator and moves the decision threshold. This way you might design the algorithm to only accept a certain class if its probability is larger than, say, 90% instead of 50%.
- Parameters
model – the model to threshold
threshold – the actual threshold to use
refit – if True, we will always retrain the model, even if it is already fitted.
If False, we only refit if the original model isn't fitted.
- fit(X, y, sample_weight=None)[source]
Fit the data.
- Parameters
X – array-like, shape=(n_samples, n_columns) training data.
y – array-like, shape=(n_samples,) target values.
sample_weight – array-like, shape=(n_samples,) individual weights for each sample.
- Returns
Returns an instance of self.
- predict(X)[source]
Predict new data.
- Parameters
X – array-like, shape=(n_samples, n_columns) the data to predict.
- Returns
array, shape=(n_samples,) the predicted data
- score(X, y, sample_weight=None)[source]
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters
X – array-like, shape=(n_samples, n_features) test samples.
y – array-like, shape=(n_samples,) or (n_samples, n_outputs) true labels for X.
sample_weight – array-like, shape=(n_samples,), default=None sample weights.
- Returns
score – float, the mean accuracy of self.predict(X) with respect to y.
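A minimal sketch, assuming a binary task where class 1 should only be predicted above 90% probability:
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklego.meta import Thresholder
>>> X, y = make_classification(random_state=0)
>>> # predict class 1 only when p(class 1) > 0.9 instead of the default 0.5
>>> clf = Thresholder(LogisticRegression(), threshold=0.9).fit(X, y)
>>> preds = clf.predict(X)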
- class sklego.meta.ZeroInflatedRegressor(classifier, regressor)[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin
A meta regressor for zero-inflated datasets, i.e. datasets where the target contains a lot of zeroes.
ZeroInflatedRegressor consists of a classifier and a regressor.
The classifier's task is to find out whether the target is zero or not.
The regressor's task is to output a (usually positive) prediction whenever the classifier indicates that there should be a non-zero prediction.
The regressor is only trained on examples where the target is non-zero, which makes it easier for it to focus.
At prediction time, the classifier is first asked if the output should be zero. If yes, output zero. Otherwise, ask the regressor for its prediction and output it.
- Parameters
classifier – Any scikit-learn classifier. A classifier that answers the question "Should the output be zero?".
regressor – Any scikit-learn regressor. A regressor for predicting the target. Its prediction is only used if the classifier says that the output is non-zero.
>>> import numpy as np
>>> from sklearn.ensemble import ExtraTreesClassifier, ExtraTreesRegressor
>>> np.random.seed(0)
>>> X = np.random.randn(10000, 4)
>>> y = ((X[:, 0]>0) & (X[:, 1]>0)) * np.abs(X[:, 2] * X[:, 3]**2)
>>> z = ZeroInflatedRegressor(
...     classifier=ExtraTreesClassifier(random_state=0),
...     regressor=ExtraTreesRegressor(random_state=0)
... )
>>> z.fit(X, y)
ZeroInflatedRegressor(classifier=ExtraTreesClassifier(random_state=0),
                      regressor=ExtraTreesRegressor(random_state=0))
>>> z.predict(X)[:5]
array([4.91483294, 0.        , 0.        , 0.04941909, 0.        ])
- fit(X, y, sample_weight=None)[source]
Fit the model.
- Parameters
X – np.ndarray of shape (n_samples, n_features) the training data.
y – np.ndarray, 1-dimensional, the target values.
sample_weight – Optional[np.ndarray], default=None, individual weights for each sample.
- Returns
ZeroInflatedRegressor – the fitted regressor.
- Raises
ValueError – if classifier is not a classifier or regressor is not a regressor.