Mixture
- class sklego.mixture.BayesianGMMClassifier(n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weight_concentration_prior_type='dirichlet_process', weight_concentration_prior=None, mean_precision_prior=None, mean_prior=None, degrees_of_freedom_prior=None, covariance_prior=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin
- fit(X: numpy.array, y: numpy.array) → sklego.mixture.bayesian_gmm_classifier.BayesianGMMClassifier
Fit the model using X, y as training data.
- Parameters
X – array-like, shape=(n_samples, n_columns), training data.
y – array-like, shape=(n_samples,), training labels.
- Returns
Returns an instance of self.
- class sklego.mixture.BayesianGMMOutlierDetector(threshold=0.99, method='quantile', n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weight_concentration_prior_type='dirichlet_process', weight_concentration_prior=None, mean_precision_prior=None, mean_prior=None, degrees_of_freedom_prior=None, covariance_prior=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)
Bases: sklearn.base.OutlierMixin, sklearn.base.BaseEstimator
The BayesianGMMOutlierDetector trains a Bayesian Gaussian Mixture Model on a dataset X. Once the density is fitted, the likelihood score of each sample shows how likely that sample is under the model. Given a threshold, the detector labels samples whose likelihood score is too low as outliers.
- Parameters
threshold – the limit at which the model starts calling a sample an outlier; must be between 0 and 1 when method="quantile".
method – how the threshold is interpreted; possible values are "quantile" (default) and "stddev".
If you select method="quantile", the threshold value represents the quantile at which to start calling something an outlier.
If you select method="stddev", the threshold value represents the number of standard deviations before calling something an outlier.
The remaining settings are best described in the BayesianGaussianMixture documentation:
https://scikit-learn.org/stable/modules/generated/sklearn.mixture.BayesianGaussianMixture.html
- fit(X: numpy.array, y=None) → sklego.mixture.bayesian_gmm_detector.BayesianGMMOutlierDetector
Fit the model using X as training data.
- Parameters
X – array-like, shape=(n_samples, n_columns), training data.
y – ignored, but kept for pipeline support.
- Returns
Returns an instance of self.
- class sklego.mixture.GMMClassifier(n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)
Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin
- fit(X: numpy.array, y: numpy.array) → sklego.mixture.gmm_classifier.GMMClassifier
Fit the model using X, y as training data.
- Parameters
X – array-like, shape=(n_samples, n_columns), training data.
y – array-like, shape=(n_samples,), training labels.
- Returns
Returns an instance of self.
- class sklego.mixture.GMMOutlierDetector(threshold=0.99, method='quantile', n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)
Bases: sklearn.base.OutlierMixin, sklearn.base.BaseEstimator
The GMMOutlierDetector trains a Gaussian Mixture Model on a dataset X. Once the density is fitted, the likelihood score of each sample shows how likely that sample is under the model. Given a threshold, the detector labels samples whose likelihood score is too low as outliers.
- Parameters
threshold – the limit at which the model starts calling a sample an outlier; must be between 0 and 1 when method="quantile".
method – how the threshold is interpreted; possible values are "quantile" (default) and "stddev".
If you select method="quantile", the threshold value represents the quantile at which to start calling something an outlier.
If you select method="stddev", the threshold value represents the number of standard deviations before calling something an outlier.
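One way to read the quantile rule, sketched with plain NumPy on made-up log-likelihood scores. This illustrates the idea only and is not a copy of sklego's code; the assumption is that with `threshold=0.99` the least likely 1% of the training scores fall below the cutoff.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in log-likelihood scores; in practice these would come from a
# fitted mixture model (hypothetical values, illustration only).
log_likelihood = rng.normal(loc=-3.0, scale=1.0, size=1000)

threshold = 0.99

# Assumed reading: scores below the (1 - threshold) quantile are outliers.
cutoff = np.quantile(log_likelihood, 1 - threshold)
is_outlier = log_likelihood < cutoff
```

Raising the threshold toward 1 lowers the cutoff, so fewer samples are flagged.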
- fit(X: numpy.array, y=None) → sklego.mixture.gmm_outlier_detector.GMMOutlierDetector
Fit the model using X as training data.
- Parameters
X – array-like, shape=(n_samples, n_columns), training data.
y – ignored, but kept for pipeline support.
- Returns
Returns an instance of self.