discrimintools.CPLS

class discrimintools.CPLS(n_components=2, scale=True, classes=None, max_iter=500, tol=1e-10, var_select=False, threshold=1.0, warn_message=True)

Partial Least Squares for Classification (CPLS)

The model fits a partial least squares model for binary classification.

Parameters:
  • n_components (int or None, default = 2) – Number of components to keep. Should be in [1, n_features].

  • scale (bool, default = True) – Whether to scale X and y.

  • classes (None, tuple or list, default = None) – Names of the class levels, in the order in which they should be returned. If None, the classes are the sorted unique values of y.

  • max_iter (int, default = 500) – The maximum number of iterations of the NIPALS method.

  • tol (float, default = 1e-10) – The tolerance used as the convergence criterion in the NIPALS method.

  • var_select (bool, default = False) – Whether to apply feature selection based on variable importance in projection (VIP) for partial least squares regression.

  • threshold (float, default = 1.0) – VIP can be used to select predictor variables when multicollinearity exists among them. Variables with a VIP score greater than ‘threshold’ are considered important for the projection of the PLS regression.

  • warn_message (bool, default = True) – Whether to show warning messages.
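The max_iter and tol parameters above control the NIPALS iterations. As a rough illustration, and not the discrimintools implementation, the sketch below extracts PLS components with a textbook NIPALS loop in NumPy. The function name nipals_pls, the toy data, and the choice to test convergence on the change in the weight vector are all assumptions made for this example.

```python
import numpy as np

def nipals_pls(X, Y, n_components=2, max_iter=500, tol=1e-10):
    """Textbook NIPALS for PLS (illustrative sketch, hypothetical helper)."""
    X = np.asarray(X, dtype=float).copy()
    Y = np.asarray(Y, dtype=float).reshape(len(X), -1).copy()
    n, p = X.shape
    W = np.zeros((p, n_components))   # X weights
    P = np.zeros((p, n_components))   # X loadings
    T = np.zeros((n, n_components))   # X scores
    for k in range(n_components):
        u = Y[:, [0]]                 # start from the first response column
        w_old = np.zeros((p, 1))
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)    # unit-norm weight vector
            t = X @ w                 # scores for this component
            c = Y.T @ t / (t.T @ t)   # regress Y on the scores
            u = Y @ c / (c.T @ c)
            if np.linalg.norm(w - w_old) < tol:
                break                 # weights stopped changing
            w_old = w
        p_k = X.T @ t / (t.T @ t)     # X loadings for this component
        X -= t @ p_k.T                # deflate X
        Y -= t @ c.T                  # deflate Y
        W[:, [k]], P[:, [k]], T[:, [k]] = w, p_k, t
    return W, P, T

# Toy binary problem (made-up data), standardized as with scale=True.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=50) > 0).astype(float)
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
ys = (y - y.mean()) / y.std(ddof=1)
W, P, T = nipals_pls(Xs, ys, n_components=2)
```

With a single response column, as in this toy binary case, the weight vector in this sketch stops changing after the first pass, so max_iter and tol mostly matter when the response has several columns.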

Returns:

  • call_ (NamedTuple) – Call information:

    • Xtot (DataFrame of shape (n_samples, n_columns))

      Input data.

    • X (DataFrame of shape (n_samples, n_features))

      Training data.

    • y (Series of shape (n_samples,))

      Target values. True values for X.

    • target (str)

      Name of target.

    • features (list)

      Names of features seen during fit.

    • classes (list)

      Names of classes.

    • priors (Series of shape (n_classes,))

      Prior probabilities.

    • center (Series of shape (n_features,))

      The average of X.

    • scale (Series of shape (n_features,))

      The standard deviation of X.

    • n_samples (int)

      Number of samples.

    • n_features (int)

      Number of features.

    • n_classes (int)

      Number of classes.

    • max_components (int)

      Maximum number of components.

    • n_components (int)

      Number of components kept.

    • max_iter (int)

      Maximum number of iterations.

    • tol (float)

      The tolerance used as the convergence criterion.

    • threshold (float)

      The threshold for variable importance in projection.

  • classes_ (NamedTuple) – Class information:

    • infos (DataFrame of shape (n_classes, 3))

      Class level information (frequency, proportion, prior probability).

    • coord (DataFrame of shape (n_classes, n_components))

      Class coordinates.

    • eucl (DataFrame of shape (n_classes, n_classes))

      The squared Euclidean distance to origin.

    • gen (DataFrame of shape (n_classes, n_classes))

      The generalized squared distance to origin.

  • coef_ (DataFrame of shape (n_features + 1, 1)) – The coefficients of the partial least squares for classification model.

  • explained_variance_ (DataFrame of shape (n_components, 2)) – The explained variance and the cumulative explained variance.

  • ind_ (NamedTuple) – Individual information:

    • coord (DataFrame of shape (n_samples, n_components))

      The transformed training samples.

    • scores (DataFrame of shape (n_samples,))

      The total scores of individuals.

    • eucl (DataFrame of shape (n_samples, n_classes))

      The squared Euclidean distance to origin.

    • gen (DataFrame of shape (n_samples, n_classes))

      The generalized squared distance to origin.

  • model_ (str, default = ‘cpls’) – The name of the fitted model.

  • var_ (NamedTuple) – Variable information:

    • weights (DataFrame of shape (n_features, n_components))

      The left singular vectors of the cross-covariance matrices of each iteration.

    • loadings (DataFrame of shape (n_features, n_components))

      The loadings of X.

    • rotations (DataFrame of shape (n_features, n_components))

      The projection matrix used to transform X.

  • vip_ (NamedTuple) – Variable importance in projection information:

    • vip (Series of shape (n_features,))

      The variable importance in projection for partial least squares regression.

    • selected (list)

      Selected features.
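To make the vip_ attribute and the threshold parameter more concrete, here is a minimal sketch of the usual VIP formula for a single-response PLS model, followed by threshold-based selection. It is not taken from the discrimintools source: the helper vip_scores, the feature names, and all input values are hypothetical, and the library may compute VIP differently in detail.

```python
import numpy as np

def vip_scores(W, T, c):
    """Variable importance in projection for a single-response PLS fit.

    W : (n_features, n_components) weight vectors
    T : (n_samples, n_components) component scores
    c : (n_components,) regression coefficients of y on each score
    """
    W, T, c = (np.asarray(a, dtype=float) for a in (W, T, c))
    n_features = W.shape[0]
    # Sum of squares of y explained by each component.
    ss = c ** 2 * np.sum(T ** 2, axis=0)
    # Squared weights, normalized so that each component's column sums to one.
    w2 = (W / np.linalg.norm(W, axis=0)) ** 2
    return np.sqrt(n_features * (w2 @ ss) / ss.sum())

# Made-up inputs, for illustration only.
rng = np.random.default_rng(0)
features = ["x1", "x2", "x3"]          # hypothetical feature names
W = np.array([[0.9, 0.1], [0.4, 0.2], [0.1, 0.97]])
T = rng.normal(size=(20, 2))
c = np.array([0.8, 0.2])

threshold = 1.0
vip = vip_scores(W, T, c)
# Keep the features whose VIP score exceeds the threshold.
selected = [f for f, v in zip(features, vip) if v > threshold]
```

Under this definition the squared VIP scores average to one, which is why a threshold of 1.0 keeps the variables that contribute more than average.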

See also

fviz_plsr

Visualize Partial Least Squares Regression.

fviz_plsr_ind

Visualize Partial Least Squares Regression - Graph of individuals.

fviz_plsr_var

Visualize Partial Least Squares Regression - Graph of variables.

fviz_dist

Visualize the distance between barycenters.

summaryCPLS

Print summaries of a Partial Least Squares for Classification model.

summaryDA

Print summaries of a Discriminant Analysis model.

Examples

>>> from discrimintools.datasets import load_dataset
>>> from discrimintools import CPLS
>>> D = load_dataset("breast")  # load training data
>>> y, X = D["Class"], D.drop(columns=["Class"])  # split into target y and features X
>>> clf = CPLS()  # defaults: n_components=2, scale=True
>>> clf.fit(X, y)
CPLS()

__init__(n_components=2, scale=True, classes=None, max_iter=500, tol=1e-10, var_select=False, threshold=1.0, warn_message=True)

Methods

__init__([n_components, scale, classes, ...])

decision_function(X)

Apply the decision function to input data.

eval_predict(X, y[, verbose])

Evaluate the quality of the predictions.

fit(X, y)

Fit the Partial Least Squares for Classification model.

fit_transform(X, y)

Learn and apply the dimension reduction on the train data.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

inverse_transform(X)

Transform data back to its original space.

pred_table(X, y)

Prediction table.

predict(X)

Predict class labels for samples in X.

predict_log_proba(X)

Return the log of the posterior probabilities.

predict_proba(X)

Estimate the posterior probabilities.

score(X, y)

Return the accuracy on the given input data.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

transform(X)

Apply the dimension reduction.
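The prediction-related methods listed above are linked in the way that is usual for classifiers. The sketch below uses only the Python standard library and made-up numbers to show one plausible reading of those links: log posteriors from posteriors, labels from the largest posterior, accuracy, and an observed-versus-predicted table. The class labels, the probabilities, and the exact layout of the table are assumptions for this example, not output of discrimintools.

```python
import math
from collections import Counter

classes = ["negative", "positive"]   # hypothetical class labels
# Made-up posterior probabilities, one row per sample, one column per class.
proba = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
y_true = ["negative", "positive", "positive", "positive"]

# Log of the posterior probabilities (the idea behind predict_log_proba).
log_proba = [[math.log(p) for p in row] for row in proba]

# Predicted label = class with the largest posterior (the idea behind predict).
y_pred = [classes[row.index(max(row))] for row in proba]

# Share of correctly classified samples (the idea behind score).
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Counts of (observed, predicted) pairs (the idea behind a prediction table).
table = Counter(zip(y_true, y_pred))
```

Here three of the four samples are classified correctly, so accuracy is 0.75, and the only off-diagonal cell of the table is the single positive sample predicted as negative.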