discrimintools.STEPDISC
- class discrimintools.STEPDISC(method='forward', alpha=0.01, lambda_init=None, verbose=True)
Stepwise Discriminant Analysis (STEPDISC)
Given a classification variable and several quantitative variables, the STEPDISC class performs a stepwise discriminant analysis to select a subset of the quantitative variables for use in discriminating among the classes. The set of variables that make up each class is assumed to be multivariate normal with a common covariance matrix. The STEPDISC class can use forward selection or backward elimination, and is a useful prelude to further analyses with the CANDISC class or the DISCRIM class.

With STEPDISC, variables are chosen to enter or leave the model according to the significance level of an F test from an analysis of covariance, where the variables already chosen act as covariates and the variable under consideration is the dependent variable. Two selection methods are available: 'forward' and 'backward'.

Forward selection begins with no variables in the model. At each step, STEPDISC enters the variable that contributes most to the discriminatory power of the model as measured by Wilks' lambda, the likelihood ratio criterion. When none of the unselected variables meets the entry criterion, the forward selection process stops.

Backward elimination begins with all variables in the model except those that are linearly dependent on preceding variables. At each step, the variable that contributes least to the discriminatory power of the model as measured by Wilks' lambda is removed. When all remaining variables meet the criterion to stay in the model, the backward elimination process stops.
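The selection criterion described above can be illustrated with a minimal NumPy sketch. It assumes the textbook formulation of stepwise discriminant analysis: Wilks' lambda is det(W) / det(T), with W the pooled within-class scatter matrix and T the total scatter matrix, and the partial F statistic compares lambda before and after a variable enters or leaves. The helper names `wilks_lambda`, `f_to_enter` and `f_to_remove` are hypothetical and are not part of discrimintools; the library's internal computation may differ.

```python
import numpy as np


def wilks_lambda(X, y):
    """Wilks' lambda det(W) / det(T) for the columns of X (1.0 if X has none)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if X.shape[1] == 0:
        return 1.0  # empty model: no discriminatory power
    centered = X - X.mean(axis=0)
    total = centered.T @ centered  # total scatter matrix T
    within = np.zeros_like(total)  # pooled within-class scatter matrix W
    for label in np.unique(y):
        Xk = X[y == label]
        ck = Xk - Xk.mean(axis=0)
        within += ck.T @ ck
    return np.linalg.det(within) / np.linalg.det(total)


def f_to_enter(X, y, selected, candidate):
    """Partial F for adding `candidate` to the p variables in `selected`.

    Degrees of freedom: (K - 1, n - K - p).
    """
    X = np.asarray(X, dtype=float)
    n, K, p = X.shape[0], np.unique(y).size, len(selected)
    before = np.asarray(list(selected), dtype=int)
    after = np.asarray(list(selected) + [candidate], dtype=int)
    lam_before = wilks_lambda(X[:, before], y)
    lam_after = wilks_lambda(X[:, after], y)
    return (n - K - p) / (K - 1) * (lam_before / lam_after - 1.0)


def f_to_remove(X, y, selected, candidate):
    """Partial F for dropping `candidate` from the p variables in `selected`.

    Degrees of freedom: (K - 1, n - K - p + 1).
    """
    X = np.asarray(X, dtype=float)
    n, K, p = X.shape[0], np.unique(y).size, len(selected)
    full = np.asarray(list(selected), dtype=int)
    reduced = np.asarray([j for j in selected if j != candidate], dtype=int)
    lam_full = wilks_lambda(X[:, full], y)
    lam_reduced = wilks_lambda(X[:, reduced], y)
    return (n - K - p + 1) / (K - 1) * (lam_reduced / lam_full - 1.0)


# Toy data (made up for illustration): column 0 separates the two classes,
# column 1 is close to noise.
X = np.array([[1.0, 0.2], [1.2, 0.9], [0.8, 0.5],
              [3.0, 0.4], [3.1, 0.8], [2.9, 0.3]])
y = np.array([0, 0, 0, 1, 1, 1])

# First forward step: compute F-to-enter for every candidate variable.
for j in range(X.shape[1]):
    print(j, f_to_enter(X, y, [], j))
```

In a full procedure, each F statistic would be converted to a p-value from the F distribution with the degrees of freedom noted in the docstrings and compared with `alpha`; that step is left out here to keep the sketch dependent on NumPy only.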
- Parameters:
method ({'forward', 'backward'}, default='forward') – The feature selection method to use. Possible values:
- 'forward' for forward selection
- 'backward' for backward elimination
alpha (float, default = 1e-2) – The significance level for adding or retaining variables in stepwise variable selection.
lambda_init (None or float, default=None) – Initial value of Wilks' lambda.
verbose (bool, default=True) – If True, print intermediate steps during feature selection.
- Returns:
call_ (NamedTuple) – Call information:
- obj (class) – An object of class CANDISC or DISCRIM.
- alpha (float) – The significance level for adding or retaining variables in stepwise variable selection.
- target (str) – Name of the target.
- classes (list) – Names of the classes.
- priors (Series of shape (n_classes,)) – Prior probabilities.
disc_ (class) – An object of class CANDISC or DISCRIM
model_ (str, default = “stepdisc”) – Name of model fitted.
summary_ (NamedTuple) – Stepwise summary information:
- summary (DataFrame of shape (n_selected, 6)) – Summary of the stepwise selection.
- selected (list) – Selected variables.
- removed (list) – Removed variables.
See also
CANDISC – Canonical Discriminant Analysis (CANDISC).
DISCRIM – Discriminant Analysis (linear and quadratic).
summaryCANDISC – Printing summaries of a Canonical Discriminant Analysis model.
summaryDISCRIM – Printing summaries of a Discriminant Analysis (linear and quadratic) model.
References
[1] Ricco Rakotomalala (2008), « STEPDISC - Feature selection for LDA », Université Lumière Lyon 2.
[2] Ricco Rakotomalala (2012), « Linear Discriminant Analysis - Tools comparison », Université Lumière Lyon 2.
[3] Ricco Rakotomalala (2014), « Linear discriminant analysis (slides) », Université Lumière Lyon 2.
[4] Ricco Rakotomalala (2020), « Pratique de l’Analyse Discriminante Linéaire », Version 1.0, Université Lumière Lyon 2.
[5] SAS/STAT 13.1 User’s Guide (2013), « The STEPDISC Procedure », Chapter 93.
Examples
>>> from discrimintools.datasets import load_heart
>>> from discrimintools import DISCRIM, STEPDISC
>>> D = load_heart("train")  # load training data
>>> y, X = D["disease"], D.drop(columns=["disease"])  # split into X and y
>>> clf = DISCRIM(method="linear")
>>> clf.fit(X, y)
>>> clf2 = STEPDISC(method="forward", alpha=0.01, verbose=True)
>>> clf2.fit(clf)
STEPDISC()
Methods
__init__([method, alpha, lambda_init, verbose])
decision_function(X) – Apply the decision function to input data.
eval_predict(X, y[, verbose]) – Evaluate the prediction quality.
fit(obj) – Fit the Stepwise Discriminant Analysis procedure.
fit_transform(obj) – Fit transformer to X and return a transformed version of samples.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
pred_table(X, y) – Prediction table.
predict(X) – Predict class labels for samples in X.
predict_log_proba(X) – Return log of posterior probabilities.
predict_proba(X) – Estimate probability.
score(X, y) – Return accuracy on the given input data.
set_fit_request(*[, obj]) – Configure whether metadata should be requested to be passed to the fit method.
set_output(*[, transform]) – Set output container.
set_params(**params) – Set the parameters of this estimator.
transform(X) – Project data to maximize class separation or reduce dimensionality.