discrimintools.GFA
- class discrimintools.GFA(n_components=2)
General Factor Analysis (GFA)
General factor analysis refers to dimensionality reduction techniques such as PCA, CA, MCA and FAMD, which reduce the number of features in a dataset while keeping the most important information. For more about generalized factor analysis, see scientisttools.
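As a rough illustration of how a single estimator can cover several of these methods, the sketch below picks a method name from the column types of a small table. The helper `guess_factor_method` and its selection rule are assumptions made for illustration, inferred from the examples on this page (all-numeric data handled as PCA, all-categorical data as MCA, mixed data as FAMD). It is not taken from the library's code and does not cover CA.

```python
from numbers import Number


def guess_factor_method(rows):
    """Guess a factor method from column types (hypothetical helper).

    `rows` is a non-empty list of dicts that all share the same keys.
    """
    columns = list(rows[0])
    # k1: columns whose values are all numeric (booleans are treated as categories)
    k1 = sum(
        all(isinstance(row[col], Number) and not isinstance(row[col], bool) for row in rows)
        for col in columns
    )
    # k2: every remaining column is treated as categorical
    k2 = len(columns) - k1
    if k2 == 0:
        return "PCA"
    if k1 == 0:
        return "MCA"
    return "FAMD"


# Toy tables, invented for this sketch
numeric = [{"a": 1.0, "b": 2}, {"a": 0.5, "b": 3}]
categorical = [{"size": "small", "role": "hunt"}, {"size": "large", "role": "pet"}]
mixed = [{"age": 54, "sex": "F"}, {"age": 61, "sex": "M"}]
```

The counts `k1` and `k2` mirror the fields of the same name listed under `call_`.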
- Parameters:
n_components (int or None, default = 2) – Number of components to keep. If None, keep all components.
- Returns:
call_ (NamedTuple) – Call information:
- Xtot : DataFrame of shape (n_samples, n_columns)
  Input data.
- X : DataFrame of shape (n_samples, n_features)
  Training data.
- dummies : None or DataFrame
  Disjunctive data.
- k1 : int
  Number of numeric columns.
- k2 : int
  Number of categorical columns.
- Z : DataFrame of shape (n_samples, n_vars)
  Standardized data.
- Zc : DataFrame of shape (n_samples, n_vars)
  Centered standardized data.
- ind_weights : Series of shape (n_samples,)
  Individual weights.
- var_weights : Series of shape (n_vars,)
  Variable weights.
- center : Series of shape (n_vars,)
  Mean of variables.
- z_center : Series of shape (n_vars,)
  Mean of the recoded variables.
- scale : Series of shape (n_vars,)
  Scale of variables.
- denom : Series of shape (n_vars,)
  Number of variables.
- max_components : int
  Maximum number of components.
- n_components : int
  Number of components kept.
eig_ (DataFrame of shape (max_components, 4)) – The eigenvalues, the differences between consecutive eigenvalues, the percentage of variance and the cumulative percentage of variance.
ind_ (NamedTuple) – Information about individuals:
- coord : DataFrame of shape (n_samples, n_components)
  Coordinates of the individuals.
model_ (str, default = ‘gfa’) – The fitted model.
svd_ (NamedTuple) – Generalized singular value decomposition:
- vs : 1-D array of shape (max_components,)
  The singular values.
- U : 2-D array of shape (n_samples, n_components)
  The left singular vectors.
- V : 2-D array of shape (n_vars, n_components)
  The right singular vectors.
var_ (NamedTuple) – Information about variables:
- coord : DataFrame of shape (n_vars, n_components)
  Coordinates of the variables.
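The relationship between the weights, the singular value decomposition and the coordinates can be sketched with NumPy. This is a minimal sketch of the textbook generalized SVD with row and column weights, not the library's implementation: the function name `weighted_svd`, the random data and the choice of uniform weights are all assumptions made for illustration.

```python
import numpy as np


def weighted_svd(Z, row_w, col_w, n_components):
    """Generalized SVD of Z with row weights row_w and column weights col_w (sketch)."""
    Z = np.asarray(Z, dtype=float)
    row_w = np.asarray(row_w, dtype=float)
    col_w = np.asarray(col_w, dtype=float)
    # Fold the square roots of the weights into the matrix, then run an ordinary SVD
    Zw = np.sqrt(row_w)[:, None] * Z * np.sqrt(col_w)[None, :]
    U, vs, Vt = np.linalg.svd(Zw, full_matrices=False)
    # Undo the weighting on the singular vectors and keep the leading components
    U = U[:, :n_components] / np.sqrt(row_w)[:, None]
    V = Vt.T[:, :n_components] / np.sqrt(col_w)[:, None]
    return vs, U, V


# Toy data: 6 individuals, 3 numeric variables (values invented for this sketch)
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized data
row_w = np.full(6, 1 / 6)                 # uniform individual weights (assumption)
col_w = np.ones(3)                        # unit variable weights (assumption)

vs, U, V = weighted_svd(Z, row_w, col_w, n_components=2)
ind_coord = U * vs[:2]   # coordinates of the individuals
var_coord = V * vs[:2]   # coordinates of the variables
eigenvalues = vs ** 2    # one eigenvalue per singular value
```

With these conventions the individual coordinates equal the weighted projection `Z @ np.diag(col_w) @ V`, and the array shapes match those listed for `svd_`, `ind_.coord` and `var_.coord`.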
See also
GFALDA : General Factor Analysis Linear Discriminant Analysis.
MDA : Mixed Discriminant Analysis.
MPCA : Mixed Principal Component Analysis.
summaryGFA : Print summaries of a General Factor Analysis model.
summaryGFALDA : Print summaries of a General Factor Analysis Linear Discriminant Analysis model.
summaryMDA : Print summaries of a Mixed Discriminant Analysis model.
summaryMPCA : Print summaries of a Mixed Principal Component Analysis model.
References
[1] Bry X. (1996), « Analyses factorielles multiples », Economica.
[2] Bry X. (1999), « Analyses factorielles simples », Economica.
[3] Escofier B., Pagès J. (2023), « Analyses Factorielles Simples et Multiples », 5ed, Dunod.
[4] Husson F., Lê S., Pagès J. (2010), « Exploratory Multivariate Analysis by Example Using R », Chapman and Hall.
[5] Lebart L., Piron M., Morineau A. (2006), « Statistique Exploratoire Multidimensionnelle », 4ed, Dunod, Paris.
[6] Pagès J. (2013), « Analyse factorielle multiple avec R : Pratique R », EDP Sciences.
[7] Rakotomalala R. (2020), « Pratique des Méthodes Factorielles avec Python », Université Lumière Lyon 2, Version 1.0.
[8] Saporta G. (2011), « Probabilités, Analyse des données et Statistiques », 3ed, Editions TECHNIP.
[9] Tenenhaus M. (2006), « Statistique : Méthodes pour décrire, expliquer et prévoir », Dunod.
Examples
>>> from discrimintools.datasets import load_alcools, load_canines, load_heart
>>> from discrimintools import GFA
>>> # principal component analysis (PCA)
>>> D = load_alcools("train")  # load training data
>>> X = D.drop(columns=["TYPE"])  # extract X
>>> clf = GFA()
>>> clf.fit(X)
GFA()
>>> # multiple correspondence analysis (MCA)
>>> D = load_canines("train")  # load training data
>>> X = D.drop(columns=["Fonction"])  # extract X
>>> clf = GFA()
>>> clf.fit(X)
GFA()
>>> # factor analysis of mixed data (FAMD)
>>> D = load_heart("subset")  # load subset data
>>> X = D.drop(columns=["disease"])  # extract X
>>> clf = GFA()
>>> clf.fit(X)
GFA()
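To make the four columns of `eig_` concrete, the following sketch builds them from a plain list of eigenvalues. It is an illustration under assumptions, not the library's code: the helper name `eigen_table` is hypothetical, the eigenvalues are invented, and marking the last difference as `None` is a choice made here because the page does not say how the library fills that cell.

```python
def eigen_table(eigenvalues):
    """Return rows of (eigenvalue, difference, percentage, cumulative percentage)."""
    total = sum(eigenvalues)
    rows = []
    cumulative = 0.0
    for i, value in enumerate(eigenvalues):
        # Gap to the next eigenvalue; left undefined for the last one (assumption)
        diff = value - eigenvalues[i + 1] if i + 1 < len(eigenvalues) else None
        proportion = 100.0 * value / total  # percentage of variance
        cumulative += proportion            # cumulative percentage of variance
        rows.append((value, diff, proportion, cumulative))
    return rows


# Eigenvalues sorted in decreasing order, invented for this sketch
table = eigen_table([2.0, 1.5, 0.5])
for row in table:
    print(row)
```

Each row corresponds to one component, so a table built from all eigenvalues has `max_components` rows, and the cumulative percentage of the last row is 100.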
Methods
__init__([n_components])
fit(X[, y]) : Fit the General Factor Analysis model.
fit_transform(X[, y]) : Fit the model with X and apply the dimensionality reduction to X.
get_metadata_routing() : Get metadata routing of this object.
get_params([deep]) : Get parameters for this estimator.
set_output(*[, transform]) : Set output container.
set_params(**params) : Set the parameters of this estimator.
transform(X) : Apply the dimensionality reduction to X.