discrimintools.GFA

class discrimintools.GFA(n_components=2)

General Factor Analysis (GFA)

General factor analysis covers dimensionality reduction techniques such as PCA, CA, MCA, and FAMD, which reduce the number of features in a dataset while retaining most of its information. For more on generalized factor analysis, see scientisttools.

Parameters:

n_components (int or None, default = 2) – Number of components to keep. If None, keep all the components.
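The rule for n_components can be sketched in plain Python. This is only an illustration: `resolve_n_components` is a hypothetical helper, not part of discrimintools, and capping a request at `max_components` is an assumption (the text above only says that None keeps all components).

```python
from typing import Optional


def resolve_n_components(n_components: Optional[int], max_components: int) -> int:
    """Hypothetical helper: how many components end up being kept."""
    if n_components is None:
        # None means "keep every available component".
        return max_components
    if not isinstance(n_components, int) or n_components < 1:
        raise ValueError("n_components must be a positive integer or None")
    # Assumption: a request larger than max_components is capped.
    return min(n_components, max_components)


kept_default = resolve_n_components(2, max_components=5)   # default setting
kept_all = resolve_n_components(None, max_components=5)    # keep everything
```

Under these assumptions the first call keeps 2 components and the second keeps all 5.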

Returns:

call_ (NamedTuple) – Call information:
- Xtot: DataFrame of shape (n_samples, n_columns)
  
  Input data.
- X: DataFrame of shape (n_samples, n_features)
  
  Training data.
- dummies: None or DataFrame
  
  Disjunctive data.
- k1: int
  
  Number of numeric columns.
- k2: int
  
  Number of categorical columns.
- Z: DataFrame of shape (n_samples, n_vars)
  
  Standardized data.
- Zc: DataFrame of shape (n_samples, n_vars)
  
  Centered standardized data.
- ind_weights: Series of shape (n_samples,)
  
  Individual weights.
- var_weights: Series of shape (n_vars,)
  
  Variable weights.
- center: Series of shape (n_vars,)
  
  Mean of variables.
- z_center: Series of shape (n_vars,)
  
  Mean of the recoded variables.
- scale: Series of shape (n_vars,)
  
  Scale of variables.
- denom: Series of shape (n_vars,)
  
  Number of variables.
- max_components: int
  
  Maximum number of components.
- n_components: int
  
  Number of components kept.
eig_ (DataFrame of shape (max_components, 4)) – The eigenvalues, the difference between consecutive eigenvalues, the percentage of variance and the cumulative percentage of variance.
ind_ (NamedTuple) – Individuals information:
- coord: DataFrame of shape (n_samples, n_components)
  
  Coordinates of the individuals.
model_ (str, default = ‘gfa’) – The model fitted.
svd_ (NamedTuple) – Generalized singular value decomposition:
- vs: 1-D array of shape (max_components,)
  
  The singular values.
- U: 2-D array of shape (n_samples, n_components)
  
  The left singular vectors.
- V: 2-D array of shape (n_vars, n_components)
  
  The right singular vectors.
var_ (NamedTuple) – Variables information:
- coord: DataFrame of shape (n_vars, n_components)
  
  Coordinates of the variables.
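To make the four columns of eig_ concrete, the sketch below derives them from a list of eigenvalues using only the standard library. It is an illustration of the arithmetic, not the library's code: `eigen_table` is a hypothetical name, the eigenvalues are made up, and reporting NaN as the difference for the last eigenvalue is an assumption.

```python
import math


def eigen_table(eigenvalues):
    """Hypothetical helper: one row per eigenvalue, four values per row.

    Each row is (eigenvalue, difference, proportion in %, cumulative in %).
    """
    total = sum(eigenvalues)
    rows = []
    cumulative = 0.0
    for i, value in enumerate(eigenvalues):
        # Gap to the next eigenvalue; assumed undefined (NaN) for the last one.
        if i + 1 < len(eigenvalues):
            difference = value - eigenvalues[i + 1]
        else:
            difference = math.nan
        proportion = 100.0 * value / total
        cumulative += proportion
        rows.append((value, difference, proportion, cumulative))
    return rows


# Made-up eigenvalues, sorted in decreasing order.
table = eigen_table([2.0, 1.0, 0.5, 0.5])
for row in table:
    print(row)
```

The cumulative column always ends at 100, which gives a quick sanity check when choosing n_components from such a table.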

See also

GFALDA: General Factor Analysis Linear Discriminant Analysis (GFALDA)
MDA: Mixed Discriminant Analysis (MDA)
MPCA: Mixed Principal Component Analysis (MPCA)
summaryGFA: Printing summaries of General Factor Analysis model.
summaryGFALDA: Printing summaries of General Factor Analysis Linear Discriminant Analysis model.
summaryMDA: Printing summaries of Mixed Discriminant Analysis model.
summaryMPCA: Printing summaries of Mixed Principal Component Analysis model.

References

[1] Bry X. (1996), « Analyses factorielles multiples », Economica.

[2] Bry X. (1999), « Analyses factorielles simples », Economica.

[3] Escofier B., Pagès J. (2023), « Analyses Factorielles Simples et Multiples », 5ed, Dunod.

[4] Husson F., Lê S., Pagès J. (2010), « Exploratory Multivariate Analysis by Example Using R », Chapman and Hall.

[5] Lebart L., Piron M., Morineau A. (2006), « Statistique Exploratoire Multidimensionnelle », 4ed, Dunod, Paris.

[6] Pagès J. (2013), « Analyse factorielle multiple avec R : Pratique R », EDP Sciences.

[7] Rakotomalala R. (2020), « Pratique des Méthodes Factorielles avec Python », Université Lumière Lyon 2, Version 1.0.

[8] Saporta G. (2011), « Probabilités, Analyse des données et Statistiques », 3ed, Editions TECHNIP.

[9] Tenenhaus M. (2006), « Statistique : Méthodes pour décrire, expliquer et prévoir », Dunod.

Examples

>>> from discrimintools.datasets import load_alcools, load_canines, load_heart
>>> from discrimintools import GFA
>>> #principal components analysis (PCA)
>>> D = load_alcools("train") # load training data
>>> X = D.drop(columns=["TYPE"]) # extract X
>>> clf = GFA()
>>> clf.fit(X)
GFA()
>>> #multiple correspondence analysis (MCA)
>>> D = load_canines("train") # load training data
>>> X = D.drop(columns=["Fonction"]) # extract X
>>> clf = GFA()
>>> clf.fit(X)
GFA()
>>> #factor analysis of mixed data (FAMD)
>>> D = load_heart("subset") # load subset data
>>> X = D.drop(columns=["disease"]) # extract X
>>> clf = GFA()
>>> clf.fit(X)
GFA()
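The three examples above call GFA() identically and differ only in the column types of X: all numeric, all categorical, or mixed. One way to read this is that the analysis is determined by the counts k1 (numeric columns) and k2 (categorical columns) stored in call_. The sketch below encodes that reading; `analysis_kind` is a hypothetical function and the dispatch rule is an assumption drawn from the examples, not the library's documented logic.

```python
def analysis_kind(k1: int, k2: int) -> str:
    """Hypothetical mapping from column counts to the kind of analysis.

    k1: number of numeric columns, k2: number of categorical columns.
    """
    if k1 < 0 or k2 < 0:
        raise ValueError("column counts must be non-negative")
    if k1 == 0 and k2 == 0:
        raise ValueError("at least one column is required")
    if k2 == 0:
        return "PCA"   # only numeric columns
    if k1 == 0:
        return "MCA"   # only categorical columns
    return "FAMD"      # numeric and categorical columns together


# Arbitrary column counts, chosen only for illustration.
kinds = [analysis_kind(6, 0), analysis_kind(0, 4), analysis_kind(3, 2)]
```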

__init__(n_components=2)

Methods

`__init__`([n_components])
`fit`(X[, y])	Fit the General Factor Analysis model.
`fit_transform`(X[, y])	Fit the model with X and apply the dimensionality reduction on X.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`set_output`(*[, transform])	Set output container.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X)	Apply the dimensionality reduction on X.
