GFALDA DISQUAL - vote dataset#

[1]:
#disable warnings
from warnings import simplefilter, filterwarnings
simplefilter(action='ignore', category=FutureWarning)
filterwarnings("ignore")

vote dataset#

[2]:
#vote dataset
from discrimintools.datasets import load_vote
D = load_vote("subset")
D.info()
<class 'pandas.core.frame.DataFrame'>
Index: 435 entries, 0 to 434
Data columns (total 5 columns):
 #   Column                      Non-Null Count  Dtype
---  ------                      --------------  -----
 0   handicapped_infants         435 non-null    object
 1   water_project_cost_sharin   435 non-null    object
 2   adoption_of_the_budget_res  435 non-null    object
 3   physician_fee_freeze        435 non-null    object
 4   group                       435 non-null    object
dtypes: object(5)
memory usage: 20.4+ KB
[3]:
#split into X and y
y, X = D["group"], D.drop(columns=["group"])

Instantiation and training#

[4]:
#instantiation and training
from discrimintools import GFALDA
clf = GFALDA(n_components=2)
clf.fit(X,y)
[4]:
GFALDA()

Canonical coefficients#

[5]:
#canonical coefficients
cancoef = clf.cancoef_
cancoef._fields
[5]:
('standardized', 'raw', 'projection')

Standardized canonical coefficients#

[6]:
#standardized canonical coefficients
print(cancoef.standardized)
                                      Can1      Can2
handicapped_infants_n            -0.527352  0.710502
handicapped_infants_other         5.522611  2.600607
handicapped_infants_y             0.311144 -1.063560
water_project_cost_sharin_n      -0.240703 -0.294945
water_project_cost_sharin_other   1.881943  1.074250
water_project_cost_sharin_y      -0.226247  0.025977
adoption_of_the_budget_res_n     -0.877486  1.243859
adoption_of_the_budget_res_other  5.926929  3.335334
adoption_of_the_budget_res_y      0.335391 -0.985725
physician_fee_freeze_n            0.344976 -1.018166
physician_fee_freeze_other        5.987091  2.959171
physician_fee_freeze_y           -0.853487  1.236927

Projection canonical coefficients#

[7]:
#projection canonical coefficients
print(cancoef.projection)
                                      Can1      Can2
handicapped_infants_n            -0.131838  0.177625
handicapped_infants_other         1.380653  0.650152
handicapped_infants_y             0.077786 -0.265890
water_project_cost_sharin_n      -0.060176 -0.073736
water_project_cost_sharin_other   0.470486  0.268562
water_project_cost_sharin_y      -0.056562  0.006494
adoption_of_the_budget_res_n     -0.219371  0.310965
adoption_of_the_budget_res_other  1.481732  0.833834
adoption_of_the_budget_res_y      0.083848 -0.246431
physician_fee_freeze_n            0.086244 -0.254541
physician_fee_freeze_other        1.496773  0.739793
physician_fee_freeze_y           -0.213372  0.309232
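
Comparing the standardized and projection tables, each projection coefficient appears to equal the matching standardized coefficient divided by 4, which is also the number of predictor columns in X. That this scaling is how discrimintools defines the projection coefficients is an assumption drawn from the printed numbers, not from the library documentation. A minimal check on a few rows copied from the output:

```python
import numpy as np

# A few (Can1, Can2) rows copied from the printed tables above.
standardized = np.array([
    [-0.527352, 0.710502],   # handicapped_infants_n
    [5.522611, 2.600607],    # handicapped_infants_other
    [-0.853487, 1.236927],   # physician_fee_freeze_y
])
projection = np.array([
    [-0.131838, 0.177625],   # handicapped_infants_n
    [1.380653, 0.650152],    # handicapped_infants_other
    [-0.213372, 0.309232],   # physician_fee_freeze_y
])

n_predictors = 4  # columns of X after dropping "group"

# Element-wise ratio; every entry should sit close to n_predictors.
ratio = standardized / projection
print(np.round(ratio, 3))
print(np.allclose(ratio, n_predictors, atol=1e-3))
```

The same factor of 4 also links `coef.standardized` and `coef.projection` in the later sections, apart from the Constant row, which is identical in both.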

Raw canonical coefficients#

[8]:
#raw canonical coefficients
print(cancoef.raw)
                                        Can1        Can2
Constant                          -17.584809   -9.824230
handicapped_infants_n              -0.972027    1.309611
handicapped_infants_other         200.194660   94.272001
handicapped_infants_y               0.723783   -2.474056
water_project_cost_sharin_n        -0.545343   -0.668236
water_project_cost_sharin_other    17.055104    9.735387
water_project_cost_sharin_y        -0.504706    0.057949
adoption_of_the_budget_res_n       -2.232200    3.164202
adoption_of_the_budget_res_other  234.383111  131.897318
adoption_of_the_budget_res_y        0.576660   -1.694824
physician_fee_freeze_n              0.607550   -1.793126
physician_fee_freeze_other        236.762232  117.021769
physician_fee_freeze_y             -2.097553    3.039905

Coefficients#

[9]:
#coefficients
coef = clf.coef_
coef._fields
[9]:
('standardized', 'raw', 'projection')

Standardized coefficients#

[10]:
#standardized coefficients
print(coef.standardized)
                                  democrat  republican
Constant                         -1.304482   -3.013431
handicapped_infants_n            -2.959729    4.703855
handicapped_infants_other         1.185292   -1.883767
handicapped_infants_y             3.659211   -5.815532
water_project_cost_sharin_n       0.487460   -0.774713
water_project_cost_sharin_other  -0.154333    0.245279
water_project_cost_sharin_y      -0.441971    0.702418
adoption_of_the_budget_res_n     -5.107770    8.117707
adoption_of_the_budget_res_other -0.343928    0.546600
adoption_of_the_budget_res_y      3.467241   -5.510436
physician_fee_freeze_n            3.579007   -5.688065
physician_fee_freeze_other        0.869829   -1.382407
physician_fee_freeze_y           -5.048491    8.023495

Projection coefficients#

[11]:
#projection coefficients
print(coef.projection)
                                  democrat  republican
Constant                         -1.304482   -3.013431
handicapped_infants_n            -0.739932    1.175964
handicapped_infants_other         0.296323   -0.470942
handicapped_infants_y             0.914803   -1.453883
water_project_cost_sharin_n       0.121865   -0.193678
water_project_cost_sharin_other  -0.038583    0.061320
water_project_cost_sharin_y      -0.110493    0.175604
adoption_of_the_budget_res_n     -1.276943    2.029427
adoption_of_the_budget_res_other -0.085982    0.136650
adoption_of_the_budget_res_y      0.866810   -1.377609
physician_fee_freeze_n            0.894752   -1.422016
physician_fee_freeze_other        0.217457   -0.345602
physician_fee_freeze_y           -1.262123    2.005874

Raw coefficients#

[12]:
#raw coefficients
print(coef.raw)
                                   democrat  republican
Constant                          -0.496301   -4.297863
handicapped_infants_n             -5.455432    8.670241
handicapped_infants_other         42.966824  -68.286560
handicapped_infants_y              8.512069  -13.528109
water_project_cost_sharin_n        1.104401   -1.755208
water_project_cost_sharin_other   -1.398640    2.222839
water_project_cost_sharin_y       -0.985935    1.566932
adoption_of_the_budget_res_n     -12.993451   20.650306
adoption_of_the_budget_res_other -13.600791   21.615544
adoption_of_the_budget_res_y       5.961462   -9.474466
physician_fee_freeze_n             6.303110  -10.017443
physician_fee_freeze_other        34.397788  -54.667912
physician_fee_freeze_y           -12.407308   19.718758
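
The class-level coefficients can be traced back to the canonical ones. In the printed numbers, each row of `coef.projection` matches the same row of `cancoef.projection` multiplied by the Can1 and Can2 weights listed under "LDA Classification functions" in the summary output. The sketch below checks this on three rows; treating it as the library's actual computation is an assumption inferred from the output.

```python
import numpy as np

# (Can1, Can2) projection canonical coefficients, copied from the output above.
cancoef_projection = np.array([
    [-0.131838, 0.177625],   # handicapped_infants_n
    [0.083848, -0.246431],   # adoption_of_the_budget_res_y
    [-0.213372, 0.309232],   # physician_fee_freeze_y
])

# LDA classification-function weights: rows are Can1 and Can2,
# columns are democrat and republican (values from the summary table).
lda_weights = np.array([
    [1.6126, -2.5629],   # Can1
    [-2.9688, 4.7182],   # Can2
])

# Rows printed by coef.projection for the same categories.
coef_projection = np.array([
    [-0.739932, 1.175964],
    [0.866810, -1.377609],
    [-1.262123, 2.005874],
])

# Compose the two steps: categories -> components -> class scores.
reconstructed = cancoef_projection @ lda_weights
print(np.round(reconstructed, 4))
print(np.allclose(reconstructed, coef_projection, atol=1e-3))
```

Small differences in the last digits come from the four-decimal rounding of the LDA weights in the summary table.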

Summary#

[13]:
#summary
from discrimintools import summaryGFALDA
summaryGFALDA(clf,detailed=True)
                     General Factor Analysis Linear Discriminant Analysis - Results

Class Level Information:
            Frequency  Proportion  Prior Probability
democrat          267      0.6138             0.6138
republican        168      0.3862             0.3862

Importance of components:
      Eigenvalue  Difference  Proportion (%)  Cumulative (%)
Can1      0.5323      0.0242         26.6143         26.6143
Can2      0.5081      0.2534         25.4039         52.0182
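
The Proportion column can be reproduced from the eigenvalues under one assumption: that the factor step is a multiple correspondence analysis on the full indicator matrix, whose total inertia is J/p - 1 for J categories spread over p variables. Here J = 12 (levels n, other and y for each of the four votes) and p = 4, which gives a total inertia of 2. This reading is inferred from the output, not taken from the discrimintools documentation.

```python
# Eigenvalues copied from the "Importance of components" table (4 decimals).
eigenvalues = {"Can1": 0.5323, "Can2": 0.5081}

n_categories = 12  # 4 variables x 3 levels (n, other, y)
n_variables = 4

# Assumed total inertia of an MCA on the indicator matrix: J/p - 1.
total_inertia = n_categories / n_variables - 1

# Share of the total inertia carried by each component, in percent.
proportions = {name: 100 * value / total_inertia for name, value in eigenvalues.items()}
cumulative = sum(proportions.values())

print(proportions)
print(cumulative)
```

The results agree with the printed percentages up to the rounding of the eigenvalues.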

Raw Canonical Coefficients:
                                      Can1      Can2
Constant                          -17.5848   -9.8242
handicapped_infants_n              -0.9720    1.3096
handicapped_infants_other         200.1947   94.2720
handicapped_infants_y               0.7238   -2.4741
water_project_cost_sharin_n        -0.5453   -0.6682
water_project_cost_sharin_other    17.0551    9.7354
water_project_cost_sharin_y        -0.5047    0.0579
adoption_of_the_budget_res_n       -2.2322    3.1642
adoption_of_the_budget_res_other  234.3831  131.8973
adoption_of_the_budget_res_y        0.5767   -1.6948
physician_fee_freeze_n              0.6075   -1.7931
physician_fee_freeze_other        236.7622  117.0218
physician_fee_freeze_y             -2.0976    3.0399

Projection functions coefficients:
                                    Can1    Can2
handicapped_infants_n            -0.1318  0.1776
handicapped_infants_other         1.3807  0.6502
handicapped_infants_y             0.0778 -0.2659
water_project_cost_sharin_n      -0.0602 -0.0737
water_project_cost_sharin_other   0.4705  0.2686
water_project_cost_sharin_y      -0.0566  0.0065
adoption_of_the_budget_res_n     -0.2194  0.3110
adoption_of_the_budget_res_other  1.4817  0.8338
adoption_of_the_budget_res_y      0.0838 -0.2464
physician_fee_freeze_n            0.0862 -0.2545
physician_fee_freeze_other        1.4968  0.7398
physician_fee_freeze_y           -0.2134  0.3092

Multivariate Analysis of Variance (MANOVA) Summary:
          Statistic     Value  p-value
0     Wilks' Lambda    0.2772      NaN
1  Bartlett -- C(2)  554.1935      0.0
2   Rao -- F(2,432)  563.0956      0.0
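
Both test statistics in this table can be recovered from Wilks' Lambda with the textbook formulas: Bartlett's chi-square approximation, -(n - 1 - (p + K)/2) ln(Lambda) with p(K - 1) degrees of freedom, and the F transform ((1 - Lambda)/Lambda)(n - p - 1)/p, which is exact for two groups. The sketch below assumes discrimintools uses these formulas with n = 435 observations, p = 2 components and K = 2 groups.

```python
import math

wilks_lambda = 0.2772  # rounded value from the table
n_obs, n_components, n_groups = 435, 2, 2

# Bartlett's chi-square approximation, df = p * (K - 1) = 2.
bartlett = -(n_obs - 1 - (n_components + n_groups) / 2) * math.log(wilks_lambda)

# F transform for two groups, df = (p, n - p - 1) = (2, 432).
den_df = n_obs - n_components - 1
rao_f = (1 - wilks_lambda) / wilks_lambda * den_df / n_components

print(round(bartlett, 2), round(rao_f, 2))
```

Because Lambda is only available to four decimals here, the computed values differ slightly from the printed 554.1935 and 563.0956.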

LDA Classification functions & Statistical Evaluation:
          democrat  republican  Wilks' Lambda  Partial R-Square   F Value  \
Constant   -1.3045     -3.0134            NaN               NaN       NaN
Can1        1.6126     -2.5629         0.4479            0.6190  265.9263
Can2       -2.9688      4.7182         0.8293            0.3343  860.2649

          Num DF  Den DF  Pr>F
Constant     NaN     NaN   NaN
Can1         1.0   432.0   0.0
Can2         1.0   432.0   0.0

Classification Summary for Calibration Data:

Observation Profile:
                        Read  Used
Number of Observations   435   435

Number of Observations Classified into group:
prediction  democrat  republican  Total
group
democrat         244          23    267
republican        14         154    168
Total            258         177    435

Percent Classified into group:
prediction  democrat  republican  Total
group
democrat     91.3858      8.6142  100.0
republican    8.3333     91.6667  100.0
Total        59.3103     40.6897  100.0
Priors        0.6138      0.3862    NaN

Error Count Estimates for group:
        democrat  republican   Total
Rate      0.0861      0.0833  0.0851
Priors    0.6138      0.3862     NaN

Classification Report for group:
              precision  recall  f1-score   support
democrat         0.9457  0.9139    0.9295  267.0000
republican       0.8701  0.9167    0.8928  168.0000
accuracy         0.9149  0.9149    0.9149    0.9149
macro avg        0.9079  0.9153    0.9111  435.0000
weighted avg     0.9165  0.9149    0.9153  435.0000
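
The error rates and the classification report follow directly from the confusion matrix printed above. A minimal sketch that recomputes them with NumPy only, using the counts copied from the summary:

```python
import numpy as np

# Confusion matrix from the summary: rows = true group, columns = prediction,
# both ordered (democrat, republican).
confusion = np.array([
    [244, 23],
    [14, 154],
])

recall = np.diag(confusion) / confusion.sum(axis=1)
precision = np.diag(confusion) / confusion.sum(axis=0)
f1 = 2 * precision * recall / (precision + recall)
accuracy = np.trace(confusion) / confusion.sum()

# Per-class error rates, and the total as their prior-weighted average.
# The priors equal the class proportions, as in the summary.
error_rate = 1 - recall
priors = confusion.sum(axis=1) / confusion.sum()
total_error = float(error_rate @ priors)

print(np.round(precision, 4), np.round(recall, 4), np.round(f1, 4))
print(round(accuracy, 4), np.round(error_rate, 4), round(total_error, 4))
```

With priors equal to the class proportions, the total error rate is simply one minus the accuracy.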