GFALDA DISQUAL - vote dataset#
[1]:
#disable warnings
from warnings import simplefilter, filterwarnings
simplefilter(action='ignore', category=FutureWarning)
filterwarnings("ignore")
vote dataset#
[2]:
#vote dataset
from discrimintools.datasets import load_vote
D = load_vote("subset")
D.info()
<class 'pandas.core.frame.DataFrame'>
Index: 435 entries, 0 to 434
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 handicapped_infants 435 non-null object
1 water_project_cost_sharin 435 non-null object
2 adoption_of_the_budget_res 435 non-null object
3 physician_fee_freeze 435 non-null object
4 group 435 non-null object
dtypes: object(5)
memory usage: 20.4+ KB
[3]:
#split into X and y
y, X = D["group"], D.drop(columns=["group"])
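All four predictors are categorical (dtype object), and the coefficient tables later in this notebook have one row per variable/category pair, such as handicapped_infants_n. The sketch below shows that kind of indicator coding on a small made-up frame using pandas.get_dummies. The toy values are illustrative only, and whether GFALDA builds its indicator table this way internally is an assumption.

```python
import pandas as pd

# Toy stand-in for two of the vote columns (made-up values, not the real data)
toy = pd.DataFrame({
    "handicapped_infants": ["y", "n", "other", "n"],
    "physician_fee_freeze": ["n", "y", "y", "other"],
})

# One indicator column per (variable, category) pair, named variable_category
indicators = pd.get_dummies(toy, prefix_sep="_")
print(list(indicators.columns))

# Each row has exactly one active category per original variable
print(indicators.sum(axis=1).tolist())
```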
Instantiation and training#
[4]:
#instantiation and training
from discrimintools import GFALDA
clf = GFALDA(n_components=2)
clf.fit(X,y)
[4]:
GFALDA()
Parameters
| n_components | 2 |
| priors | None |
| classes | False |
Canonical coefficients#
[5]:
#canonical coefficients
cancoef = clf.cancoef_
cancoef._fields
[5]:
('standardized', 'raw', 'projection')
Standardized canonical coefficients#
[6]:
#standardized canonical coefficients
print(cancoef.standardized)
Can1 Can2
handicapped_infants_n -0.527352 0.710502
handicapped_infants_other 5.522611 2.600607
handicapped_infants_y 0.311144 -1.063560
water_project_cost_sharin_n -0.240703 -0.294945
water_project_cost_sharin_other 1.881943 1.074250
water_project_cost_sharin_y -0.226247 0.025977
adoption_of_the_budget_res_n -0.877486 1.243859
adoption_of_the_budget_res_other 5.926929 3.335334
adoption_of_the_budget_res_y 0.335391 -0.985725
physician_fee_freeze_n 0.344976 -1.018166
physician_fee_freeze_other 5.987091 2.959171
physician_fee_freeze_y -0.853487 1.236927
Projection canonical coefficients#
[7]:
#projection canonical coefficients
print(cancoef.projection)
Can1 Can2
handicapped_infants_n -0.131838 0.177625
handicapped_infants_other 1.380653 0.650152
handicapped_infants_y 0.077786 -0.265890
water_project_cost_sharin_n -0.060176 -0.073736
water_project_cost_sharin_other 0.470486 0.268562
water_project_cost_sharin_y -0.056562 0.006494
adoption_of_the_budget_res_n -0.219371 0.310965
adoption_of_the_budget_res_other 1.481732 0.833834
adoption_of_the_budget_res_y 0.083848 -0.246431
physician_fee_freeze_n 0.086244 -0.254541
physician_fee_freeze_other 1.496773 0.739793
physician_fee_freeze_y -0.213372 0.309232
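Comparing the two printed tables, each projection coefficient appears to be the standardized coefficient divided by 4, the number of predictors in X. The check below copies the three handicapped_infants rows from the output above. That the factor is the number of variables by design is an inference from these numbers, not something the library output states.

```python
import numpy as np

# Rows copied from the printed tables (columns: Can1, Can2)
standardized = np.array([
    [-0.527352, 0.710502],   # handicapped_infants_n
    [5.522611, 2.600607],    # handicapped_infants_other
    [0.311144, -1.063560],   # handicapped_infants_y
])
projection = np.array([
    [-0.131838, 0.177625],
    [1.380653, 0.650152],
    [0.077786, -0.265890],
])

n_variables = 4  # columns of X; assumed to be the scaling factor

# Element-wise ratio between the two tables
ratio = standardized / projection
print(ratio.round(3))

# Does dividing by the number of variables reproduce the projection table?
matches = np.allclose(standardized / n_variables, projection, atol=1e-5)
print(matches)
```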
Raw canonical coefficients#
[8]:
#raw canonical coefficients
print(cancoef.raw)
Can1 Can2
Constant -17.584809 -9.824230
handicapped_infants_n -0.972027 1.309611
handicapped_infants_other 200.194660 94.272001
handicapped_infants_y 0.723783 -2.474056
water_project_cost_sharin_n -0.545343 -0.668236
water_project_cost_sharin_other 17.055104 9.735387
water_project_cost_sharin_y -0.504706 0.057949
adoption_of_the_budget_res_n -2.232200 3.164202
adoption_of_the_budget_res_other 234.383111 131.897318
adoption_of_the_budget_res_y 0.576660 -1.694824
physician_fee_freeze_n 0.607550 -1.793126
physician_fee_freeze_other 236.762232 117.021769
physician_fee_freeze_y -2.097553 3.039905
Coefficients#
[9]:
#coefficients
coef = clf.coef_
coef._fields
[9]:
('standardized', 'raw', 'projection')
Standardized coefficients#
[10]:
#standardized coefficients
print(coef.standardized)
democrat republican
Constant -1.304482 -3.013431
handicapped_infants_n -2.959729 4.703855
handicapped_infants_other 1.185292 -1.883767
handicapped_infants_y 3.659211 -5.815532
water_project_cost_sharin_n 0.487460 -0.774713
water_project_cost_sharin_other -0.154333 0.245279
water_project_cost_sharin_y -0.441971 0.702418
adoption_of_the_budget_res_n -5.107770 8.117707
adoption_of_the_budget_res_other -0.343928 0.546600
adoption_of_the_budget_res_y 3.467241 -5.510436
physician_fee_freeze_n 3.579007 -5.688065
physician_fee_freeze_other 0.869829 -1.382407
physician_fee_freeze_y -5.048491 8.023495
Projection coefficients#
[11]:
#projection coefficients
print(coef.projection)
democrat republican
Constant -1.304482 -3.013431
handicapped_infants_n -0.739932 1.175964
handicapped_infants_other 0.296323 -0.470942
handicapped_infants_y 0.914803 -1.453883
water_project_cost_sharin_n 0.121865 -0.193678
water_project_cost_sharin_other -0.038583 0.061320
water_project_cost_sharin_y -0.110493 0.175604
adoption_of_the_budget_res_n -1.276943 2.029427
adoption_of_the_budget_res_other -0.085982 0.136650
adoption_of_the_budget_res_y 0.866810 -1.377609
physician_fee_freeze_n 0.894752 -1.422016
physician_fee_freeze_other 0.217457 -0.345602
physician_fee_freeze_y -1.262123 2.005874
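These projection coefficients can be recovered, up to rounding, by multiplying the projection canonical coefficients by the per-class weights on Can1 and Can2 that the summary at the end of this notebook lists under "LDA Classification functions". The sketch below checks this for the three handicapped_infants rows. All numbers are copied from the printed output, and the matrix-product reading is an interpretation of those numbers rather than documented library behaviour.

```python
import numpy as np

# Projection canonical coefficients (columns: Can1, Can2)
can_proj = np.array([
    [-0.131838, 0.177625],   # handicapped_infants_n
    [1.380653, 0.650152],    # handicapped_infants_other
    [0.077786, -0.265890],   # handicapped_infants_y
])

# Weights of the classification functions on the canonical variables
# (rows: Can1, Can2; columns: democrat, republican), rounded to 4 decimals
lda_weights = np.array([
    [1.6126, -2.5629],
    [-2.9688, 4.7182],
])

# Projection coefficients as printed (columns: democrat, republican)
reported = np.array([
    [-0.739932, 1.175964],
    [0.296323, -0.470942],
    [0.914803, -1.453883],
])

reconstructed = can_proj @ lda_weights
print(reconstructed.round(4))

# Largest gap between the product and the printed table (rounding error only)
max_gap = np.abs(reconstructed - reported).max()
print(max_gap < 1e-3)
```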
Raw coefficients#
[12]:
#raw coefficients
print(coef.raw)
democrat republican
Constant -0.496301 -4.297863
handicapped_infants_n -5.455432 8.670241
handicapped_infants_other 42.966824 -68.286560
handicapped_infants_y 8.512069 -13.528109
water_project_cost_sharin_n 1.104401 -1.755208
water_project_cost_sharin_other -1.398640 2.222839
water_project_cost_sharin_y -0.985935 1.566932
adoption_of_the_budget_res_n -12.993451 20.650306
adoption_of_the_budget_res_other -13.600791 21.615544
adoption_of_the_budget_res_y 5.961462 -9.474466
physician_fee_freeze_n 6.303110 -10.017443
physician_fee_freeze_other 34.397788 -54.667912
physician_fee_freeze_y -12.407308 19.718758
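One way to read the raw coefficients is as class score functions applied to 0/1 category indicators: add the constant and the coefficient of each category the observation takes, then pick the class with the larger score. The sketch below does this by hand for a hypothetical voter (y, n, y, n on the four items). The voter is invented for illustration, and the indicator-sum interpretation is an assumption about how the raw functions are meant to be used.

```python
# Raw coefficients copied from the output above: (democrat, republican)
raw = {
    "Constant": (-0.496301, -4.297863),
    "handicapped_infants_y": (8.512069, -13.528109),
    "water_project_cost_sharin_n": (1.104401, -1.755208),
    "adoption_of_the_budget_res_y": (5.961462, -9.474466),
    "physician_fee_freeze_n": (6.303110, -10.017443),
}
classes = ("democrat", "republican")

# Hypothetical voter: only the rows for the categories they take are active,
# so the score is the constant plus the sum of those rows
scores = {
    label: sum(values[i] for values in raw.values())
    for i, label in enumerate(classes)
}
predicted = max(scores, key=scores.get)

print(scores)
print(predicted)
```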
Summary#
[13]:
#summary
from discrimintools import summaryGFALDA
summaryGFALDA(clf,detailed=True)
General Factor Analysis Linear Discriminant Analysis - Results
Class Level Information:
Frequency Proportion Prior Probability
democrat 267 0.6138 0.6138
republican 168 0.3862 0.3862
Importance of components:
Eigenvalue Difference Proportion (%) Cumulative (%)
Can1 0.5323 0.0242 26.6143 26.6143
Can2 0.5081 0.2534 25.4039 52.0182
Raw Canonical Coefficients:
Can1 Can2
Constant -17.5848 -9.8242
handicapped_infants_n -0.9720 1.3096
handicapped_infants_other 200.1947 94.2720
handicapped_infants_y 0.7238 -2.4741
water_project_cost_sharin_n -0.5453 -0.6682
water_project_cost_sharin_other 17.0551 9.7354
water_project_cost_sharin_y -0.5047 0.0579
adoption_of_the_budget_res_n -2.2322 3.1642
adoption_of_the_budget_res_other 234.3831 131.8973
adoption_of_the_budget_res_y 0.5767 -1.6948
physician_fee_freeze_n 0.6075 -1.7931
physician_fee_freeze_other 236.7622 117.0218
physician_fee_freeze_y -2.0976 3.0399
Projection functions coefficients:
Can1 Can2
handicapped_infants_n -0.1318 0.1776
handicapped_infants_other 1.3807 0.6502
handicapped_infants_y 0.0778 -0.2659
water_project_cost_sharin_n -0.0602 -0.0737
water_project_cost_sharin_other 0.4705 0.2686
water_project_cost_sharin_y -0.0566 0.0065
adoption_of_the_budget_res_n -0.2194 0.3110
adoption_of_the_budget_res_other 1.4817 0.8338
adoption_of_the_budget_res_y 0.0838 -0.2464
physician_fee_freeze_n 0.0862 -0.2545
physician_fee_freeze_other 1.4968 0.7398
physician_fee_freeze_y -0.2134 0.3092
Multivariate Analysis of Variance (MANOVA) Summary:
Statistic Value p-value
0 Wilks' Lambda 0.2772 NaN
1 Bartlett -- C(2) 554.1935 0.0
2 Rao -- F(2,432) 563.0956 0.0
LDA Classification functions & Statistical Evaluation:
democrat republican Wilks' Lambda Partial R-Square F Value Num DF Den DF Pr>F
Constant -1.3045 -3.0134 NaN NaN NaN NaN NaN NaN
Can1 1.6126 -2.5629 0.4479 0.6190 265.9263 1.0 432.0 0.0
Can2 -2.9688 4.7182 0.8293 0.3343 860.2649 1.0 432.0 0.0
Classification Summary for Calibration Data:
Observation Profile:
Read Used
Number of Observations 435 435
Number of Observations Classified into group:
prediction democrat republican Total
group
democrat 244 23 267
republican 14 154 168
Total 258 177 435
Percent Classified into group:
prediction democrat republican Total
group
democrat 91.3858 8.6142 100.0
republican 8.3333 91.6667 100.0
Total 59.3103 40.6897 100.0
Priors 0.6138 0.3862 NaN
Error Count Estimates for group:
democrat republican Total
Rate 0.0861 0.0833 0.0851
Priors 0.6138 0.3862 NaN
Classification Report for group:
precision recall f1-score support
democrat 0.9457 0.9139 0.9295 267.0000
republican 0.8701 0.9167 0.8928 168.0000
accuracy 0.9149 0.9149 0.9149 0.9149
macro avg 0.9079 0.9153 0.9111 435.0000
weighted avg 0.9165 0.9149 0.9153 435.0000
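The error rates and the classification report can be recomputed from the confusion matrix printed in the summary. The sketch below uses only those four counts. The variable names are illustrative and this is not how summaryGFALDA necessarily computes them.

```python
import numpy as np

# Confusion matrix from the summary: rows = true group, columns = prediction
# (order: democrat, republican)
cm = np.array([
    [244, 23],
    [14, 154],
])

recall = np.diag(cm) / cm.sum(axis=1)      # share of each true group recovered
precision = np.diag(cm) / cm.sum(axis=0)   # share of each prediction that is right
accuracy = np.trace(cm) / cm.sum()

# Per-class error rate and the prior-weighted total error
error_rate = 1 - recall
priors = cm.sum(axis=1) / cm.sum()
total_error = float((priors * error_rate).sum())

print(recall.round(4), precision.round(4))
print(round(accuracy, 4), round(total_error, 4))
```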