DiCA - canines dataset#
[1]:
#disable warnings
from warnings import simplefilter, filterwarnings
simplefilter(action='ignore', category=FutureWarning)
filterwarnings("ignore")
canines data#
[2]:
#canines dataset
from discrimintools.datasets import load_canines
D = load_canines()
D.info()
<class 'pandas.core.frame.DataFrame'>
Index: 27 entries, Beauceron to Terre-Neuve
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Taille 27 non-null object
1 Poids 27 non-null object
2 Velocite 27 non-null object
3 Intelligence 27 non-null object
4 Affection 27 non-null object
5 Agressivite 27 non-null object
6 Fonction 27 non-null object
dtypes: object(7)
memory usage: 1.7+ KB
[3]:
#display
print(D)
Taille Poids Velocite Intelligence Affection Agressivite \
Chien
Beauceron Taille++ Poids+ Veloc++ Intell+ Affec+ Agress+
Basset Taille- Poids- Veloc- Intell- Affec- Agress+
Berger All Taille++ Poids+ Veloc++ Intell++ Affec+ Agress+
Boxer Taille+ Poids+ Veloc+ Intell+ Affec+ Agress+
Bull-Dog Taille- Poids- Veloc- Intell+ Affec+ Agress-
Bull-Mastif Taille++ Poids++ Veloc- Intell++ Affec- Agress+
Caniche Taille- Poids- Veloc+ Intell++ Affec+ Agress-
Chihuahua Taille- Poids- Veloc- Intell- Affec+ Agress-
Cocker Taille+ Poids- Veloc- Intell+ Affec+ Agress+
Colley Taille++ Poids+ Veloc++ Intell+ Affec+ Agress-
Dalmatien Taille+ Poids+ Veloc+ Intell+ Affec+ Agress-
Doberman Taille++ Poids+ Veloc++ Intell++ Affec- Agress+
Dogue All Taille++ Poids++ Veloc++ Intell- Affec- Agress+
Epag. Breton Taille+ Poids+ Veloc+ Intell++ Affec+ Agress-
Epag. Français Taille++ Poids+ Veloc+ Intell+ Affec- Agress-
Fox-Hound Taille++ Poids+ Veloc++ Intell- Affec- Agress+
Fox-Terrier Taille- Poids- Veloc+ Intell+ Affec+ Agress+
Gd Bleu Gasc Taille++ Poids+ Veloc+ Intell- Affec- Agress+
Labrador Taille+ Poids+ Veloc+ Intell+ Affec+ Agress-
Levrier Taille++ Poids+ Veloc++ Intell- Affec- Agress-
Mastiff Taille++ Poids++ Veloc- Intell- Affec- Agress+
Pekinois Taille- Poids- Veloc- Intell- Affec+ Agress-
Pointer Taille++ Poids+ Veloc++ Intell++ Affec- Agress-
St-Bernard Taille++ Poids++ Veloc- Intell+ Affec- Agress+
Setter Taille++ Poids+ Veloc++ Intell+ Affec- Agress-
Teckel Taille- Poids- Veloc- Intell+ Affec+ Agress-
Terre-Neuve Taille++ Poids++ Veloc- Intell+ Affec- Agress-
Fonction
Chien
Beauceron utilite
Basset chasse
Berger All utilite
Boxer compagnie
Bull-Dog compagnie
Bull-Mastif utilite
Caniche compagnie
Chihuahua compagnie
Cocker compagnie
Colley compagnie
Dalmatien compagnie
Doberman utilite
Dogue All utilite
Epag. Breton chasse
Epag. Français chasse
Fox-Hound chasse
Fox-Terrier compagnie
Gd Bleu Gasc chasse
Labrador chasse
Levrier chasse
Mastiff utilite
Pekinois compagnie
Pointer chasse
St-Bernard utilite
Setter chasse
Teckel compagnie
Terre-Neuve utilite
[4]:
#split into X and y
y, X = D["Fonction"], D.drop(columns=["Fonction"])
Instantiation & training#
[5]:
#instantiation and training
from discrimintools import DiCA
clf = DiCA(n_components=2)
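As background, discriminant correspondence analysis is usually described as a correspondence analysis of the table that counts, for each class, how many individuals show each category. The sketch below builds such a count table in plain Python for four dogs copied from the data above. It is only an illustration of the idea; how `DiCA` builds its table internally is an assumption here, and the names `rows` and `counts` are hypothetical.

```python
from collections import Counter, defaultdict

# Four rows copied from the canines table above: (categories, class label).
# Only three of the six variables are kept to keep the sketch short.
rows = [
    (("Taille++", "Poids+", "Veloc++"), "utilite"),    # Beauceron
    (("Taille-", "Poids-", "Veloc-"), "chasse"),       # Basset
    (("Taille+", "Poids+", "Veloc+"), "compagnie"),    # Boxer
    (("Taille++", "Poids+", "Veloc++"), "compagnie"),  # Colley
]

# counts[label][category] = number of dogs of that class showing the category.
counts = defaultdict(Counter)
for categories, label in rows:
    counts[label].update(categories)

for label in sorted(counts):
    print(label, dict(counts[label]))
```

Each row of the resulting table is a class profile; the canonical axes reported later in this notebook summarise the differences between such profiles.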
fit function#
[6]:
#fit function
clf.fit(X,y)
[6]:
DiCA()
Parameters
| n_components | 2 |
| classes | None |
decision_function function#
[7]:
#decision_function function
print(clf.decision_function(X))
chasse compagnie utilite
Beauceron -1.244809 -1.446530 -1.365716
Basset -1.480667 -1.113999 -1.665444
Berger All -1.271030 -1.691342 -1.291778
Boxer -1.299015 -1.113514 -1.997701
Bull-Dog -2.268919 -1.096344 -2.672961
Bull-Mastif -2.174303 -2.577337 -1.338694
Caniche -1.766395 -1.012410 -2.495278
Chihuahua -1.919991 -1.011839 -2.447831
Cocker -1.989424 -1.083387 -2.160431
Colley -1.143651 -1.323120 -1.626166
Dalmatien -1.432369 -1.224616 -2.492662
Doberman -1.355346 -2.294889 -1.394638
Dogue All -1.740831 -2.536920 -1.276402
Epag. Breton -1.273821 -1.284657 -2.233954
Epag. Français -1.118632 -1.670110 -1.897469
Fox-Hound -1.266159 -2.251535 -1.529409
Fox-Terrier -1.972929 -1.022607 -2.440365
Gd Bleu Gasc -1.152390 -1.990544 -1.693418
Labrador -1.432369 -1.224616 -2.492662
Levrier -1.193967 -2.157090 -1.818824
Mastiff -1.862985 -2.311853 -1.251334
Pekinois -1.919991 -1.011839 -2.447831
Pointer -1.198674 -2.115964 -1.599573
St-Bernard -2.053236 -2.237680 -1.317787
Setter -1.122356 -1.821056 -1.623415
Teckel -2.268919 -1.096344 -2.672961
Terre-Neuve -1.623870 -1.786062 -1.250028
eval_predict function#
[8]:
#eval_predict function
evl = clf.eval_predict(X,y,verbose=True)
Observation Profile:
Read Used
Number of Observations 27 27
Number of Observations Classified into Fonction:
prediction chasse compagnie utilite Total
Fonction
chasse 7 2 0 9
compagnie 1 9 0 10
utilite 3 0 5 8
Total 11 11 5 27
Percent Classified into Fonction:
prediction chasse compagnie utilite Total
Fonction
chasse 77.777778 22.222222 0.000000 100.0
compagnie 10.000000 90.000000 0.000000 100.0
utilite 37.500000 0.000000 62.500000 100.0
Total 40.740741 40.740741 18.518519 100.0
Priors 0.333333 0.370370 0.296296 NaN
Error Count Estimates for Fonction:
chasse compagnie utilite Total
Rate 0.222222 0.10000 0.375000 0.222222
Priors 0.333333 0.37037 0.296296 NaN
Classification Report for Fonction:
precision recall f1-score support
chasse 0.636364 0.777778 0.700000 9.000000
compagnie 0.818182 0.900000 0.857143 10.000000
utilite 1.000000 0.625000 0.769231 8.000000
accuracy 0.777778 0.777778 0.777778 0.777778
macro avg 0.818182 0.767593 0.775458 27.000000
weighted avg 0.811448 0.777778 0.778714 27.000000
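The figures in the classification report can be recomputed by hand from the confusion matrix. The sketch below uses only the counts printed above and standard definitions of accuracy, precision, recall and F1; it does not call `discrimintools`, and the variable names are illustrative.

```python
labels = ["chasse", "compagnie", "utilite"]
# Rows are true classes, columns are predicted classes (copied from the table above).
confusion = [
    [7, 2, 0],
    [1, 9, 0],
    [3, 0, 5],
]

total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(3)) / total

report = {}
for j, label in enumerate(labels):
    predicted = sum(confusion[i][j] for i in range(3))  # column total
    actual = sum(confusion[j])                           # row total
    precision = confusion[j][j] / predicted
    recall = confusion[j][j] / actual
    f1 = 2 * precision * recall / (precision + recall)
    report[label] = (precision, recall, f1)

print("accuracy", round(accuracy, 6))
for label, values in report.items():
    print(label, [round(v, 6) for v in values])
```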
fit_transform function#
[9]:
#fit_transform function
print(clf.fit_transform(X,y))
Can1 Can2
Beauceron -0.231936 0.060037
Basset 0.241638 0.295881
Berger All -0.459741 0.032325
Boxer 0.413511 -0.224446
Bull-Dog 0.990876 0.523040
Bull-Mastif -0.963955 0.754635
Caniche 0.866811 0.039503
Chihuahua 0.863318 0.285415
Cocker 0.646090 0.581656
Colley 0.003587 -0.229940
Dalmatien 0.649034 -0.514423
Doberman -0.845897 -0.247187
Dogue All -1.040478 0.245472
Epag. Breton 0.421230 -0.542135
Epag. Français -0.102060 -0.666027
Fox-Hound -0.745651 -0.457100
Fox-Terrier 0.859092 0.357192
Gd Bleu Gasc -0.465142 -0.613675
Labrador 0.649034 -0.514423
Levrier -0.510127 -0.747077
Mastiff -0.863708 0.544722
Pekinois 0.863318 0.285415
Pointer -0.610374 -0.537164
St-Bernard -0.736151 0.782347
Setter -0.382570 -0.509452
Teckel 0.990876 0.523040
Terre-Neuve -0.500627 0.492370
pred_table function#
[10]:
#pred_table function
print(clf.pred_table(X,y))
prediction chasse compagnie utilite
Fonction
chasse 7 2 0
compagnie 1 9 0
utilite 3 0 5
predict function#
[11]:
#predict function
print(clf.predict(X))
Beauceron chasse
Basset compagnie
Berger All chasse
Boxer compagnie
Bull-Dog compagnie
Bull-Mastif utilite
Caniche compagnie
Chihuahua compagnie
Cocker compagnie
Colley chasse
Dalmatien compagnie
Doberman chasse
Dogue All utilite
Epag. Breton chasse
Epag. Français chasse
Fox-Hound chasse
Fox-Terrier compagnie
Gd Bleu Gasc chasse
Labrador compagnie
Levrier chasse
Mastiff utilite
Pekinois compagnie
Pointer chasse
St-Bernard utilite
Setter chasse
Teckel compagnie
Terre-Neuve utilite
Name: prediction, dtype: object
predict_log_proba function#
[12]:
#predict_log_proba function
print(clf.predict_log_proba(X))
chasse compagnie utilite
Beauceron -0.994524 -1.196246 -1.115432
Basset -1.186076 -0.819408 -1.370854
Berger All -0.969380 -1.389691 -0.990127
Boxer -0.993644 -0.808143 -1.692330
Bull-Dog -1.588810 -0.416235 -1.992852
Bull-Mastif -1.379901 -1.782935 -0.544292
Caniche -1.283126 -0.529141 -2.012009
Chihuahua -1.403548 -0.495396 -1.931389
Cocker -1.462633 -0.556596 -1.633640
Colley -0.897289 -1.076757 -1.379804
Dalmatien -0.946728 -0.738975 -2.007021
Doberman -0.855384 -1.794926 -0.894675
Dogue All -1.112579 -1.908669 -0.648150
Epag. Breton -0.863761 -0.874597 -1.823893
Epag. Français -0.710514 -1.261992 -1.489351
Fox-Hound -0.761669 -1.747045 -1.024919
Fox-Terrier -1.438210 -0.487888 -1.905646
Gd Bleu Gasc -0.700450 -1.538603 -1.241477
Labrador -0.946728 -0.738975 -2.007021
Levrier -0.650781 -1.613904 -1.275638
Mastiff -1.247556 -1.696424 -0.635905
Pekinois -1.403548 -0.495396 -1.931389
Pointer -0.727219 -1.644509 -1.128118
St-Bernard -1.365577 -1.550022 -0.630128
Setter -0.743422 -1.442122 -1.244481
Teckel -1.588810 -0.416235 -1.992852
Terre-Neuve -1.195008 -1.357200 -0.821167
predict_proba function#
[13]:
#predict_proba function
print(clf.predict_proba(X))
chasse compagnie utilite
Beauceron 0.369899 0.302327 0.327774
Basset 0.305417 0.440693 0.253890
Berger All 0.379318 0.249152 0.371529
Boxer 0.370225 0.445685 0.184090
Bull-Dog 0.204169 0.659525 0.136306
Bull-Mastif 0.251603 0.168144 0.580253
Caniche 0.277169 0.589111 0.133720
Chihuahua 0.245724 0.609330 0.144947
Cocker 0.231626 0.573157 0.195218
Colley 0.407674 0.340698 0.251628
Dalmatien 0.388008 0.477603 0.134388
Doberman 0.425120 0.166140 0.408740
Dogue All 0.328710 0.148278 0.523012
Epag. Breton 0.421574 0.417030 0.161396
Epag. Français 0.491391 0.283090 0.225519
Fox-Hound 0.466886 0.174288 0.358825
Fox-Terrier 0.237352 0.613921 0.148727
Gd Bleu Gasc 0.496362 0.214681 0.288957
Labrador 0.388008 0.477603 0.134388
Levrier 0.521638 0.199109 0.279253
Mastiff 0.287206 0.183338 0.529456
Pekinois 0.245724 0.609330 0.144947
Pointer 0.483251 0.193107 0.323642
St-Bernard 0.255233 0.212243 0.532523
Setter 0.475484 0.236426 0.288090
Teckel 0.204169 0.659525 0.136306
Terre-Neuve 0.302701 0.257380 0.439918
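The printed values suggest that the probabilities are a softmax of the decision scores: subtracting the log-sum-exp of a row of `decision_function` gives the matching row of `predict_log_proba`, and exponentiating gives `predict_proba`. The sketch below checks this for Beauceron with numbers copied from the outputs above. That the library computes them this way is an assumption inferred from the numbers.

```python
import math

# Decision scores for Beauceron, copied from the decision_function output.
scores = {"chasse": -1.244809, "compagnie": -1.446530, "utilite": -1.365716}

# Log-sum-exp normaliser shared by the three classes.
log_norm = math.log(sum(math.exp(s) for s in scores.values()))

# Log-probabilities, then probabilities.
log_proba = {k: s - log_norm for k, s in scores.items()}
proba = {k: math.exp(v) for k, v in log_proba.items()}

for k in scores:
    print(k, round(log_proba[k], 6), round(proba[k], 6))
```

The three probabilities sum to 1, and the largest one (chasse) matches the class returned by `predict` for Beauceron.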
score function#
[14]:
#score function
print("Accuracy = {}%".format(round(100*clf.score(X,y))))
Accuracy = 78%
transform function#
[15]:
#transform function
print(clf.transform(X))
Can1 Can2
Beauceron -0.231936 0.060037
Basset 0.241638 0.295881
Berger All -0.459741 0.032325
Boxer 0.413511 -0.224446
Bull-Dog 0.990876 0.523040
Bull-Mastif -0.963955 0.754635
Caniche 0.866811 0.039503
Chihuahua 0.863318 0.285415
Cocker 0.646090 0.581656
Colley 0.003587 -0.229940
Dalmatien 0.649034 -0.514423
Doberman -0.845897 -0.247187
Dogue All -1.040478 0.245472
Epag. Breton 0.421230 -0.542135
Epag. Français -0.102060 -0.666027
Fox-Hound -0.745651 -0.457100
Fox-Terrier 0.859092 0.357192
Gd Bleu Gasc -0.465142 -0.613675
Labrador 0.649034 -0.514423
Levrier -0.510127 -0.747077
Mastiff -0.863708 0.544722
Pekinois 0.863318 0.285415
Pointer -0.610374 -0.537164
St-Bernard -0.736151 0.782347
Setter -0.382570 -0.509452
Teckel 0.990876 0.523040
Terre-Neuve -0.500627 0.492370
Canonical coefficients#
[16]:
#canonical coefficients
cancoef = clf.cancoef_
for i, k in enumerate(cancoef._fields):
print("\n{} coefficients".format(k))
print(cancoef[i].round(3))
standardized coefficients
Can1 Can2
Taille+ 1.046 -0.786
Taille++ -1.143 -0.019
Taille- 1.702 0.602
Poids+ -0.270 -1.403
Poids++ -2.039 2.812
Poids- 1.748 0.698
Veloc+ 0.792 -1.326
Veloc++ -0.892 -0.387
Veloc- 0.169 1.409
Intell+ 0.531 0.459
Intell++ -0.836 0.293
Intell- -0.235 -0.966
Affec+ 1.116 0.807
Affec- -1.201 -0.870
Agress+ -0.733 0.902
Agress- 0.680 -0.838
projection coefficients
Can1 Can2
Taille+ 0.174 -0.131
Taille++ -0.191 -0.003
Taille- 0.284 0.100
Poids+ -0.045 -0.234
Poids++ -0.340 0.469
Poids- 0.291 0.116
Veloc+ 0.132 -0.221
Veloc++ -0.149 -0.064
Veloc- 0.028 0.235
Intell+ 0.088 0.077
Intell++ -0.139 0.049
Intell- -0.039 -0.161
Affec+ 0.186 0.135
Affec- -0.200 -0.145
Agress+ -0.122 0.150
Agress- 0.113 -0.140
Summary#
[17]:
#summary
from discrimintools import summaryDiCA
summaryDiCA(clf, detailed=True)
Discriminant Correspondence Analysis - Results
Class Level Information:
Frequency Proportion Prior Probability
chasse 9 0.3333 0.3333
compagnie 10 0.3704 0.3704
utilite 8 0.2963 0.2963
Importance of components:
Eigenvalue Difference Proportion (%) Cumulative (%)
Can1 0.3459 0.2274 74.4893 74.4893
Can2 0.1184 NaN 25.5107 100.0000
Canonical correlation:
Eigenvalue Total SS Eta Sq. Canonical Correlation
Can1 0.3459 12.7034 0.7351 0.8574
Can2 0.1184 6.1735 0.5180 0.7198
Classification (projection) coefficients:
Can1 Can2
Taille+ 0.1744 -0.1310
Taille++ -0.1905 -0.0031
Taille- 0.2837 0.1003
Poids+ -0.0451 -0.2339
Poids++ -0.3399 0.4687
Poids- 0.2913 0.1164
Veloc+ 0.1319 -0.2210
Veloc++ -0.1486 -0.0644
Veloc- 0.0282 0.2348
Intell+ 0.0884 0.0766
Intell++ -0.1394 0.0489
Intell- -0.0391 -0.1611
Affec+ 0.1859 0.1346
Affec- -0.2002 -0.1449
Agress+ -0.1221 0.1504
Agress- 0.1134 -0.1396
Classification Summary for Calibration Data:
Observation Profile:
Read Used
Number of Observations 27 27
Number of Observations Classified into Fonction:
prediction chasse compagnie utilite Total
Fonction
chasse 7 2 0 9
compagnie 1 9 0 10
utilite 3 0 5 8
Total 11 11 5 27
Percent Classified into Fonction:
prediction chasse compagnie utilite Total
Fonction
chasse 77.7778 22.2222 0.0000 100.0
compagnie 10.0000 90.0000 0.0000 100.0
utilite 37.5000 0.0000 62.5000 100.0
Total 40.7407 40.7407 18.5185 100.0
Priors 0.3333 0.3704 0.2963 NaN
Error Count Estimates for Fonction:
chasse compagnie utilite Total
Rate 0.2222 0.1000 0.3750 0.2222
Priors 0.3333 0.3704 0.2963 NaN
Classification Report for Fonction:
precision recall f1-score support
chasse 0.6364 0.7778 0.7000 9.0000
compagnie 0.8182 0.9000 0.8571 10.0000
utilite 1.0000 0.6250 0.7692 8.0000
accuracy 0.7778 0.7778 0.7778 0.7778
macro avg 0.8182 0.7676 0.7755 27.0000
weighted avg 0.8114 0.7778 0.7787 27.0000
Evaluation of prediction on testing dataset#
[18]:
#test data
DTest = load_canines("test")
DTest.info()
<class 'pandas.core.frame.DataFrame'>
Index: 6 entries, Medor to Wisky
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Taille 6 non-null object
1 Poids 6 non-null object
2 Velocite 6 non-null object
3 Intelligence 6 non-null object
4 Affection 6 non-null object
5 Agressivite 6 non-null object
6 Fonction 6 non-null object
dtypes: object(7)
memory usage: 384.0+ bytes
[19]:
#display test data
print(DTest)
Taille Poids Velocite Intelligence Affection Agressivite \
Chien
Medor Taille+ Poids- Veloc- Intell++ Affec- Agress+
Djeck Taille++ Poids++ Veloc+ Intell+ Affec+ Agress-
Taico Taille- Poids+ Veloc++ Intell++ Affec+ Agress+
Rocky Taille+ Poids+ Veloc+ Intell- Affec+ Agress-
Boudog Taille- Poids- Veloc++ Intell+ Affec- Agress+
Wisky Taille+ Poids++ Veloc- Intell- Affec+ Agress+
Fonction
Chien
Medor chasse
Djeck compagnie
Taico utilite
Rocky chasse
Boudog compagnie
Wisky utilite
[20]:
#split into X and y
yTest, XTest = DTest["Fonction"], DTest.drop(columns=["Fonction"])
Coordinates of new individuals#
[21]:
#coordinates of new individuals
print(clf.transform(XTest))
Can1 Can2
Medor 0.032129 0.274433
Djeck -0.010731 0.316056
Taico 0.014460 0.135778
Rocky 0.521477 -0.752048
Boudog 0.192426 0.234255
Wisky -0.112613 0.696326
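In the printed numbers, the coordinates of a new individual equal the sum of the projection coefficients of the six categories it takes. The sketch below checks this for Medor, using coefficients copied from the `cancoef_.projection` table of this notebook. Treat the rule as an observation on these outputs, not as a statement of the library's internals.

```python
# Projection coefficients (Can1, Can2) for Medor's six categories,
# copied from the projection coefficients table of this notebook.
projection = {
    "Taille+":  (0.174416, -0.131042),
    "Poids-":   (0.291266, 0.116385),
    "Veloc-":   (0.028185, 0.234810),
    "Intell++": (-0.139386, 0.048854),
    "Affec-":   (-0.200229, -0.144932),
    "Agress+":  (-0.122123, 0.150359),
}

# Medor's profile in the test data.
medor = ["Taille+", "Poids-", "Veloc-", "Intell++", "Affec-", "Agress+"]

# Sum the coefficients axis by axis.
can1 = sum(projection[c][0] for c in medor)
can2 = sum(projection[c][1] for c in medor)
print(round(can1, 6), round(can2, 6))
```

Up to rounding in the last digit, this reproduces the row printed for Medor by `clf.transform(XTest)`.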
Prediction on new individuals#
[22]:
#predict on new individuals
print(clf.predict(XTest))
Medor compagnie
Djeck compagnie
Taico compagnie
Rocky chasse
Boudog compagnie
Wisky utilite
Name: prediction, dtype: object
Evaluation#
[23]:
#eval on new data
eval_test = clf.eval_predict(XTest, yTest,verbose=True)
Observation Profile:
Read Used
Number of Observations 6 6
Number of Observations Classified into Fonction:
prediction chasse compagnie utilite Total
Fonction
chasse 1 1 0 2
compagnie 0 2 0 2
utilite 0 1 1 2
Total 1 4 1 6
Percent Classified into Fonction:
prediction chasse compagnie utilite Total
Fonction
chasse 50.000000 50.000000 0.000000 100.0
compagnie 0.000000 100.000000 0.000000 100.0
utilite 0.000000 50.000000 50.000000 100.0
Total 16.666667 66.666667 16.666667 100.0
Priors 0.333333 0.370370 0.296296 NaN
Error Count Estimates for Fonction:
chasse compagnie utilite Total
Rate 0.500000 0.00000 0.500000 0.314815
Priors 0.333333 0.37037 0.296296 NaN
Classification Report for Fonction:
precision recall f1-score support
chasse 1.000000 0.500000 0.666667 2.000000
compagnie 0.500000 1.000000 0.666667 2.000000
utilite 1.000000 0.500000 0.666667 2.000000
accuracy 0.666667 0.666667 0.666667 0.666667
macro avg 0.833333 0.666667 0.666667 6.000000
weighted avg 0.833333 0.666667 0.666667 6.000000
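The total error rate printed above (0.314815) differs from 1 minus the accuracy (0.333333). The numbers are consistent with a total that weights each per-class error rate by the class prior, as sketched below. The priors are the training proportions shown in the `Priors` row; the weighting rule itself is an assumption inferred from the output.

```python
# Test-set confusion matrix copied from the output above
# (rows: true class, columns: predicted class).
labels = ["chasse", "compagnie", "utilite"]
confusion = [
    [1, 1, 0],
    [0, 2, 0],
    [0, 1, 1],
]
# Priors taken from the training class frequencies (9, 10 and 8 dogs out of 27).
priors = [9 / 27, 10 / 27, 8 / 27]

# Per-class error rate = share of each true class that is misclassified.
rates = [1 - confusion[i][i] / sum(confusion[i]) for i in range(3)]

# Total weighted by the priors (assumed rule, inferred from the printed values).
weighted_total = sum(p * r for p, r in zip(priors, rates))

# Plain misclassification rate, i.e. 1 - accuracy.
n = sum(sum(row) for row in confusion)
plain_total = 1 - sum(confusion[i][i] for i in range(3)) / n

print(round(weighted_total, 6), round(plain_total, 6))
```

On the training data the two quantities coincide (0.222222) because the priors equal the observed class proportions there.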
Correspondence Analysis#
Eigenvalues#
[24]:
#eigenvalues
print(clf.eig_)
Eigenvalue Difference Proportion (%) Cumulative (%)
Can1 0.345864 0.227414 74.48927 74.48927
Can2 0.118450 NaN 25.51073 100.00000
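The derived columns of this table follow directly from the eigenvalues: each proportion is the eigenvalue divided by the sum of eigenvalues, and the difference is the gap to the next eigenvalue. A short check using the two printed values:

```python
# Eigenvalues copied from the table above.
eigenvalues = {"Can1": 0.345864, "Can2": 0.118450}

total = sum(eigenvalues.values())

# Share of the total carried by each axis, in percent.
proportions = {name: 100 * value / total for name, value in eigenvalues.items()}

# Gap between the first and second eigenvalue.
difference = eigenvalues["Can1"] - eigenvalues["Can2"]

cumulative = 0.0
for name, share in proportions.items():
    cumulative += share
    print(name, round(share, 5), round(cumulative, 5))
print("difference", round(difference, 6))
```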
Classes#
[25]:
#classes
classes = clf.classes_
classes._fields
[25]:
('infos', 'coord', 'eucl', 'gen')
Class-level information#
[26]:
#classes level information
print(classes.infos)
Frequency Proportion Prior Probability
chasse 9 0.333333 0.333333
compagnie 10 0.370370 0.370370
utilite 8 0.296296 0.296296
Classes coordinates#
[27]:
#classes coordinates
print(classes.coord)
Can1 Can2
chasse -0.167113 -0.476797
compagnie 0.714651 0.162645
utilite -0.705312 0.333090
Classes squared euclidean distance#
[28]:
#classes squared euclidean distance
print(classes.eucl)
chasse compagnie utilite
chasse 0.000000 1.186395 0.945574
compagnie 1.186395 0.000000 2.045346
utilite 0.945574 2.045346 0.000000
Classes generalized squared distance#
[29]:
#classes generalized squared distance
print(classes.gen)
chasse compagnie utilite
chasse 2.197225 3.172899 3.378365
compagnie 3.383620 1.986504 4.478137
utilite 3.142799 4.031850 2.432791
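Both distance tables can be recovered from the class coordinates and the priors. In the printed numbers, the squared Euclidean distance is computed between class coordinates on the two canonical axes, and the generalized squared distance in column j adds the term -2·ln(prior of class j). The sketch below reproduces a few cells; the formula is inferred from the outputs and is an assumption about the library.

```python
import math

# Class coordinates (Can1, Can2) and priors copied from the outputs above.
coord = {
    "chasse":    (-0.167113, -0.476797),
    "compagnie": (0.714651, 0.162645),
    "utilite":   (-0.705312, 0.333090),
}
priors = {"chasse": 9 / 27, "compagnie": 10 / 27, "utilite": 8 / 27}


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# eucl[i, j]: squared distance between the barycenters of classes i and j.
eucl = {(i, j): sq_dist(coord[i], coord[j]) for i in coord for j in coord}

# gen[i, j]: squared distance plus the prior penalty of the column class j.
gen = {(i, j): eucl[i, j] - 2 * math.log(priors[j]) for i in coord for j in coord}

print(round(eucl["chasse", "compagnie"], 6), round(gen["chasse", "compagnie"], 6))
```

The prior term explains why the generalized table is not symmetric and why its diagonal is not zero.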
Variables#
[30]:
#variables
var = clf.var_
var._fields
[30]:
('coord', 'eta2')
Variables coordinates#
[31]:
#variables coordinates
print(var.coord)
Can1 Can2
Taille+ 0.615447 -0.270601
Taille++ -0.672278 -0.006473
Taille- 1.000991 0.207157
Poids+ -0.158972 -0.482984
Poids++ -1.199302 0.967820
Poids- 1.027765 0.240335
Veloc+ 0.465513 -0.456397
Veloc++ -0.524295 -0.133070
Veloc- 0.099455 0.484880
Intell+ 0.311993 0.158107
Intell++ -0.491839 0.100882
Intell- -0.138108 -0.332586
Affec+ 0.656065 0.277906
Affec- -0.706532 -0.299283
Agress+ -0.430926 0.310489
Agress- 0.400145 -0.288311
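In these outputs, the category coordinates appear to equal the standardized canonical coefficients multiplied by the square root of the eigenvalue of each axis. The sketch below checks three categories with values copied from this notebook; the relation is an observation on the printed numbers, not a documented definition.

```python
import math

# Eigenvalues of Can1 and Can2, copied from clf.eig_.
eigenvalues = (0.345864, 0.118450)

# Standardized canonical coefficients (Can1, Can2) for three categories,
# copied from the standardized coefficients table of this notebook.
standardized = {
    "Taille+": (1.046497, -0.786254),
    "Poids++": (-2.039276, 2.812078),
    "Agress-": (0.680402, -0.837712),
}

# Scale each coefficient by the square root of its axis eigenvalue.
coord = {
    category: tuple(c * math.sqrt(ev) for c, ev in zip(coefs, eigenvalues))
    for category, coefs in standardized.items()
}

for category, (can1, can2) in coord.items():
    print(category, round(can1, 6), round(can2, 6))
```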
Variables squared correlation ratio#
[32]:
#variables squared correlation ratio
print(var.eta2)
Can1 Can2
Taille 0.859012 0.184116
Poids 0.686207 0.804114
Velocite 0.328546 0.664200
Intelligence 0.124170 0.012718
Affection 0.661297 0.010539
Agressivite 0.205250 0.107174
Individuals#
[33]:
#individuals
ind = clf.ind_
ind._fields
[33]:
('coord', 'eucl', 'gen')
Individuals coordinates#
[34]:
#individuals coordinates
print(ind.coord)
Can1 Can2
Beauceron -0.231936 0.060037
Basset 0.241638 0.295881
Berger All -0.459741 0.032325
Boxer 0.413511 -0.224446
Bull-Dog 0.990876 0.523040
Bull-Mastif -0.963955 0.754635
Caniche 0.866811 0.039503
Chihuahua 0.863318 0.285415
Cocker 0.646090 0.581656
Colley 0.003587 -0.229940
Dalmatien 0.649034 -0.514423
Doberman -0.845897 -0.247187
Dogue All -1.040478 0.245472
Epag. Breton 0.421230 -0.542135
Epag. Français -0.102060 -0.666027
Fox-Hound -0.745651 -0.457100
Fox-Terrier 0.859092 0.357192
Gd Bleu Gasc -0.465142 -0.613675
Labrador 0.649034 -0.514423
Levrier -0.510127 -0.747077
Mastiff -0.863708 0.544722
Pekinois 0.863318 0.285415
Pointer -0.610374 -0.537164
St-Bernard -0.736151 0.782347
Setter -0.382570 -0.509452
Teckel 0.990876 0.523040
Terre-Neuve -0.500627 0.492370
Individuals squared euclidean distance#
[35]:
#individuals squared euclidean distance
print(ind.eucl)
chasse compagnie utilite
Beauceron 0.292393 0.906557 0.298642
Basset 0.764109 0.241493 0.898098
Berger All 0.344836 1.396180 0.150765
Boxer 0.400805 0.240525 1.562610
Bull-Dog 2.340613 0.206184 2.913132
Bull-Mastif 2.151381 3.168171 0.244597
Caniche 1.335566 0.038317 2.557765
Chihuahua 1.642757 0.037174 2.462872
Cocker 1.781623 0.180271 1.888071
Colley 0.090077 0.659736 0.819541
Dalmatien 0.667513 0.462727 2.552532
Doberman 0.513468 2.603274 0.356485
Dogue All 1.284437 3.087337 0.120013
Epag. Breton 0.350417 0.582811 2.035116
Epag. Français 0.040040 1.353716 1.362147
Fox-Hound 0.335093 2.516566 0.626027
Fox-Terrier 1.748634 0.058711 2.447939
Gd Bleu Gasc 0.107556 1.994584 0.954046
Labrador 0.667513 0.462727 2.552532
Levrier 0.190710 2.327676 1.204858
Mastiff 1.528746 2.637202 0.069878
Pekinois 1.642757 0.037174 2.462872
Pointer 0.200124 2.245424 0.766355
St-Bernard 1.909247 2.488857 0.202783
Setter 0.047488 1.655608 0.814040
Teckel 2.340613 0.206184 2.913132
Terre-Neuve 1.050516 1.585620 0.067266
Individuals generalized squared distance#
[36]:
#individuals generalized squared distance
print(ind.gen)
chasse compagnie utilite
Beauceron 2.489617 2.893060 2.731433
Basset 2.961333 2.227997 3.330889
Berger All 2.542060 3.382683 2.583555
Boxer 2.598030 2.227029 3.995401
Bull-Dog 4.537837 2.192688 5.345923
Bull-Mastif 4.348606 5.154674 2.677387
Caniche 3.532790 2.024820 4.990555
Chihuahua 3.839981 2.023678 4.895663
Cocker 3.978847 2.166775 4.320861
Colley 2.287301 2.646239 3.252331
Dalmatien 2.864738 2.449231 4.985323
Doberman 2.710693 4.589777 2.789276
Dogue All 3.481661 5.073841 2.552804
Epag. Breton 2.547642 2.569315 4.467907
Epag. Français 2.237265 3.340220 3.794938
Fox-Hound 2.532318 4.503069 3.058818
Fox-Terrier 3.945859 2.045215 4.880730
Gd Bleu Gasc 2.304781 3.981088 3.386836
Labrador 2.864738 2.449231 4.985323
Levrier 2.387934 4.314180 3.637648
Mastiff 3.725970 4.623705 2.502668
Pekinois 3.839981 2.023678 4.895663
Pointer 2.397348 4.231927 3.199146
St-Bernard 4.106471 4.475361 2.635573
Setter 2.244712 3.642112 3.246830
Teckel 4.537837 2.192688 5.345923
Terre-Neuve 3.247740 3.572124 2.500056
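These distances tie the whole classification together. In the printed numbers, an individual's generalized squared distance to a class is its squared Euclidean distance to the class barycenter minus 2·ln(prior), the decision score is minus half that distance, and the predicted class is the nearest one. The sketch below replays this for Beauceron with values copied from this notebook; the rule is inferred from the outputs and is an assumption about the implementation.

```python
import math

# Beauceron's coordinates, class barycenters and priors, copied from the outputs.
beauceron = (-0.231936, 0.060037)
classes = {
    "chasse":    (-0.167113, -0.476797),
    "compagnie": (0.714651, 0.162645),
    "utilite":   (-0.705312, 0.333090),
}
priors = {"chasse": 9 / 27, "compagnie": 10 / 27, "utilite": 8 / 27}

# Squared Euclidean distance to each class barycenter.
eucl = {
    k: sum((x - c) ** 2 for x, c in zip(beauceron, center))
    for k, center in classes.items()
}

# Generalized squared distance: add the prior penalty of each class.
gen = {k: eucl[k] - 2 * math.log(priors[k]) for k in classes}

# Decision score (assumed to be minus half the generalized distance).
score = {k: -0.5 * gen[k] for k in classes}

# Predicted class: smallest generalized distance.
predicted = min(gen, key=gen.get)
print(predicted, {k: round(v, 6) for k, v in gen.items()})
```

Beauceron is an `utilite` dog in the data, so this is one of the three `utilite` cases assigned to `chasse` in the confusion matrix.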
Discriminant analysis#
Canonical coefficients#
[37]:
#canonical coefficients
cancoef = clf.cancoef_
cancoef._fields
[37]:
('standardized', 'projection')
Standardized canonical coefficients#
[38]:
#standardized canonical coefficients
print(cancoef.standardized)
Can1 Can2
Taille+ 1.046497 -0.786254
Taille++ -1.143132 -0.018807
Taille- 1.702072 0.601912
Poids+ -0.270314 -1.403349
Poids++ -2.039276 2.812078
Poids- 1.747598 0.698312
Veloc+ 0.791551 -1.326097
Veloc++ -0.891503 -0.386645
Veloc- 0.169112 1.408858
Intell+ 0.530508 0.459393
Intell++ -0.836317 0.293122
Intell- -0.234837 -0.966356
Affec+ 1.115564 0.807478
Affec- -1.201376 -0.869592
Agress+ -0.732740 0.902151
Agress- 0.680402 -0.837712
Projection canonical coefficients#
[39]:
#projection canonical coefficients
print(cancoef.projection)
Can1 Can2
Taille+ 0.174416 -0.131042
Taille++ -0.190522 -0.003135
Taille- 0.283679 0.100319
Poids+ -0.045052 -0.233892
Poids++ -0.339879 0.468680
Poids- 0.291266 0.116385
Veloc+ 0.131925 -0.221016
Veloc++ -0.148584 -0.064441
Veloc- 0.028185 0.234810
Intell+ 0.088418 0.076566
Intell++ -0.139386 0.048854
Intell- -0.039140 -0.161059
Affec+ 0.185927 0.134580
Affec- -0.200229 -0.144932
Agress+ -0.122123 0.150359
Agress- 0.113400 -0.139619
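Comparing the two tables, each projection coefficient equals the standardized coefficient divided by 6, the number of explanatory variables in the dataset. The sketch below checks a few entries copied from the outputs above. Whether the library defines the projection coefficients this way is an assumption; the check only shows that the printed values agree with it.

```python
# Number of explanatory variables (Taille, Poids, Velocite, Intelligence,
# Affection, Agressivite).
n_variables = 6

# (standardized, projection) pairs of (Can1, Can2) values copied from the tables.
pairs = {
    "Taille+": ((1.046497, -0.786254), (0.174416, -0.131042)),
    "Poids++": ((-2.039276, 2.812078), (-0.339879, 0.468680)),
    "Agress-": ((0.680402, -0.837712), (0.113400, -0.139619)),
}

# Largest absolute gap between standardized / 6 and the printed projection value.
max_gap = max(
    abs(s / n_variables - p)
    for standardized, projection in pairs.values()
    for s, p in zip(standardized, projection)
)
print(max_gap < 1e-6)
```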
Plotting#
[40]:
from discrimintools import fviz_dica
Graph of individuals#
[41]:
#graph of individuals
p = fviz_dica(clf,element="ind",repel=True)
p.show()
1 [0.81815924 0.81476315]
9 [ 0.47327908 -0.32005552]
We add the supplementary individuals to the plot.
[42]:
#with supplementary individuals
from discrimintools import add_scatter
p = add_scatter(p,clf.transform(XTest),color="blue",repel=True)
p.show()
1 [-0.43194741 -0.39709514]
9 [-0.62327028 0.01822284]
Graph of variables/categories#
[43]:
#graph of variables/categories
p = fviz_dica(clf,element="var",repel=True)
p.show()
Biplot of individuals and variables/categories#
[ ]:
#biplot of individuals and variables/categories
p = fviz_dica(clf,element="var",repel=False)
p.show()
Graph of qualitative variables#
[45]:
#graph of qualitative variables
p = fviz_dica(clf,element="quali_var",repel=True)
p.show()
Distance between barycenters#
[46]:
#distance between barycenters
p = fviz_dica(clf,element="ind",repel=True)
p.show()
1 [ 0.57373015 -0.49290433]
9 [0.76975701 0.08868503]