Data dimension: 30000 x 25
Missing values: 0
Y table:
Y
    0     1
23364  6636
Y proportion:
Y
     0      1
0.7788 0.2212
==== Descriptive statistics ====
Variable      Mean        SD     Min Median     Max
X1       167484.32 129747.66   10000 140000 1000000
X2            1.60      0.49       1      2       2
X3            1.85      0.79       0      2       6
X4            1.55      0.52       0      2       3
X5           35.49      9.22      21     34      79
X6           -0.02      1.12      -2      0       8
X7           -0.13      1.20      -2      0       8
X8           -0.17      1.20      -2      0       8
X9           -0.22      1.17      -2      0       8
X10          -0.27      1.13      -2      0       8
X11          -0.29      1.15      -2      0       8
X12       51223.33  73635.86 -165580  22382  964511
X13       49179.08  71173.77  -69777  21200  983931
X14       47013.15  69349.39 -157264  20088 1664089
X15       43262.95  64332.86 -170000  19052  891586
X16       40311.40  60797.16  -81334  18104  927171
X17       38871.76  59554.11 -339603  17071  961664
X18        5663.58  16563.28       0   2100  873552
X19        5921.16  23040.87       0   2009 1684259
X20        5225.68  17606.96       0   1800  896040
X21        4826.08  15666.16       0   1500  621000
X22        4799.39  15278.31       0   1500  426529
X23        5215.50  17777.47       0   1500  528666
==== Logistic confusion matrix ====
      Predicted
Actual    0    1
     0 4539  134
     1  980  348
==== Logistic metrics ====
  Accuracy Sensitivity Specificity Precision ErrorRate
1   0.8144       0.262      0.9713     0.722    0.1856
Logistic AUC: 0.7253
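
Note: the logistic metrics above follow from the confusion matrix by the usual definitions, with class 1 treated as the positive class. The sketch below is not the script that produced this log; it is a minimal Python check under that assumption, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tn, fp, fn, tp):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes class 1 is the positive class (an assumption, not stated in the log).
    """
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total
    return {
        "Accuracy": accuracy,
        "Sensitivity": tp / (tp + fn),   # recall on actual 1s
        "Specificity": tn / (tn + fp),   # recall on actual 0s
        "Precision": tp / (tp + fp),     # share of predicted 1s that are correct
        "ErrorRate": 1 - accuracy,
    }


# Counts copied from the logistic confusion matrix (rows = actual, columns = predicted).
logistic = binary_metrics(tn=4539, fp=134, fn=980, tp=348)
for name, value in logistic.items():
    print(f"{name}: {value:.4f}")
```

The same function can be applied to the LDA and QDA confusion matrices further down to cross-check their metric rows.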
==== Top logistic coefficients by |z| ====
    Estimate Std. Error z value Pr(>|z|)
X6    0.5809     0.0198  29.398    0e+00
X18   0.0000     0.0000  -6.904    0e+00
X12   0.0000     0.0000  -5.054    0e+00
X3   -0.1057     0.0234  -4.513    0e+00
X5    0.0085     0.0020   4.235    0e+00
X4   -0.1484     0.0356  -4.174    0e+00
X1    0.0000     0.0000  -4.167    0e+00
X8    0.0915     0.0254   3.608    3e-04
==== PCA variance table ====
     PC Eigenvalue Proportion Cumulative
1   PC1     6.5431     0.2845     0.2845
2   PC2     4.0983     0.1782     0.4627
3   PC3     1.5510     0.0674     0.5301
4   PC4     1.4723     0.0640     0.5941
5   PC5     1.0252     0.0446     0.6387
6   PC6     0.9572     0.0416     0.6803
7   PC7     0.9076     0.0395     0.7198
8   PC8     0.8876     0.0386     0.7584
9   PC9     0.8712     0.0379     0.7962
10 PC10     0.7829     0.0340     0.8303
PCA k eigenvalue>1: 5
PCA k cumulative>=80%: 10
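
Note: the two component counts above can be reproduced from the eigenvalues in the variance table. The sketch below is illustrative Python, not the original script. It assumes the PCA was run on the 23 standardized predictors X1 to X23, so that total variance is 23; the listed proportions are consistent with that (6.5431 / 23 ≈ 0.2845).

```python
# Eigenvalues of PC1..PC10, copied from the PCA variance table.
eigenvalues = [6.5431, 4.0983, 1.5510, 1.4723, 1.0252,
               0.9572, 0.9076, 0.8876, 0.8712, 0.7829]

# Assumption: 23 standardized variables, so the eigenvalues sum to 23 overall.
n_variables = 23

# Rule 1: keep components whose eigenvalue exceeds 1.
k_kaiser = sum(1 for ev in eigenvalues if ev > 1)

# Rule 2: keep the smallest k whose cumulative variance share reaches 80%.
cumulative = 0.0
k_80 = None
for k, ev in enumerate(eigenvalues, start=1):
    cumulative += ev / n_variables
    if cumulative >= 0.80:
        k_80 = k
        break

print("k (eigenvalue > 1):", k_kaiser)
print("k (cumulative >= 80%):", k_80)
```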
==== PCA top variables ====
  PC1_positive PC1_negative PC2_positive PC2_negative
1          X15           X2           X9           X1
2          X16           X4           X8          X20
3          X14           X5           X7          X18
4          X13           X3          X10          X14
5          X17           X1          X11          X15
==== Factor analysis loadings ====
Variable Factor1 Factor2 Factor3 Factor4 Factor5 Communality Uniqueness
X1         0.293  -0.358   0.286   0.149   0.053       0.320      0.680
X2        -0.015  -0.069  -0.013   0.020   0.014       0.006      0.994
X3         0.011   0.127  -0.072  -0.079  -0.024       0.028      0.972
X4        -0.032   0.048  -0.004  -0.007  -0.008       0.003      0.997
X5         0.060  -0.071   0.037   0.006   0.003       0.010      0.990
X6         0.140   0.619  -0.163  -0.063  -0.043       0.435      0.565
X7         0.159   0.755  -0.137  -0.065  -0.039       0.621      0.379
X8         0.131   0.830  -0.061  -0.039  -0.021       0.711      0.289
X9         0.111   0.891  -0.016   0.004   0.069       0.812      0.188
X10        0.111   0.880   0.013   0.078   0.064       0.797      0.203
X11        0.130   0.806  -0.001   0.127   0.052       0.685      0.315
X12        0.915   0.127   0.199  -0.086  -0.119       0.914      0.086
X13        0.933   0.152   0.279  -0.081  -0.127       0.995      0.005
X14        0.934   0.145   0.210  -0.071   0.231       0.995      0.005
X15        0.912   0.157   0.166   0.172   0.115       0.927      0.073
X16        0.909   0.153   0.095   0.359   0.079       0.995      0.005
X17        0.871   0.154   0.110   0.317   0.076       0.901      0.099
X18        0.126  -0.013   0.621   0.116   0.028       0.416      0.584
X19        0.097  -0.053   0.384   0.031   0.668       0.607      0.393
X20        0.095  -0.050   0.371   0.390   0.009       0.302      0.698
X21        0.146  -0.075   0.204   0.407   0.013       0.234      0.766
X22        0.121  -0.060   0.262   0.044   0.103       0.099      0.900
X23        0.128  -0.052   0.278   0.062   0.068       0.105      0.895
Factor variance proportions:
Factor1 Factor2 Factor3 Factor4 Factor5
 0.2295  0.1811  0.0525  0.0294  0.0257
Factor cumulative variance: 0.2295 0.4106 0.4631 0.4925 0.5182
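
Note: in the loadings table, each Communality value is the sum of the squared loadings across the five factors, and Uniqueness is 1 minus Communality. The Python sketch below recomputes this for three rows copied from the table. It is an illustrative check rather than the original script, and because the printed loadings are rounded to three decimals, the recomputed values agree with the reported ones only approximately.

```python
# (loadings on Factor1..Factor5, reported communality), copied from the table.
rows = {
    "X1":  ([0.293, -0.358, 0.286, 0.149, 0.053], 0.320),
    "X9":  ([0.111, 0.891, -0.016, 0.004, 0.069], 0.812),
    "X12": ([0.915, 0.127, 0.199, -0.086, -0.119], 0.914),
}

recomputed = {}
for name, (loadings, reported) in rows.items():
    # Communality: share of the variable's variance explained by the factors.
    communality = sum(value * value for value in loadings)
    recomputed[name] = communality
    print(f"{name}: communality={communality:.3f} (reported {reported:.3f}), "
          f"uniqueness={1 - communality:.3f}")
```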
==== Factor top variables ====
Factor1: X14, X13, X12, X15, X16
Factor2: X9, X10, X8, X11, X7
Factor3: X18, X19, X20, X1, X13
Factor4: X21, X20, X16, X17, X15
Factor5: X19, X14, X13, X12, X15
==== Hierarchical clustering metrics ====
    Method Accuracy Sensitivity Specificity Precision ErrorRate
1   single   0.7786      0.0000      0.9996    0.0000    0.2214
2  average   0.7786      0.0000      0.9996    0.0000    0.2214
3 complete   0.7786      0.0000      0.9996    0.0000    0.2214
4  ward.D2   0.6969      0.1554      0.8506    0.2279    0.3031
==== Kmeans clustering metrics ====
      Algorithm Accuracy Sensitivity Specificity Precision ErrorRate WithinSS
1 Hartigan-Wong   0.6906      0.1492      0.8443    0.2140    0.3094   568474
2         Lloyd   0.6903      0.1493      0.8439    0.2137    0.3097   568474
3         Forgy   0.6903      0.1493      0.8439    0.2137    0.3097   568474
4      MacQueen   0.6903      0.1493      0.8439    0.2137    0.3097   568474
==== Spectral clustering metrics ====
                                 Method Accuracy Sensitivity Specificity Precision ErrorRate
1 manual normalized spectral clustering   0.6389      0.4868       0.682    0.3028    0.3611
==== LDA train confusion ====
      Predicted
Actual     0     1
     0 15832   522
     1  3407  1238
==== LDA train metrics ====
  Accuracy Sensitivity Specificity Precision ErrorRate
1   0.8129      0.2665      0.9681    0.7034    0.1871
==== LDA test confusion ====
      Predicted
Actual    0    1
     0 6770  240
     1 1448  543
==== LDA test metrics ====
  Accuracy Sensitivity Specificity Precision ErrorRate
1   0.8125      0.2727      0.9658    0.6935    0.1875
==== QDA train confusion ====
      Predicted
Actual    0    1
     0 6543 9811
     1  817 3828
==== QDA train metrics ====
  Accuracy Sensitivity Specificity Precision ErrorRate
1   0.4939      0.8241      0.4001    0.2807    0.5061
==== QDA test confusion ====
      Predicted
Actual    0    1
     0 2816 4194
     1  308 1683
==== QDA test metrics ====
  Accuracy Sensitivity Specificity Precision ErrorRate
1   0.4998      0.8453      0.4017    0.2864    0.5002