Comparative study of quality estimation of binary classification

V. V. Starovoitov; Yu. I. Golub

doi:10.37661/1816-0301-2020-17-1-87-101

Abstract

The paper presents the results of an analytical and experimental analysis of seventeen functions used to evaluate binary classification results for arbitrary data. The results are represented by 2×2 error (confusion) matrices. The behavior and properties of the main functions calculated from the elements of such matrices are studied. Classification cases with balanced and imbalanced datasets are analyzed. It is shown that some functions are linearly dependent and that many functions are invariant to transposition of the error matrix, which makes it possible to calculate the estimate without specifying the order in which the elements were written to the matrix.
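The transposition property can be illustrated with a minimal sketch. It assumes the matrix layout [[TP, FN], [FP, TN]] (rows are actual classes, columns are predicted classes), so that transposition swaps FN and FP. The function names and the counts are hypothetical and are not taken from the paper.

```python
# Assumed layout of the 2x2 error matrix: [[TP, FN], [FP, TN]].
# All names and numbers below are illustrative only.

def accuracy(tp, fn, fp, tn):
    """Share of correctly classified objects."""
    return (tp + tn) / (tp + fn + fp + tn)


def f1(tp, fn, fp, tn):
    """F1 score written directly in terms of the matrix cells."""
    return 2 * tp / (2 * tp + fp + fn)


def sensitivity(tp, fn, fp, tn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def transpose(tp, fn, fp, tn):
    """Transposing the 2x2 matrix swaps the off-diagonal cells FN and FP."""
    return tp, fp, fn, tn


m = (40, 10, 5, 45)   # hypothetical counts (TP, FN, FP, TN)
mt = transpose(*m)

for name, func in [("accuracy", accuracy), ("F1", f1), ("sensitivity", sensitivity)]:
    print(f"{name}: original={func(*m):.3f}, transposed={func(*mt):.3f}")
```

Accuracy and F1 depend on FN and FP only through their sum, so they are unchanged by transposition. Sensitivity is not invariant: on the transposed matrix it equals the precision of the original one.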

It is proved that all classical measures, such as Sensitivity, Specificity, Precision, Accuracy, F1, F2, GM, and the Jaccard index, are sensitive to the imbalance of the classified data and distort the estimate of classification errors for objects of the smaller class. The Matthews correlation coefficient and Cohen's kappa are also sensitive to imbalance. It is shown experimentally that functions such as the confusion entropy, the discriminatory power, and the diagnostic odds ratio should not be used to analyze binary classification of imbalanced datasets. The last two functions are invariant to the imbalance of the classified data, but they poorly evaluate results with approximately equal overall percentages of classification errors in the two classes.
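A small numerical sketch shows one way in which imbalance can shift such measures. It assumes a hypothetical classifier whose per-class hit rates stay fixed (50 % of positives and 90 % of negatives recognized) while only the class sizes change. The helper names and all numbers are illustrative and are not results from the paper.

```python
# Illustrative only: fixed per-class hit rates, varying class sizes.

def counts(n_pos, n_neg, tpr=0.5, tnr=0.9):
    """Build (TP, FN, FP, TN) for given class sizes and per-class hit rates."""
    tp = round(n_pos * tpr)
    tn = round(n_neg * tnr)
    return tp, n_pos - tp, n_neg - tn, tn


def accuracy(tp, fn, fp, tn):
    return (tp + tn) / (tp + fn + fp + tn)


def precision(tp, fn, fp, tn):
    return tp / (tp + fp)


def youden(tp, fn, fp, tn):
    """Youden index: sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1


balanced = counts(100, 100)       # classes of equal size
imbalanced = counts(100, 10_000)  # positive class is 100 times smaller

for label, m in [("balanced", balanced), ("imbalanced", imbalanced)]:
    print(f"{label:>10}: accuracy={accuracy(*m):.3f}, "
          f"precision={precision(*m):.3f}, Youden={youden(*m):.3f}")
```

In this setup accuracy rises and precision drops sharply on the imbalanced data, although half of the smaller class is misclassified in both cases. The Youden index stays the same.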

We prove that the area under the ROC curve (AUC) and the Youden index, both calculated from the binary classification confusion matrix, are linearly dependent and are the best estimation functions for both balanced and imbalanced datasets.
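The linear dependence can be checked numerically with a short sketch. It assumes that the AUC of a single confusion matrix is the area under the two-segment ROC curve through (0, 0), (FPR, TPR), and (1, 1); under that assumption AUC = (J + 1) / 2, where J is the Youden index. The function names and test matrices are hypothetical.

```python
import math

# Assumption: the ROC "curve" of one confusion matrix is the polyline
# (0, 0) -> (FPR, TPR) -> (1, 1). Names and counts are illustrative.

def rates(tp, fn, fp, tn):
    """Return (TPR, FPR) computed from the matrix cells."""
    return tp / (tp + fn), fp / (fp + tn)


def youden(tp, fn, fp, tn):
    """Youden index J = TPR - FPR = sensitivity + specificity - 1."""
    tpr, fpr = rates(tp, fn, fp, tn)
    return tpr - fpr


def single_point_auc(tp, fn, fp, tn):
    """Area under the two-segment ROC polyline."""
    tpr, fpr = rates(tp, fn, fp, tn)
    triangle = fpr * tpr / 2               # under (0, 0) -> (FPR, TPR)
    trapezoid = (1 - fpr) * (tpr + 1) / 2  # under (FPR, TPR) -> (1, 1)
    return triangle + trapezoid


matrices = [(40, 10, 5, 45), (50, 50, 1000, 9000), (3, 7, 20, 70)]
for m in matrices:
    auc, j = single_point_auc(*m), youden(*m)
    print(f"{m}: AUC={auc:.4f}, (J + 1) / 2={(j + 1) / 2:.4f}")
    assert math.isclose(auc, (j + 1) / 2)
```

Because the two values differ only by a shift and a scale, they rank classifiers identically under this assumption.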

Keywords

binary classification, confusion matrix, classification accuracy functions, area under ROC curve, Youden's index

About the Authors

V. V. Starovoitov

The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus
Valery V. Starovoitov, Dr. Sci. (Eng.), Professor, Chief Researcher

Yu. I. Golub

The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Yuliya I. Golub, Cand. Sci. (Eng.), Associate Professor, Senior Researcher

For citations:

Starovoitov V.V., Golub Yu.I. Comparative study of quality estimation of binary classification. Informatics. 2020;17(1):87-101. (In Russ.) https://doi.org/10.37661/1816-0301-2020-17-1-87-101

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)
