1. Edwards D. J., Holt K. E. Beginner's guide to comparative bacterial genome analysis using next-generation sequence data. Microbial Informatics and Experimentation, 2013, vol. 3:2, pp. 1-9.
2. Bao J., Yuan R., Bao Z. An improved alignment-free model for DNA sequence similarity metric. BMC Bioinformatics, 2014, vol. 15:312, pp. 1-15.
3. Li C., Wang J. Relative entropy of DNA and its application. Physica A, 2005, vol. 347, pp. 465-471.
4. Dai Q., Liu X., Yao Y., Zhao F. Numerical characteristics of word frequencies and their application to dissimilarity measure for sequence comparison. Journal of Theoretical Biology, 2011, vol. 276, pp. 174-180.
5. Liu L., Ho Y. K., Yau S. Clustering DNA sequences by feature vectors. Molecular Phylogenetics and Evolution, 2006, vol. 41, pp. 64-69.
6. Wang J., Zheng X. Wse, a new sequence distance measure based on word frequencies. Mathematical Biosciences, 2008, vol. 215, pp. 78-83.
7. Zhao B., He R. L., Yau S. T. A new distribution vector and its application in genome clustering. Molecular Phylogenetics and Evolution, 2011, vol. 59, pp. 438-443.
8. Bermingham M. L., Pong-Wong R., Spiliopoulou A., Hayward C., Rudan I., …, Haley C. S. Application of high-dimensional feature selection: evaluation for genomic prediction in man. Scientific Reports, 2015, vol. 5:10312, pp. 1-12.
9. GFF/GTF File Format - Definition and Supported Options, 2014. Available at: www.ensembl.org/info/website/upload/gff.html (accessed 16.10.2014).
10. Mao R., Kumar P. K. R., Guo C., Zhang Y., Liang C. Comparative analyses between retained introns and constitutively spliced introns in Arabidopsis thaliana using random forest and support vector machine. PLoS One, 2014, vol. 9, no. 8, pp. 1-12.
11. Syrakvash D. A., Jackov N. N., Nazarov P. V., Skakun V. V. Razrabotka algoritmov i avtomatizirovannyh programmnyh sredstv dlya klassifikacii kodirujushchih i nekodiruyushchih nukleotidnyh posledovatel’nostey [Development of algorithms and automated software for the classification of coding and non-coding nucleotide sequences]. Mejdunarodnyi congress po informatike: informacionnye sistemy i tehnologii [International Congress on Informatics: Information Systems and Technologies]. Minsk, Belorusskij gosudarstvennyj universitet, 2016, pp. 189-193 (in Russian).
12. Fernández-Delgado M., Cernadas E., Barro S., Amorim D. Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 2014, vol. 15, pp. 3133-3181.
13. Liaw A., Wiener M. Breiman and Cutler's Random Forests for Classification and Regression, 2016. Available at: http://www.stat.berkley.edu/~breiman/RandomForest/cc_home.htm#workings (accessed 11.02.2016).
14. Breiman L. Random forests. Machine Learning, 2001, vol. 45(1), pp. 5-32.
15. Vapnik V. N. Vosstanovlenie zavisimostey po empiricheskim dannym [Recovering Dependencies from Empirical Data]. Moscow, Nauka, 1979, 448 p. (in Russian).
16. V'ugin V. V. Matematicheskie osnovy mashinnogo obucheniya i prognozirovaniya [Mathematical Foundations of Machine Learning and Prediction]. Moscow, Moskovskij centr nepreryvnogo matematicheskogo obrazovanija, 2014, 304 p. (in Russian).
17. Mastickiy S. E., Shitikov V. K. Statisticheskiy analiz i vizualizaciya dannyh s pomoshchju R [Statistical Analysis and Data Visualization with R], 2014. Available at: http://r-analytics.blogspot.com (accessed 13.03.2015) (in Russian).
18. Zhao Z., Sharma S., Morstatter F., Alelyani S. Advancing Feature Selection Research - ASU Feature Selection Repository, 2010. Available at: https://www.researchgate.net/publication/305083748_Advancing_feature_selection_research (accessed 10.04.2019).
19. Kuhn M. The Caret Package, 2017. Available at: https://topepo.github.io/caret (accessed 11.04.2017).