Algorithm for selecting reference microRNAs in biological processes classification
https://doi.org/10.37661/1816-0301-2025-22-3-45-58
Abstract
O b j e c t i v e s. To develop an algorithm for selecting reference microRNAs that takes into account their biological features and their interconnection, for the classification of samples and pathologies across various biological processes.
M e t h o d s. Methods of linear algebra, principal component analysis, statistical binary regression models, and model performance metrics were used.
R e s u l t s. A new algorithm, MDSeek, has been developed that selects reference microRNAs for normalizing quantitative polymerase chain reaction results while taking their coexpression into account. In subsequent classification tasks, MDSeek demonstrates higher performance metrics than known reference gene selection approaches.
C o n c l u s i o n. An original algorithm, MDSeek, is proposed for selecting reference microRNAs to normalize polymerase chain reaction results. It takes into account changes in microRNA expression when comparing different biological processes. After MDSeek was applied to an experimental set of samples, the normalized data were used for classification tasks, and the resulting performance metrics were better than those obtained with other normalization algorithms.
About the Authors
O. V. Krasko
Belarus
Olga V. Krasko - Ph. D. (Eng.), Assoc. Prof., Leading Researcher, The United Institute of Informatics Problems of the National Academy of Sciences of Belarus.
Surganova st., 6, Minsk, 220012
S. U. Yakubouski
Belarus
Siarhei U. Yakubouski - Ph. D. (Med.), Assoc. Prof., Assoc. Prof. of Department of Surgery and Transplantology with Advanced Training and Retraining Courses, Belarusian State Medical University.
Dzerzhinski av., 83, Minsk, 220116
V. N. Kipen
Belarus
Viachaslau N. Kipen - Ph. D. (Biol.), Assoc. Prof., Leading Researcher, The Institute of Genetics and Cytology of the National Academy of Sciences of Belarus.
Akademicheskaya st., 27, Minsk, 220072
For citations:
Krasko O.V., Yakubouski S.U., Kipen V.N. Algorithm for selecting reference microRNAs in biological processes classification. Informatics. 2025;22(3):45-58. (In Russ.) https://doi.org/10.37661/1816-0301-2025-22-3-45-58