<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">inform</journal-id><journal-title-group><journal-title xml:lang="ru">Информатика</journal-title><trans-title-group xml:lang="en"><trans-title>Informatics</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1816-0301</issn><issn pub-type="epub">2617-6963</issn><publisher><publisher-name>UIIP NASB</publisher-name></publisher></journal-meta><article-meta><article-id custom-type="elpub" pub-id-type="custom">inform-125</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>БИОИНФОРМАТИКА</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>BIOINFORMATICS</subject></subj-group></article-categories><title-group><article-title>МЕТОД ПОСТРОЕНИЯ КЛАСТЕРОВ ГЕНЕТИЧЕСКИХ ДАННЫХ</article-title><trans-title-group xml:lang="en"><trans-title>METHOD OF CONSTRUCTION OF GENETIC DATA CLUSTERS</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Новоселова</surname><given-names>Н. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Novoselova</surname><given-names>N. A.</given-names></name></name-alternatives><email xlink:type="simple">novosel@newman.bas-net.by</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Том</surname><given-names>И. Э.</given-names></name><name name-style="western" xml:lang="en"><surname>Tom</surname><given-names>I. 
E.</given-names></name></name-alternatives><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff xml:lang="ru" id="aff-1"><institution>Объединенный институт проблем информатики НАН Беларуси</institution><country>Belarus</country></aff><pub-date pub-type="collection"><year>2016</year></pub-date><pub-date pub-type="epub"><day>03</day><month>10</month><year>2016</year></pub-date><volume>0</volume><issue>1</issue><fpage>64</fpage><lpage>74</lpage><permissions><copyright-statement>Copyright &#x00A9; Новоселова Н.А., Том И.Э., 2016</copyright-statement><copyright-year>2016</copyright-year><copyright-holder xml:lang="ru">Новоселова Н.А., Том И.Э.</copyright-holder><copyright-holder xml:lang="en">Novoselova N.A., Tom I.E.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://inf.grid.by/jour/article/view/125">https://inf.grid.by/jour/article/view/125</self-uri><abstract><p>Предлагается метод построения кластеров (функциональных модулей) генетических данных, основанный на использовании рандомизированных матриц. Для построения кластеров выполняется выделение и анализ главных компонент матрицы корреляций генных профилей. В качестве конечных выбираются главные компоненты, которые соответствуют собственным значениям, значимо отличающимся от полученных при анализе случайным образом сгенерированной корреляционной матрицы (рандомизированной). 
В сравнительном вычислительном эксперименте с аналогами метод показал свое преимущество в возможности выделять статистически значимые кластеры малых и больших размеров, способности отфильтровывать неинформативные признаки, а также получать биологически интерпретируемые функциональные модули, адекватные реальной структуре данных.</p></abstract><trans-abstract xml:lang="en"><p>The paper presents a method for constructing clusters (functional modules) of genetic data using randomized matrices. To build the functional modules, the principal components of the gene-profile correlation matrix are extracted and analyzed. The principal components retained for the analysis are those whose eigenvalues differ significantly from the eigenvalues obtained for a randomly generated (randomized) correlation matrix. Each selected principal component forms a gene cluster. In a comparative computational experiment with analogous methods, the proposed method showed advantages in identifying statistically significant clusters of both small and large sizes, in filtering out non-informative genes, and in extracting biologically interpretable functional modules that match the real data structure.</p></trans-abstract></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Liang, S. REVEAL, a general reverse engineering algorithm for inference of genetic network architectures / S. Liang, S. Fuhrman, R. Somogyi // Pacific Symp. on Biocomputing (PSB’98). – Hawaii, 1998. – Vol. 3. – P. 18–29.</mixed-citation><mixed-citation xml:lang="en">Liang, S. REVEAL, a general reverse engineering algorithm for inference of genetic network architectures / S. Liang, S. Fuhrman, R. Somogyi // Pacific Symp. on Biocomputing (PSB’98). – Hawaii, 1998. – Vol. 3. – P. 
18–29.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Cluster analysis and display of genome-wide expression patterns / M.B. Eisen [et al.] // Proceedings of the National Academy of Sciences of the United States of America. – 1998. – Vol. 95. – P. 14863–14868.</mixed-citation><mixed-citation xml:lang="en">Cluster analysis and display of genome-wide expression patterns / M.B. Eisen [et al.] // Proceedings of the National Academy of Sciences of the United States of America. – 1998. – Vol. 95. – P. 14863–14868.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Analysis of gene expression data using self-organizing maps / P. Toronen [et al.] // FEBS Letters. – 1999. – Vol. 451. – P. 142–146.</mixed-citation><mixed-citation xml:lang="en">Analysis of gene expression data using self-organizing maps / P. Toronen [et al.] // FEBS Letters. – 1999. – Vol. 451. – P. 142–146.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">The R Project for Statistical Computing. R Foundation for Statistical Computing [Electronic resource]. – 2009. – Mode of access : http://www.R-project.org. – Date of access : 10.09.2015.</mixed-citation><mixed-citation xml:lang="en">The R Project for Statistical Computing. R Foundation for Statistical Computing [Electronic resource]. – 2009. – Mode of access : http://www.R-project.org. – Date of access : 10.09.2015.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Bioconductor case studies / F. Hahne [et al.]. – Springer Science &amp; Business Media, 2010. – 296 p.</mixed-citation><mixed-citation xml:lang="en">Bioconductor case studies / F. Hahne [et al.]. – Springer Science &amp; Business Media, 2010. 
– 296 p.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Cluster – Cluster analysis and visualization software [Electronic resource]. – 2015. – Mode of access : http://rana.lbl.gov/EisenSoftware.htm. – Date of access : 19.08.2015.</mixed-citation><mixed-citation xml:lang="en">Cluster – Cluster analysis and visualization software [Electronic resource]. – 2015. – Mode of access : http://rana.lbl.gov/EisenSoftware.htm. – Date of access : 19.08.2015.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Cyber-T – microarray analysis web interface from UCI’s Institute for Genomics and Bioinformatics [Electronic resource]. – 2015. – Mode of access : http://cybert.microarray.ics.uci.edu. – Date of access : 16.09.2015.</mixed-citation><mixed-citation xml:lang="en">Cyber-T – microarray analysis web interface from UCI’s Institute for Genomics and Bioinformatics [Electronic resource]. – 2015. – Mode of access : http://cybert.microarray.ics.uci.edu. – Date of access : 16.09.2015.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">SNOMAD – Standardization and normalization of microarray data [Electronic resource]. – 2015. – Mode of access : http://pevsnerlab.kennedykrieger.org/snomadinput.html. – Date of access : 12.09.2015.</mixed-citation><mixed-citation xml:lang="en">SNOMAD – Standardization and normalization of microarray data [Electronic resource]. – 2015. – Mode of access : http://pevsnerlab.kennedykrieger.org/snomadinput.html. – Date of access : 12.09.2015.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Yeast cell cycle analysis project [Electronic resource]. – 2015. – Mode of access : http://genome-www.stanford.edu/cellcycle. 
– Date of access : 10.04.2015.</mixed-citation><mixed-citation xml:lang="en">Yeast cell cycle analysis project [Electronic resource]. – 2015. – Mode of access : http://genome-www.stanford.edu/cellcycle. – Date of access : 10.04.2015.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Varimax – rotation methods for factor analysis [Electronic resource]. – 2015. – Mode of access : https://stat.ethz.ch/R-manual/R-devel/library/stats/html/varimax.html. – Date of access : 17.09.2015.</mixed-citation><mixed-citation xml:lang="en">Varimax – rotation methods for factor analysis [Electronic resource]. – 2015. – Mode of access : https://stat.ethz.ch/R-manual/R-devel/library/stats/html/varimax.html. – Date of access : 17.09.2015.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Morey, L.C. The measurement of classification agreement: an adjustment to the rand statistic for chance agreement / L.C. Morey, A. Agresti // Educational and Psychological Measurement. – 1984. – Vol. 44. – P. 33–37.</mixed-citation><mixed-citation xml:lang="en">Morey, L.C. The measurement of classification agreement: an adjustment to the rand statistic for chance agreement / L.C. Morey, A. Agresti // Educational and Psychological Measurement. – 1984. – Vol. 44. – P. 33–37.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Chipman, H. Hybrid hierarchical clustering with applications to microarray data / H. Chipman, R. Tibshirani // Biostatistics. – 2006. – Vol. 7, № 2. – P. 286–301.</mixed-citation><mixed-citation xml:lang="en">Chipman, H. Hybrid hierarchical clustering with applications to microarray data / H. Chipman, R. Tibshirani // Biostatistics. – 2006. – Vol. 7, № 2. – P. 
286–301.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">YeastMine: saccharomyces genome database [Electronic resource]. – 2015. – Mode of access : http://yeastmine.yeastgenome.org/yeastmine/begin.do. – Date of access : 06.09.2015.</mixed-citation><mixed-citation xml:lang="en">YeastMine: saccharomyces genome database [Electronic resource]. – 2015. – Mode of access : http://yeastmine.yeastgenome.org/yeastmine/begin.do. – Date of access : 06.09.2015.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare no conflicts of interest.</p></fn></fn-group></back></article>
