<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">inform</journal-id><journal-title-group><journal-title xml:lang="ru">Информатика</journal-title><trans-title-group xml:lang="en"><trans-title>Informatics</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1816-0301</issn><issn pub-type="epub">2617-6963</issn><publisher><publisher-name>UIIP NASB</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.37661/1816-0301-2023-20-1-102-112</article-id><article-id custom-type="elpub" pub-id-type="custom">inform-1234</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>INFORMATION TECHNOLOGY</subject></subj-group></article-categories><title-group><article-title>Малоразмерные спектральные признаки для машинного обучения в задачах анализа и классификации голосового сигнала</article-title><trans-title-group xml:lang="en"><trans-title>Small-size spectral features for machine learning in voice signal analysis and classification tasks</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Лихачёв</surname><given-names>Д. С.</given-names></name><name name-style="western" xml:lang="en"><surname>Likhachov</surname><given-names>D. S.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Лихачёв Денис Сергеевич, кандидат технических наук, доцент кафедры электронных вычислительных средств</p><p>ул. П. 
Бровки, 6, Минск, 220013</p></bio><bio xml:lang="en"><p>Denis S. Likhachov, Ph. D. (Eng.), Assoc. Prof. of Computer Engineering Department</p><p>st. P. Brovki, 6, Minsk, 220013</p></bio><email xlink:type="simple">likhachov@bsuir.by</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Вашкевич</surname><given-names>М. И.</given-names></name><name name-style="western" xml:lang="en"><surname>Vashkevich</surname><given-names>M. I.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Вашкевич Максим Иосифович, доктор технических наук, доцент кафедры электронных вычислительных средств</p><p>ул. П. Бровки, 6, Минск, 220013</p></bio><bio xml:lang="en"><p>Maxim I. Vashkevich, D. Sc. (Eng.), Assoc. Prof. of Computer Engineering Department</p><p>st. P. Brovki, 6, Minsk, 220013</p></bio><email xlink:type="simple">vashkevich@bsuir.by</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Петровский</surname><given-names>Н. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Petrovsky</surname><given-names>N. A.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Петровский Николай Александрович, кандидат технических наук, доцент кафедры электронных вычислительных средств</p><p>ул. П. Бровки, 6, Минск, 220013</p></bio><bio xml:lang="en"><p>Nick A. Petrovsky, Ph. D. (Eng.), Assoc. Prof. of Computer Engineering Department</p><p>st. P. Brovki, 6, Minsk, 220013</p></bio><email xlink:type="simple">nick.petrovsky@bsuir.by</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Азаров</surname><given-names>И. С.</given-names></name><name name-style="western" xml:lang="en"><surname>Azarov</surname><given-names>E. 
S.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Азаров Илья Сергеевич, доктор технических наук, доцент, заведующий кафедрой электронных вычислительных средств</p><p>ул. П. Бровки, 6, Минск, 220013</p></bio><bio xml:lang="en"><p>Elias S. Azarov, D. Sc. (Eng.), Assoc. Prof., Head of Computer Engineering Department</p><p>st. P. Brovki, 6, Minsk, 220013</p></bio><email xlink:type="simple">azarov@bsuir.by</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Белорусский государственный университет информатики и радиоэлектроники</institution></aff><aff xml:lang="en"><institution>Belarusian State University of Informatics and Radioelectronics</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2023</year></pub-date><pub-date pub-type="epub"><day>29</day><month>03</month><year>2023</year></pub-date><volume>20</volume><issue>1</issue><fpage>102</fpage><lpage>112</lpage><permissions><copyright-statement>Copyright &#x00A9; Лихачёв Д.С., Вашкевич М.И., Петровский Н.А., Азаров И.С., 2023</copyright-statement><copyright-year>2023</copyright-year><copyright-holder xml:lang="ru">Лихачёв Д.С., Вашкевич М.И., Петровский Н.А., Азаров И.С.</copyright-holder><copyright-holder xml:lang="en">Likhachov D.S., Vashkevich M.I., Petrovsky N.A., Azarov E.S.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri 
xlink:href="https://inf.grid.by/jour/article/view/1234">https://inf.grid.by/jour/article/view/1234</self-uri><abstract><sec><title>Цели</title><p>Цели. Решается задача разработки метода вычисления малоразмерных спектральных признаков, повышающего эффективность существующих систем машинного обучения для анализа и классификации голосовых сигналов.</p></sec><sec><title>Методы</title><p>Методы. Спектральные признаки извлекаются с помощью генеративного подхода, который предполагает вычисление дискретного спектра Фурье последовательности отчетов, сгенерированной с использованием авторегрессионной модели входного голосового сигнала. Сгенерированная последовательность, обрабатываемая дискретным преобразованием Фурье, учитывает периодичность преобразования, позволяя тем самым повысить точность спектральной оценки анализируемого сигнала.</p></sec><sec><title>Результаты</title><p>Результаты. Предложен и описан генеративный метод вычисления спектральных признаков, предназначенных для применения в системах машинного обучения при анализе и классификации голосовых сигналов. Проведен экспериментальный анализ точности и стабильности представления спектра тестового сигнала с известным спектральным составом с использованием огибающих. Огибающие вычислялись с помощью предложенного генеративного метода и дискретного преобразования Фурье с различными окнами анализа (прямоугольным окном и окном Ханна). Проведенный анализ показал, что генеративный метод получения спектральных огибающих позволил добиться более точного представления спектра тестового сигнала по критерию минимума квадратичной ошибки. Проведено сравнение эффективности классификации голосового сигнала при использовании предложенных признаков и признаков на основе мел-частотных кепстральных коэффициентов. 
В качестве базовой тестовой системы для оценки эффективности предлагаемого подхода на практике использовалась система диагностики бокового амиотрофического склероза по голосу.</p></sec><sec><title>Заключение</title><p>Заключение. Результаты экспериментов показали ощутимое повышение точности классификации при использовании предлагаемых признаков по сравнению с признаками на основе мел-частотных кепстральных коэффициентов.</p></sec></abstract><trans-abstract xml:lang="en"><sec><title>Objectives</title><p>Objectives. The aim is to develop a method for calculating small-size spectral features that improves the efficiency of existing machine learning systems for analyzing and classifying voice signals.</p></sec><sec><title>Methods</title><p>Methods. Spectral features are extracted using a generative approach, which involves calculating the discrete Fourier spectrum of a sequence of samples generated using an autoregressive model of the input voice signal. The generated sequence, processed by the discrete Fourier transform, takes the periodicity of the transform into account and thereby increases the accuracy of the spectral estimate of the analyzed signal.</p></sec><sec><title>Results</title><p>Results. A generative method for calculating spectral features intended for use in machine learning systems for the analysis and classification of voice signals is proposed and described. An experimental analysis of the accuracy and stability of the spectrum representation of a test signal with a known spectral composition was carried out using spectral envelopes. The envelopes were calculated using the proposed generative method and using the discrete Fourier transform with different analysis windows (rectangular window and Hann window). The analysis showed that spectral envelopes obtained using the proposed method represent the spectrum of the test signal more accurately according to the minimum squared error criterion. 
The classification performance of the proposed features was compared with that of features based on mel-frequency cepstral coefficients. A voice-based diagnostic system for amyotrophic lateral sclerosis was used as the baseline test system to evaluate the proposed approach in practice.</p></sec><sec><title>Conclusion</title><p>Conclusion. The experimental results showed a significant increase in classification accuracy when using the proposed features compared with features based on mel-frequency cepstral coefficients.</p></sec></trans-abstract><kwd-group xml:lang="ru"><kwd>анализ голоса</kwd><kwd>генеративный метод</kwd><kwd>авторегрессия</kwd><kwd>машинное обучение</kwd><kwd>спектральные признаки</kwd><kwd>классификация</kwd></kwd-group><kwd-group xml:lang="en"><kwd>voice analysis</kwd><kwd>generative method</kwd><kwd>autoregression</kwd><kwd>machine learning</kwd><kwd>spectral features</kwd><kwd>classification</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Towards robust voice pathology detection / P. Harar [et al.] // Neural Computing and Applications. – 2020. – Vol. 32, no. 20. – P. 15747–15757.</mixed-citation><mixed-citation xml:lang="en">Harar P., Galaz Z., Alonso-Hernandez J. B., Mekyska J., Burget R., Smekal Z. Towards robust voice pathology detection. Neural Computing and Applications, 2020, vol. 32, no. 20, pp. 15747–15757.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Benba, A. Discriminating between patients with Parkinson’s and neurological diseases using cepstral analysis / A. Benba, A. Jilbab, A. Hammouch // IEEE Transactions on Neural Systems and Rehabilitation Engineering. – 2016. – Vol. 24, no. 10. – P. 
1100–1108.</mixed-citation><mixed-citation xml:lang="en">Benba A., Jilbab A., Hammouch A. Discriminating between patients with Parkinson’s and neurological diseases using cepstral analysis. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2016, vol. 24, no. 10, pp. 1100–1108.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Vashkevich, M. Classification of ALS patients based on acoustic analysis of sustained vowel phonations / M. Vashkevich, Y. Rushkevich // Biomedical Signal Processing and Control. – 2021. – Vol. 65. – P. 1–14.</mixed-citation><mixed-citation xml:lang="en">Vashkevich M., Rushkevich Y. Classification of ALS patients based on acoustic analysis of sustained vowel phonations. Biomedical Signal Processing and Control, 2021, vol. 65, pp. 1–14.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Rabiner, L. R. Fundamentals of Speech Recognition / L. R. Rabiner, B. H. Juang. – Pearson Education, 1993. – 570 p.</mixed-citation><mixed-citation xml:lang="en">Rabiner L. R., Juang B. H. Fundamentals of Speech Recognition. Pearson Education, 1993, 570 p.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Harris, F. J. On the use of windows for harmonic analysis with the discrete Fourier transform / F. J. Harris // Proc. of the IEEE. – Jan. 1978. – Vol. 66, no. 1. – P. 51–83.</mixed-citation><mixed-citation xml:lang="en">Harris F. J. On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE, January 1978, vol. 66, no. 1, pp. 51–83. https://doi.org/10.1109/PROC.1978.10837</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Вашкевич, М. И. 
Система анализа и классификации голосового сигнала на основе пертурбационных параметров и кепстрального представления в психоакустических шкалах / М. И. Вашкевич, Д. С. Лихачёв, И. С. Азаров // Доклады БГУИР. – 2022. – Т. 20, № 1. – С. 73–82. https://doi.org/10.35596/1729-7648-2022-20-1-73-82</mixed-citation><mixed-citation xml:lang="en">Vashkevich M. I., Likhachov D. S., Azarov E. S. Voice analysis and classification system based on perturbation parameters and cepstral presentation in psychoacoustic scales. Doklady Belorusskogo gosudarstvennogo universiteta informatiki i radioèlektroniki [Reports of the Belarusian State University of Informatics and Radioelectronics], 2022, vol. 20, no. 1, pp. 73–82 (In Russ.). https://doi.org/10.35596/1729-7648-2022-20-1-73-82</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Markel, J. D. Linear Prediction of Speech / J. D. Markel, A. H. Gray. – Berlin, N. Y. : Springer-Verlag, 1976. – 290 p.</mixed-citation><mixed-citation xml:lang="en">Markel J. D., Gray A. H. Linear Prediction of Speech. Berlin, New York, Springer-Verlag, 1976, 290 p.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Flach, P. Machine Learning: The Art and Science of Algorithms That Make Sense of Data / P. Flach. – Cambridge University Press, 2012. – 416 p.</mixed-citation><mixed-citation xml:lang="en">Flach P. Machine Learning: The Art and Science of Algorithms That Make Sense of Data. Cambridge University Press, 2012, 416 p.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">An Introduction to Statistical Learning with Applications in R / G. James [et al.]. – Springer, 2013. – 440 p.</mixed-citation><mixed-citation xml:lang="en">James G., Witten D., Hastie T., Tibshirani R. 
An Introduction to Statistical Learning with Applications in R. Springer, 2013, 440 p.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Vashkevich, M. Bulbar ALS detection based on analysis of voice perturbation and vibrato / M. Vashkevich, A. Petrovsky, Y. Rushkevich // IEEE Intern. Conf. on Signal Processing: Algorithms, Architectures, Arrangements, and Applications, Poznan, Poland, 18–20 Sept. 2019. – Poznan, 2019. – P. 267–272.</mixed-citation><mixed-citation xml:lang="en">Vashkevich M., Petrovsky A., Rushkevich Y. Bulbar ALS detection based on analysis of voice perturbation and vibrato. IEEE International Conference on Signal Processing: Algorithms, Architectures, Arrangements, and Applications, Poznan, Poland, 18–20 September 2019. Poznan, 2019, pp. 267–272.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">The necessity of leave one subject out (LOSO) cross validation for EEG disease diagnosis / S. Kunjan [et al.] // Brain Informatics. – Springer, 2021. – P. 558–567. https://doi.org/10.1007/978-3-030-86993-9_50</mixed-citation><mixed-citation xml:lang="en">Kunjan S., Grummett T. S., Pope K. J., Powers D. M. W., Fitzgibbon S. P., …, Lewis T. W. The necessity of leave one subject out (LOSO) cross validation for EEG disease diagnosis. Brain Informatics, Springer, 2021, pp. 558–567. https://doi.org/10.1007/978-3-030-86993-9_50</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare no conflicts of interest.</p></fn></fn-group></back></article>
