<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">inform</journal-id><journal-title-group><journal-title xml:lang="ru">Информатика</journal-title><trans-title-group xml:lang="en"><trans-title>Informatics</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1816-0301</issn><issn pub-type="epub">2617-6963</issn><publisher><publisher-name>UIIP NASB</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.37661/1816-0301-2022-19-3-74-85</article-id><article-id custom-type="elpub" pub-id-type="custom">inform-1207</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ОБРАБОТКА СИГНАЛОВ, ИЗОБРАЖЕНИЙ, РЕЧИ, ТЕКСТА И РАСПОЗНАВАНИЕ ОБРАЗОВ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>SIGNAL, IMAGE, SPEECH, TEXT PROCESSING AND PATTERN RECOGNITION</subject></subj-group></article-categories><title-group><article-title>Распознавание изображений товаров электронной коммерции с использованием модели внимания и нейронной сети YOLACT</article-title><trans-title-group xml:lang="en"><trans-title>E-commerce image recognition using attention model and YOLACT neural network</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-2128-1943</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Сорокина</surname><given-names>В. В.</given-names></name><name name-style="western" xml:lang="en"><surname>Sorokina</surname><given-names>V. 
V.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Сорокина Виктория Вадимовна, аспирант кафедры веб-технологий и компьютерного моделирования механико-математического факультета</p><p>пр. Независимости, 4, Минск, 220050</p></bio><bio xml:lang="en"><p>Viktoria V. Sorokina, Postgraduate Student, Department of Web Technologies and Computer Modeling, Faculty of Mechanics and Mathematics</p><p>av. Nezavisimosti, 4, Minsk, 220050</p></bio><email xlink:type="simple">viktoria.sorokina.96@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-9404-1206</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Абламейко</surname><given-names>С. В.</given-names></name><name name-style="western" xml:lang="en"><surname>Ablameyko</surname><given-names>S. V.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Абламейко Сергей Владимирович, академик НАН Беларуси, доктор технических наук, профессор, лауреат Государственной премии Республики Беларусь, заслуженный деятель науки Республики Беларусь</p><p>пр. Независимости, 4, Минск, 220050</p><p>ул. Сурганова, 6, Минск, 220012</p></bio><bio xml:lang="en"><p>Sergey V. Ablameyko, Academician of the National Academy of Sciences of Belarus, D. Sc. (Eng.), Professor, Laureate of the State Prize of the Republic of Belarus, Honored Scientist of the Republic of Belarus</p><p>av. Nezavisimosti, 4, Minsk, 220050</p><p>st. 
Surganova, 6, Minsk, 220012</p></bio><email xlink:type="simple">ablameyko@bsu.by</email><xref ref-type="aff" rid="aff-2"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Белорусский государственный университет</institution></aff><aff xml:lang="en"><institution>Belarusian State University</institution></aff></aff-alternatives><aff-alternatives id="aff-2"><aff xml:lang="ru"><institution>Белорусский государственный университет; Объединенный институт проблем информатики Национальной академии наук Беларуси</institution></aff><aff xml:lang="en"><institution>Belarusian State University; The United Institute of Informatics Problems of the National Academy of Sciences of Belarus</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2022</year></pub-date><pub-date pub-type="epub"><day>22</day><month>08</month><year>2022</year></pub-date><volume>19</volume><issue>3</issue><fpage>74</fpage><lpage>85</lpage><permissions><copyright-statement>Copyright &#x00A9; Сорокина В.В., Абламейко С.В., 2022</copyright-statement><copyright-year>2022</copyright-year><copyright-holder xml:lang="ru">Сорокина В.В., Абламейко С.В.</copyright-holder><copyright-holder xml:lang="en">Sorokina V.V., Ablameyko S.V.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://inf.grid.by/jour/article/view/1207">https://inf.grid.by/jour/article/view/1207</self-uri><abstract><p>Цели. 
Предлагается алгоритм распознавания изображений товаров электронной коммерции с использованием модели внимания и нейронной сети YOLACT. Целью работы является улучшение взаимодействия между перекрестными признаками изображения с помощью модульной архитектуры, в которой применяется модель внимания к разным веткам сети. Методы. Основными методами распознавания изображений товаров электронной коммерции являются создание и аннотация набора данных для обучения нейронной сети, выбор архитектуры и встраивание модели внимания, валидация и проведение тестов, а также интерпретация результатов. Результаты. Сверточная нейронная сеть YOLACT модифицировалась моделью внимания для решения задачи распознавания объектов электронной коммерции, что позволило получить более качественные результаты, чем у классической сети YOLACT. Заключение. В ходе эксперимента был подготовлен набор данных товаров электронной коммерции, произведена его аннотация, построены две нейронные сети для сравнения результатов. Результаты исследования показали, что использование модели внимания положительно влияет как на качество обученной сети, так и на скорость сходимости. Это отражается в улучшенных метриках для распознавания и сегментации объектов.</p></abstract><trans-abstract xml:lang="en"><p>Objectives. We propose an algorithm for e-commerce image recognition that uses an attention model and the YOLACT neural network. A modular architecture applies the attention model to different branches of the network in order to improve the interaction between image cross-features. Methods. The main stages of e-commerce product recognition are the creation and annotation of a dataset for neural network training, the choice of architecture and the embedding of an attention model, validation and testing, and interpretation of the results. Results. 
The YOLACT convolutional neural network was modified with an attention model to solve the image recognition task, which yielded higher-quality results than the classic YOLACT. Conclusion. In the course of the experiment, a dataset of e-commerce products was prepared and annotated, and two neural networks were built to compare the results. The results of the study showed that the use of the attention model has a positive effect on both the quality of the trained network and the rate of convergence, which is reflected in improved metrics for object recognition and segmentation.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>распознавание объектов</kwd><kwd>сверточная нейронная сеть</kwd><kwd>модель внимания</kwd><kwd>сеть YOLACT</kwd><kwd>электронная коммерция</kwd></kwd-group><kwd-group xml:lang="en"><kwd>object recognition</kwd><kwd>convolutional neural network</kwd><kwd>attention model</kwd><kwd>network YOLACT</kwd><kwd>e-commerce</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Bolya D., Zhou C., Xiao F., Lee Y. J. YOLACT: Real-time instance segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 27 October – 2 November 2019, pp. 9157–9166.</mixed-citation><mixed-citation xml:lang="en">Bolya D., Zhou C., Xiao F., Lee Y. J. YOLACT: Real-time instance segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 27 October – 2 November 2019, pp. 9157–9166.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Bahdanau D., Cho K., Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015. 
Available at: https://arxiv.org/abs/1409.0473?context=stat (accessed 01.02.2021).</mixed-citation><mixed-citation xml:lang="en">Bahdanau D., Cho K., Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015. Available at: https://arxiv.org/abs/1409.0473?context=stat (accessed 01.02.2021).</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Chaudhuri A., Messina P., Kokkula S., Subramanian A., Krishnan A., …, Kandaswamy V. A smart system for selection of optimal product images in e-commerce. IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018, pp. 1728–1736.</mixed-citation><mixed-citation xml:lang="en">Chaudhuri A., Messina P., Kokkula S., Subramanian A., Krishnan A., …, Kandaswamy V. A smart system for selection of optimal product images in e-commerce. IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018, pp. 1728–1736.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Zhang X. Content-based e-commerce image classification research. IEEE Access, 2020, vol. 8, pp. 160213–160220.</mixed-citation><mixed-citation xml:lang="en">Zhang X. Content-based e-commerce image classification research. IEEE Access, 2020, vol. 8, pp. 160213–160220.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Bossard L., Dantone M., Leistner C., Wengert C., Quack T., Van Gool L. Apparel classification with style. Asian Conference on Computer Vision, Berlin, 2012, vol. 7727, pp. 321–335.</mixed-citation><mixed-citation xml:lang="en">Bossard L., Dantone M., Leistner C., Wengert C., Quack T., Van Gool L. Apparel classification with style. 
Asian Conference on Computer Vision, Berlin, 2012, vol. 7727, pp. 321–335.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Lao B., Jagadeesh K. Convolutional neural networks for fashion classification and object detection. CCCV 2015 Computer Vision, pp. 120–129.</mixed-citation><mixed-citation xml:lang="en">Lao B., Jagadeesh K. Convolutional neural networks for fashion classification and object detection. CCCV 2015 Computer Vision, pp. 120–129.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Dai J., He K., Li Y., Ren S., Sun J. Instance-sensitive fully convolutional networks. 14th European Conference on Computer Vision, Amsterdam, 11–14 October 2016, vol. 9910, pp. 534–549.</mixed-citation><mixed-citation xml:lang="en">Dai J., He K., Li Y., Ren S., Sun J. Instance-sensitive fully convolutional networks. 14th European Conference on Computer Vision, Amsterdam, 11–14 October 2016, vol. 9910, pp. 534–549.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">He K., Zhang X., Ren S., Sun J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016, pp. 770–778.</mixed-citation><mixed-citation xml:lang="en">He K., Zhang X., Ren S., Sun J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016, pp. 770–778.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Green B. Canny Edge Detector. Available at: https://docs.opencv.org/master/da/d22/tutorial_py_canny.html (accessed 01.02.2021).</mixed-citation><mixed-citation xml:lang="en">Green B. Canny Edge Detector. 
Available at: https://docs.opencv.org/master/da/d22/tutorial_py_canny.html (accessed 01.02.2021).</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Pech-Pacheco J. L., Cristobal G., Chamorro-Martinez J., Fernandez-Valdivia J. Diatom Autofocusing in Brightfield Microscopy: A Comparative Study. Available at: http://optica.csic.es/papers/icpr2k.pdf (accessed 01.02.2021).</mixed-citation><mixed-citation xml:lang="en">Pech-Pacheco J. L., Cristobal G., Chamorro-Martinez J., Fernandez-Valdivia J. Diatom Autofocusing in Brightfield Microscopy: A Comparative Study. Available at: http://optica.csic.es/papers/icpr2k.pdf (accessed 01.02.2021).</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">He K. Mask R-CNN. IEEE International Conference on Computer Vision (ICCV), Venice, 22–29 October 2017, pp. 2980–2988.</mixed-citation><mixed-citation xml:lang="en">He K. Mask R-CNN. IEEE International Conference on Computer Vision (ICCV), Venice, 22–29 October 2017, pp. 2980–2988.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Qi H., Dai J., Ji X., Wei Y. Fully convolutional instance-aware semantic segmentation. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21–26 July 2017, pp. 4438–4446.</mixed-citation><mixed-citation xml:lang="en">Qi H., Dai J., Ji X., Wei Y. Fully convolutional instance-aware semantic segmentation. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21–26 July 2017, pp. 4438–4446.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Sorokina V., Ablameyko S. 
Neural network training acceleration by weight standardization in segmentation of electronic commerce images. Studies in Computational Intelligence, 2020, vol. 976, pp. 237–244.</mixed-citation><mixed-citation xml:lang="en">Sorokina V., Ablameyko S. Neural network training acceleration by weight standardization in segmentation of electronic commerce images. Studies in Computational Intelligence, 2020, vol. 976, pp. 237–244.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
