<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">inform</journal-id><journal-title-group><journal-title xml:lang="ru">Информатика</journal-title><trans-title-group xml:lang="en"><trans-title>Informatics</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1816-0301</issn><issn pub-type="epub">2617-6963</issn><publisher><publisher-name>UIIP NASB</publisher-name></publisher></journal-meta><article-meta><article-id custom-type="elpub" pub-id-type="custom">inform-15</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ОБРАБОТКА СИГНАЛОВ, ИЗОБРАЖЕНИЙ, РЕЧИ, ТЕКСТА И РАСПОЗНАВАНИЕ ОБРАЗОВ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>SIGNAL, IMAGE, SPEECH, TEXT PROCESSING AND PATTERN RECOGNITION</subject></subj-group></article-categories><title-group><article-title>ОБНАРУЖЕНИЕ ФРАГМЕНТОВ ТЕКСТА НА ИЗОБРАЖЕНИЯХ РЕАЛЬНЫХ СЦЕН НА БАЗЕ СВЕРТОЧНОЙ НЕЙРОСЕТЕВОЙ МОДЕЛИ</article-title><trans-title-group xml:lang="en"><trans-title>DETECTION OF TEXT OBJECTS IN IMAGES OF REAL SCENES BASED ON CONVOLUTIONAL NEURAL NETWORK MODEL</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Кузьмицкий</surname><given-names>Н. Н.</given-names></name><name name-style="western" xml:lang="en"><surname>Kuzmitsky</surname><given-names>N. 
N.</given-names></name></name-alternatives><email xlink:type="simple">knnbrest@yandex.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff xml:lang="en" id="aff-1"><institution>Brest State Technical University</institution><country>Belarus</country></aff><pub-date pub-type="collection"><year>2015</year></pub-date><pub-date pub-type="epub"><day>26</day><month>09</month><year>2016</year></pub-date><volume>0</volume><issue>2</issue><fpage>12</fpage><lpage>21</lpage><permissions><copyright-statement>Copyright &#x00A9; Кузьмицкий Н.Н., 2016</copyright-statement><copyright-year>2016</copyright-year><copyright-holder xml:lang="ru">Кузьмицкий Н.Н.</copyright-holder><copyright-holder xml:lang="en">Kuzmitsky N.N.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://inf.grid.by/jour/article/view/15">https://inf.grid.by/jour/article/view/15</self-uri><abstract><p>Рассматривается модель детектора текстовых образов на базе сверточной нейронной сети, способной синтезировать высокоуровневые признаки образов в режиме «черного ящика». Описывается методика применения детектора, основанная на алгоритмах мультимасштабного сканирования и локальной интерпретации откликов, позволяющая обнаруживать текстовые объекты на изображениях реальных сцен. 
Показываются преимущества разработок в сравнении с аналогами, выполняется оценка эффективности на примере известной базы данных.</p></abstract><trans-abstract xml:lang="en"><p>A text detector model based on a convolutional neural network is presented; the network synthesizes high-level image features in a "black box" mode. A procedure for applying the detector, based on multi-scale scanning and local interpretation of responses, is described; it allows text objects to be detected in images of real scenes. Advantages over comparable approaches are shown, and performance is evaluated on a well-known database.</p></trans-abstract></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Sumathi, C.P. A Survey on various approaches of text extraction in images / C.P. Sumathi,</mixed-citation><mixed-citation xml:lang="en">Sumathi, C.P. A Survey on various approaches of text extraction in images / C.P. Sumathi,</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">T. Santhanam, G. Gayathri // International Journal of Computer Science &amp; Engineering Survey. – 2012. – Vol. 3, № 4. – P. 27–42.</mixed-citation><mixed-citation xml:lang="en">T. Santhanam, G. Gayathri // International Journal of Computer Science &amp; Engineering Survey. – 2012. – Vol. 3, № 4. – P. 27–42.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">LeCun, Y. Gradient-Based Learning Applied to Document Recognition / Y. LeCun, L. Bottou // Proceedings of the IEEE. – 1998. – Vol. 86, № 11. – P. 2278–2324.</mixed-citation><mixed-citation xml:lang="en">LeCun, Y. Gradient-Based Learning Applied to Document Recognition / Y. LeCun, L. Bottou // Proceedings of the IEEE. – 1998. – Vol. 86, № 11. – P. 
2278–2324.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Кузьмицкий, Н.Н. Сверточная нейросетевая модель в задаче классификации изображений изолированных цифр / Н.Н. Кузьмицкий // Доклады БГУИР. – Минск, 2012. – № 7. – С. 64–70.</mixed-citation><mixed-citation xml:lang="en">Кузьмицкий, Н.Н. Сверточная нейросетевая модель в задаче классификации изображений изолированных цифр / Н.Н. Кузьмицкий // Доклады БГУИР. – Минск, 2012. – № 7. – С. 64–70.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Головко, В.А. Нейронные сети: обучение, организация и применение : учеб. пособие / В.А. Головко. – М. : ИПРЖР, 2001. – Кн. 4. – 256 с.</mixed-citation><mixed-citation xml:lang="en">Головко, В.А. Нейронные сети: обучение, организация и применение : учеб. пособие / В.А. Головко. – М. : ИПРЖР, 2001. – Кн. 4. – 256 с.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Осовский, С. Нейронные сети для обработки информации / С. Осовский. – М. : Финансы и статистика, 2002. – 344 с.</mixed-citation><mixed-citation xml:lang="en">Осовский, С. Нейронные сети для обработки информации / С. Осовский. – М. : Финансы и статистика, 2002. – 344 с.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Delakis, M. Text detection with convolutional neural networks / M. Delakis, Cr. Garcia // Intern. Conf. on Computer Vision Theory and Applications. – Cambridge, 2008. – P. 290–294.</mixed-citation><mixed-citation xml:lang="en">Delakis, M. Text detection with convolutional neural networks / M. Delakis, Cr. Garcia // Intern. Conf. on Computer Vision Theory and Applications. – Cambridge, 2008. – P. 
290–294.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Wang, K. End-to-end scene text recognition / K. Wang, B. Babenko, S. Belongie // IEEE Intern. Conf. on Computer Vision (ICCV). – Barcelona, 2011. – P. 1457–1464.</mixed-citation><mixed-citation xml:lang="en">Wang, K. End-to-end scene text recognition / K. Wang, B. Babenko, S. Belongie // IEEE Intern. Conf. on Computer Vision (ICCV). – Barcelona, 2011. – P. 1457–1464.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">ICDAR 2003 robust reading competitions / S.M. Lucas [et al.] // Proc. of Seventh Intern. Conf. on Document Analysis and Recognition. – Edinburgh, 2003. – P. 682–687.</mixed-citation><mixed-citation xml:lang="en">ICDAR 2003 robust reading competitions / S.M. Lucas [et al.] // Proc. of Seventh Intern. Conf. on Document Analysis and Recognition. – Edinburgh, 2003. – P. 682–687.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Campos, T.E. Character Recognition in Natural Images / T.E. Campos, B.R. Babu // VISAPP. – 2009. – Vol. 2. – P. 273–280.</mixed-citation><mixed-citation xml:lang="en">Campos, T.E. Character Recognition in Natural Images / T.E. Campos, B.R. Babu // VISAPP. – 2009. – Vol. 2. – P. 273–280.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Touch TT : Scene text extractor using touchscreen interface / J. Jung [et al.] // ETRI Journal. – 2011. – Vol. 33, № 1. – P. 78–88.</mixed-citation><mixed-citation xml:lang="en">Touch TT : Scene text extractor using touchscreen interface / J. Jung [et al.] // ETRI Journal. – 2011. – Vol. 33, № 1. – P. 
78–88.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">The Street View House Numbers (SVHN) Dataset [Electronic resource]. – 2011. – Mode of access : http://ufldl.stanford.edu/housenumbers. – Date of access : 03.07.2014.</mixed-citation><mixed-citation xml:lang="en">The Street View House Numbers (SVHN) Dataset [Electronic resource]. – 2011. – Mode of access : http://ufldl.stanford.edu/housenumbers. – Date of access : 03.07.2014.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Ikica, A. An improved edge profile based method for text detection in images of natural scenes / A. Ikica, P. Peer // Intern. Conf. on Computer as a Tool (EUROCON). – Lisbon, 2011. – P. 1–4.</mixed-citation><mixed-citation xml:lang="en">Ikica, A. An improved edge profile based method for text detection in images of natural scenes / A. Ikica, P. Peer // Intern. Conf. on Computer as a Tool (EUROCON). – Lisbon, 2011. – P. 1–4.</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">ICDAR 2013 Robust Reading Competition / D. Karatzas [et al.] // Proc. 12th Intern. Conf. of Document Analysis and Recognition, IEEE CPS. – Washington, 2013. – P. 1115–1124.</mixed-citation><mixed-citation xml:lang="en">ICDAR 2013 Robust Reading Competition / D. Karatzas [et al.] // Proc. 12th Intern. Conf. of Document Analysis and Recognition, IEEE CPS. – Washington, 2013. – P. 1115–1124.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Wolf, C. Object Count Area Graphs for the Evaluation of Object Detection and Segmentation Algorithms / C. Wolf, J.M. Jolion // International Journal of Document Analysis. – 2006. – Vol. 8, № 4. – P. 280–296.</mixed-citation><mixed-citation xml:lang="en">Wolf, C. 
Object Count Area Graphs for the Evaluation of Object Detection and Segmentation Algorithms / C. Wolf, J.M. Jolion // International Journal of Document Analysis. – 2006. – Vol. 8, № 4. – P. 280–296.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
