<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">inform</journal-id><journal-title-group><journal-title xml:lang="ru">Информатика</journal-title><trans-title-group xml:lang="en"><trans-title>Informatics</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1816-0301</issn><issn pub-type="epub">2617-6963</issn><publisher><publisher-name>UIIP NASB</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.37661/1816-0301-2020-17-2-7-16</article-id><article-id custom-type="elpub" pub-id-type="custom">inform-1056</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ОБРАБОТКА СИГНАЛОВ, ИЗОБРАЖЕНИЙ, РЕЧИ, ТЕКСТА И РАСПОЗНАВАНИЕ ОБРАЗОВ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>SIGNAL, IMAGE, SPEECH, TEXT PROCESSING AND PATTERN RECOGNITION</subject></subj-group></article-categories><title-group><article-title>Обнаружение объектов на изображениях с большим разрешением на основе их пирамидально-блочной обработки</article-title><trans-title-group xml:lang="en"><trans-title>Object detection in high resolution images based on multiscale and block processing</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-6609-5810</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Богуш</surname><given-names>Р. П.</given-names></name><name name-style="western" xml:lang="en"><surname>Bohush</surname><given-names>R. 
P.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Богуш Рихард Петрович, кандидат технических наук, доцент, заведующий кафедрой вычислительных систем и сетей, факультет информационных технологий</p><p>Новополоцк</p></bio><bio xml:lang="en"><p>Rykhard P. Bohush, Cand. Sci. (Eng.), Associate Professor, Head of the Department of Computer Systems and Networks</p><p>Novopolotsk</p></bio><email xlink:type="simple">bogushr@mail.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-7580-7878</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Захарова</surname><given-names>И. Ю.</given-names></name><name name-style="western" xml:lang="en"><surname>Zakharava</surname><given-names>I. Yu.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Захарова Ирина Юрьевна, магистр технических наук, аспирант кафедры вычислительных систем и сетей, факультет информационных технологий</p><p>Новополоцк</p></bio><bio xml:lang="en"><p>Iryna Yu. Zakharava, M. Sci. (Eng.), Postgraduate Student at the Department of Computer Systems and Networks</p><p>Novopolotsk</p></bio><email xlink:type="simple">ira9992011@yandex.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-9404-1206</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Абламейко</surname><given-names>С. В.</given-names></name><name name-style="western" xml:lang="en"><surname>Ablameyko</surname><given-names>S. V.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Абламейко Сергей Владимирович, академик Национальной академии наук Беларуси, доктор технических наук, профессор, профессор механико-математического факультета</p><p>Минск</p></bio><bio xml:lang="en"><p>Sergey V. 
Ablameyko, Academician of the National Academy of Sciences of Belarus, Dr. Sci. (Eng.), Professor, Professor of the Faculty of Mechanics and Mathematics</p><p>Minsk</p></bio><email xlink:type="simple">ablameyko@bsu.by</email><xref ref-type="aff" rid="aff-2"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Полоцкий государственный университет</institution></aff><aff xml:lang="en"><institution>Polotsk State University</institution></aff></aff-alternatives><aff-alternatives id="aff-2"><aff xml:lang="ru"><institution>Белорусский государственный университет; Объединенный институт проблем информатики Национальной академии наук Беларуси</institution></aff><aff xml:lang="en"><institution>Belarusian State University; The United Institute of Informatics Problems of the National Academy of Sciences of Belarus</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2020</year></pub-date><pub-date pub-type="epub"><day>22</day><month>04</month><year>2020</year></pub-date><volume>17</volume><issue>2</issue><fpage>7</fpage><lpage>16</lpage><permissions><copyright-statement>Copyright &#x00A9; Богуш Р.П., Захарова И.Ю., Абламейко С.В., 2020</copyright-statement><copyright-year>2020</copyright-year><copyright-holder xml:lang="ru">Богуш Р.П., Захарова И.Ю., Абламейко С.В.</copyright-holder><copyright-holder xml:lang="en">Bohush R.P., Zakharava I.Y., Ablameyko S.V.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri 
xlink:href="https://inf.grid.by/jour/article/view/1056">https://inf.grid.by/jour/article/view/1056</self-uri><abstract><p>Предлагается алгоритм для обнаружения объектов на изображениях с большим разрешением, основанный на многомасштабном представлении изображения, пирамидально-блочной обработке с перекрытием, применении сверточной нейронной сети для каждого блока и объединении обнаруженных областей. Количество слоев пирамиды определяется размерами изображения и входного слоя используемой сверточной нейронной сети. На всех уровнях, кроме самого верхнего, выполняется блочное разбиение, а применение при этом перекрытия позволяет улучшить правильную классификацию объектов, которые разделяются на фрагменты и расположены в соседних блоках. Решение об объединении таких областей принимается на основе анализа метрики пересечения над объединением для них и принадлежности к одному классу. Представленные результаты тестирования алгоритма подтверждают, что рассмотренный подход позволяет повысить точность обнаружения объектов небольших размеров на изображениях с большим разрешением.</p></abstract><trans-abstract xml:lang="en"><p>An algorithm for object detection in high-resolution images is proposed. The approach builds a multiscale (pyramid) representation of the image and processes it in overlapping blocks. A convolutional neural network performs object detection on each block. The number of pyramid levels is determined by the image size and the size of the input layer of the convolutional neural network. Every level except the topmost one is split into blocks; overlapping the blocks improves the classification of objects that are split into fragments lying in adjacent blocks. Detected regions are merged when their intersection over union is high and they belong to the same class. 
Experimental results confirm that the approach improves the detection accuracy for small objects in high-resolution images.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>сверточная нейронная сеть</kwd><kwd>блочная обработка</kwd><kwd>разрешение 4К</kwd><kwd>обнаружение объектов</kwd><kwd>многомасштабное представление изображения</kwd></kwd-group><kwd-group xml:lang="en"><kwd>convolutional neural network</kwd><kwd>block processing</kwd><kwd>4K resolution</kwd><kwd>object detection</kwd><kwd>multiscale image representation</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Дворкович, А. В. Метрологическое обеспечение видеоинформационных систем / А. В. Дворкович, В. П. Дворкович. – М. : Техносфера, 2015. – 784 с.</mixed-citation><mixed-citation xml:lang="en">Dvorkovich A. V., Dvorkovich V. P. Metrologicheskoe obespechenie videoinformatsionnykh sistem [Metrological Support of Video Information Systems]. Moscow, Technosphera, 2015, 784 p. (in Russian).</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Goulekas, K. Visual Effects in a Digital World: a Comprehensive Glossary of over 7,000 Visual Effects Terms / K. Goulekas. – San Francisco : Morgan Kaufmann, 2001. – 600 p.</mixed-citation><mixed-citation xml:lang="en">Goulekas K. Visual Effects in a Digital World: a Comprehensive Glossary of over 7,000 Visual Effects Terms. San Francisco, Morgan Kaufmann, 2001, 600 p.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">An effective object detection algorithm for high resolution video by using convolutional neural network / D. Vorobjov [et al.] // Advances in Neural Networks-ISNN2018. Lecture Notes in Computer Science. – 2018. – Vol. 10878. – P. 503–510. 
https://doi.org/10.1007/978-3-319-92537-0_58</mixed-citation><mixed-citation xml:lang="en">Vorobjov D., Zakharova I., Bohush R., Ablameyko S. An effective object detection algorithm for high resolution video by using convolutional neural network. Advances in Neural Networks-ISNN2018. Lecture Notes in Computer Science, 2018, vol. 10878, pp. 503–510. https://doi.org/10.1007/978-3-319-92537-0_58</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Yongxi, L. Efficient object detection for high resolution images / L. Yongxi, T. Javidi // Proc. of 53rd Annual Allerton Conf. on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 30 Sept. – 2 Oct. 2015. – Monticello, 2015. – P. 1091–1098. https://doi.org/10.1109/ALLERTON.2015.7447130</mixed-citation><mixed-citation xml:lang="en">Yongxi L., Javidi T. Efficient object detection for high resolution images. Proceedings of 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 30 September – 2 October 2015. Monticello, 2015, pp. 1091–1098. https://doi.org/10.1109/ALLERTON.2015.7447130</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Sun database: large-scale scene recognition from abbey to zoo / J. Xiao [et al.] // Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 13–18 June 2010. – San Francisco, 2010. – P. 3485–3492. https://doi.org/10.1109/CVPR.2010.5539970</mixed-citation><mixed-citation xml:lang="en">Xiao J., Hays J., Ehinger K., Oliva A., Torralba A. Sun database: large-scale scene recognition from abbey to zoo. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 13–18 June 2010. San Francisco, 2010, pp. 3485–3492. 
https://doi.org/10.1109/CVPR.2010.5539970</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Ruzicka, V. Fast and accurate object detection in high resolution 4K and 8K video using GPUs / V. Ruzicka, F. Franchetti // Proc. of 2018 IEEE High Performance Extreme Computing Conf. (HPEC), Waltham, MA, USA, 25–27 Sept. 2018. – Waltham, 2018. – P. 1–7. https://doi.org/10.1109/HPEC.2018.8547574</mixed-citation><mixed-citation xml:lang="en">Ruzicka V., Franchetti F. Fast and accurate object detection in high resolution 4K and 8K video using GPUs. Proceedings of 2018 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 25–27 September 2018. Waltham, 2018, pp. 1–7. https://doi.org/10.1109/HPEC.2018.8547574</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Korshunov, P. UHD video dataset for evaluation of privacy / P. Korshunov, T. Ebrahimi // Proc. of Sixth Intern. Workshop on Quality of Multimedia Experience (QoMEX), Singapore, 18–20 Sept. 2014. – Singapore, 2014. – P. 232–237. https://doi.org/10.1109/QoMEX.2014.6982324</mixed-citation><mixed-citation xml:lang="en">Korshunov P., Ebrahimi T. UHD video dataset for evaluation of privacy. Proceedings of Sixth International Workshop on Quality of Multimedia Experience (QoMEX), Singapore, 18–20 September 2014. Singapore, 2014, pp. 232–237. https://doi.org/10.1109/QoMEX.2014.6982324</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Unel, F. O. The power of tiling for small object detection / F. O. Unel, B. Ozkalayci, C. Çigla // CVPR Workshops [Electronic resource]. – 2019. – Mode of access: http://openaccess.thecvf.com/content_CVPRW_2019/papers/UAVision/Unel_The_Power_of_Tiling_for_Small_Object_Detection_CVPRW_2019_paper.pdf. 
– Date of access: 18.01.2020.</mixed-citation><mixed-citation xml:lang="en">Unel F. O., Ozkalayci B., Çigla C. The power of tiling for small object detection. CVPR Workshops, 2019. Available at: http://openaccess.thecvf.com/content_CVPRW_2019/papers/UAVision/Unel_The_Power_of_Tiling_for_Small_Object_Detection_CVPRW_2019_paper.pdf. (accessed 18.01.2020).</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Region-based convolutional networks for accurate object detection and segmentation / R. Girshick [et al.] // IEEE Transactions on Pattern Analysis and Machine Intelligence. – 2016. – Vol. 38. – P. 142–158. https://doi.org/10.1109/TPAMI.2015.2437384</mixed-citation><mixed-citation xml:lang="en">Girshick R., Donahue J., Darrell T., Malik J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, vol. 38, pp. 142–158. https://doi.org/10.1109/TPAMI.2015.2437384</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Deep residual learning for image recognition / K. He [et al.] // Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 27–30 June 2016. – Las Vegas, 2016. – P. 770–778. https://doi.org/10.1109/CVPR.2016.90</mixed-citation><mixed-citation xml:lang="en">He K., Zhang X., Ren S., Sun J. Deep residual learning for image recognition. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 27–30 June 2016. Las Vegas, 2016, pp. 770–778. https://doi.org/10.1109/CVPR.2016.90</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">You only look once: unified, real-time object detection / J. Redmon [et al.] // Proc. of IEEE Conf. 
on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 27–30 June 2016. – Las Vegas, 2016. – P. 779–788. https://doi.org/10.1109/CVPR.2016.91</mixed-citation><mixed-citation xml:lang="en">Redmon J., Divvala S. K., Girshick R. B., Farhadi A. You only look once: unified, real-time object detection. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 27–30 June 2016. Las Vegas, 2016, pp. 779–788. https://doi.org/10.1109/CVPR.2016.91</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Girshick, R. Fast R-CNN / R. Girshick // Proc. of IEEE Intern. Conf. on Computer Vision (ICCV), Santiago, Chile, 11–18 Dec. 2015. – Santiago, 2015. – P. 1440–1448. https://doi.org/10.1109/ICCV.2015.169</mixed-citation><mixed-citation xml:lang="en">Girshick R. Fast R-CNN. Proceedings of IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 11–18 December 2015. Santiago, 2015, pp. 1440–1448. https://doi.org/10.1109/ICCV.2015.169</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Faster R-CNN: towards real-time object detection with region proposal networks / S. Ren [et al.] // IEEE Transactions on Pattern Analysis and Machine Intelligence. – 2017. – Vol. 39, no. 6. – P. 1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031</mixed-citation><mixed-citation xml:lang="en">Ren S., He K., Girshick R., Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, vol. 39, no. 6, pp. 1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Глубокое обучение для детектирования объектов на изображениях документов / А. А. Крощенко и др. 
// Вестник БрГТУ. Физика, математика, информатика. – 2017. – № 5(107). – С. 2–9.</mixed-citation><mixed-citation xml:lang="en">Kroshchenko A., Golovko V., Bezobrazov S., Mikhno E., Khatskevich M, …, Brich A. Glubokoe obuchenie dlia detektirovaniia obieektov na izobrazheniiakh dokumentov [Deep training for detecting of objects at images of documents]. Vestnik Brestskogo gosudarstvennogo tekhnicheskogo universiteta. Fizika, matematika, informatika [Bulletin of the Brest State Technical University. Physics, mathematics, Computer Science], 2017, vol. 5 (107), pp. 2–9 (in Russian).</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Inception-v4, inception-ResNet and the impact of residual connections on learning / C. Szegedy [et al.] // Proc. of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), San Francisco, California, USA, 4–9 Febr. 2017. – San Francisco, 2017. – P. 4278–4284.</mixed-citation><mixed-citation xml:lang="en">Szegedy C., Ioffe S., Vanhoucke V. Inception-v4, inception-ResNet and the impact of residual connections on learning. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), San Francisco, California, USA, 4–9 February 2017. San Francisco, 2017, pp. 4278–4284.</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">The pascal Visual Object Classes (VOC) challenge / M. Everingham [et al.] // Intern. J. of Computer Vision. – 2010. – Vol. 88. – P. 303–338. https://doi.org/10.1007/s11263-009-0275-4</mixed-citation><mixed-citation xml:lang="en">Everingham M., Van Gool L., Williams C., Winn J., Zisserman A. The pascal Visual Object Classes (VOC) challenge. International Journal of Computer Vision, 2010, vol. 88, pp. 303–338. 
https://doi.org/10.1007/s11263-009-0275-4</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that they have no conflicts of interest.</p></fn></fn-group></back></article>
