Informatics

ISSN 1816-0301, 2617-6963

UIIP NASB

10.37661/1816-0301-2023-20-1-40-54

inform-1225

Research Article

INTELLIGENT SYSTEMS

Improving person re-identification based on two-stage training of convolutional neural networks and augmentation

https://orcid.org/0000-0002-9780-5731

Ihnatsyeva S. A.

Sviatlana A. Ihnatsyeva, M. Sc. (Eng.), Postgraduate Student of the Department of Computing Systems and Networks

29 Blokhina St., Novopolotsk, 211440

s.ignatieva@psu.by

https://orcid.org/0000-0002-6609-5810

Bohush R. P.

Rykhard P. Bohush, D. Sc. (Eng.), Assoc. Prof., Head of the Department of Computing Systems and Networks

29 Blokhina St., Novopolotsk, 211440

r.bogush@psu.by

Euphrosyne Polotskaya State University of Polotsk

2023, vol. 20, no. 1, pp. 40–54

29.03.2023

Ihnatsyeva S.A., Bohush R.P.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://inf.grid.by/jour/article/view/1225

Objectives. The main goal is to improve the accuracy of person re-identification in distributed video surveillance systems.

Methods. Machine learning methods are applied.

Results. A two-stage training technique for convolutional neural networks (CNNs) is presented. It uses image augmentation at the preliminary stage and fine-tunes the weight coefficients on the original set of training images. In the first stage, the network is trained on augmented data; in the second stage, the CNN is fine-tuned on the original images, which reduces the training loss and improves re-identification performance. Because the two stages use different data, the CNN cannot memorize the training examples, which helps prevent overfitting.
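The two-stage schedule described above can be illustrated with a deliberately tiny toy model. This is a minimal sketch and not the authors' implementation: a single weight fitted by plain SGD stands in for the CNN, and the function names, epoch counts, learning rates and the input-scaling "augmentation" are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]  # (input x, target y) for a toy 1-D regression


def sgd_epoch(w: float, data: Sequence[Sample], lr: float) -> float:
    """Run one epoch of plain SGD for the model y = w * x (squared error)."""
    for x, y in data:
        w -= lr * 2.0 * (w * x - y) * x
    return w


def two_stage_train(
    original: Sequence[Sample],
    augment: Callable[[Sample], Sample],
    pre_epochs: int = 20,      # hypothetical value
    fine_epochs: int = 10,     # hypothetical value
    pre_lr: float = 0.05,      # hypothetical value
    fine_lr: float = 0.01,     # hypothetical value
) -> Tuple[float, float]:
    """Return (weight after stage 1, weight after stage 2)."""
    # Stage 1: pre-train on augmented samples only.
    augmented: List[Sample] = [augment(s) for s in original]
    w = 0.0
    for _ in range(pre_epochs):
        w = sgd_epoch(w, augmented, pre_lr)
    w_pre = w

    # Stage 2: fine-tune the same weight on the untouched originals,
    # here with a smaller learning rate (an assumption of this sketch).
    for _ in range(fine_epochs):
        w = sgd_epoch(w, original, fine_lr)
    return w_pre, w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # true relation: y = 2x
# Toy "augmentation": shrink the input by 10 % and keep the label.
w_pre, w_fine = two_stage_train(data, lambda s: (0.9 * s[0], s[1]))
print(f"after stage 1: {w_pre:.4f}, after stage 2: {w_fine:.4f}")
```

In this toy setting the first stage settles on a weight biased by the distorted inputs, and the second stage pulls it back toward the value that fits the original data. It shows only the order of the two stages, not the regularizing effect reported for CNNs.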

The proposed method for expanding the training set combines a cyclic shift of image pixels, removal of color, and replacement of an image fragment with a reduced copy of another image from the batch fed to the CNN. This augmentation increases the diversity of the training data, which makes the CNN more robust to occlusion of people, changes in illumination, reduced image resolution, and dependence on the location of an object's distinctive features.
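The three operations named above can be sketched in plain Python on a small RGB image stored as nested lists. This is an illustrative sketch under stated assumptions, not the authors' code: the order of the operations, applying all three to every image, the luma weights, nearest-neighbour downscaling, the thumbnail scale and the random placement are all choices made here for the example.

```python
import random
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def cyclic_shift(img: Image, dx: int, dy: int) -> Image:
    """Roll pixels right by dx and down by dy with wrap-around."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)]
            for r in range(h)]


def to_gray(img: Image) -> Image:
    """Remove color: replace each pixel by its luma in all three channels."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            new_row.append((y, y, y))
        out.append(new_row)
    return out


def thumbnail(img: Image, th: int, tw: int) -> Image:
    """Nearest-neighbour downscale to th x tw pixels."""
    h, w = len(img), len(img[0])
    return [[img[r * h // th][c * w // tw] for c in range(tw)]
            for r in range(th)]


def paste(img: Image, patch: Image, top: int, left: int) -> Image:
    """Return a copy of img with patch written at (top, left)."""
    out = [list(row) for row in img]
    for r, patch_row in enumerate(patch):
        for c, px in enumerate(patch_row):
            out[top + r][left + c] = px
    return out


def augment(img: Image, donor: Image, rng: random.Random,
            scale: float = 0.5) -> Image:
    """Shift, drop color, then paste a reduced copy of `donor`.

    `donor` stands for another image of the same batch; `scale` (<= 1)
    is a hypothetical parameter for the size of the reduced copy.
    """
    h, w = len(img), len(img[0])
    out = cyclic_shift(img, rng.randrange(w), rng.randrange(h))
    out = to_gray(out)
    th, tw = max(1, int(h * scale)), max(1, int(w * scale))
    small = thumbnail(donor, th, tw)
    top = rng.randrange(h - th + 1)
    left = rng.randrange(w - tw + 1)
    return paste(out, small, top, left)


# Usage on two tiny 4x4 images standing in for a batch of two.
image_a = [[(r * 60, c * 60, 128) for c in range(4)] for r in range(4)]
image_b = [[(0, 200, 0)] * 4 for _ in range(4)]
result = augment(image_a, image_b, random.Random(0))
print(len(result), "x", len(result[0]))
```

In a real pipeline the same steps would normally run on array or tensor batches; nested lists are used here only to keep the sketch free of dependencies.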

Conclusion. Two-stage training combined with the proposed data augmentation method increased person re-identification accuracy for different CNNs and datasets: Rank1 by 4–21 %, mAP by 10–31 %, and mINP by 39–60 %.
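For readers unfamiliar with the three metrics, the following minimal sketch shows how Rank1, mAP and mINP are commonly computed from ranked retrieval results. It assumes each query is already represented as a list of booleans over the gallery sorted by decreasing similarity, with any same-camera filtering applied beforehand. The function names are hypothetical, and the INP formula follows the usual definition (number of correct matches divided by the rank of the last correct match).

```python
from typing import Dict, List, Sequence


def rank1(matches: Sequence[bool]) -> float:
    """1.0 if the top-ranked gallery item is a correct match, else 0.0."""
    return 1.0 if matches and matches[0] else 0.0


def average_precision(matches: Sequence[bool]) -> float:
    """Mean of the precision values taken at each correct match."""
    hits, total = 0, 0.0
    for rank, ok in enumerate(matches, start=1):
        if ok:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def inverse_negative_penalty(matches: Sequence[bool]) -> float:
    """Number of correct matches divided by the rank of the hardest one."""
    positions = [rank for rank, ok in enumerate(matches, start=1) if ok]
    return len(positions) / positions[-1] if positions else 0.0


def evaluate(all_matches: List[Sequence[bool]]) -> Dict[str, float]:
    """Average the per-query scores over all queries."""
    n = len(all_matches)
    return {
        "Rank1": sum(rank1(m) for m in all_matches) / n,
        "mAP": sum(average_precision(m) for m in all_matches) / n,
        "mINP": sum(inverse_negative_penalty(m) for m in all_matches) / n,
    }


# Two toy queries: True marks a gallery image of the same person.
queries = [
    [True, False, True, False],
    [False, True, False, False],
]
scores = evaluate(queries)
print({name: round(value, 4) for name, value in scores.items()})
```

mINP penalizes a ranking whose last correct match appears late, which is why it can change much more than Rank1 when the hardest matches move up the list.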

Keywords: person re-identification, convolutional neural network, pre-training, fine-tuning, augmentation

The authors declare that there are no conflicts of interest present.