Improving person re-identification based on two-stage training of convolutional neural networks and augmentation
https://doi.org/10.37661/1816-0301-2023-20-1-40-54
Abstract
Objectives. The main goal is to improve person re-identification accuracy in distributed video surveillance systems.
Methods. Machine learning methods are applied.
Result. A two-stage training technology for convolutional neural networks (CNNs) is presented. At the first, preliminary stage the network is trained on augmented data; at the second stage its weight coefficients are fine-tuned on the original training images, which reduces the loss and improves model efficiency. Because different data are used at the two stages, the CNN cannot simply memorize the training examples, which helps prevent overfitting.
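The two-stage schedule described above can be sketched in a framework-agnostic way. This is only an illustration under stated assumptions, not the authors' implementation: the function `two_stage_train`, the `ToyModel` stand-in for a CNN, the `jitter` augmentation, the epoch counts, and the lower learning rate at the second stage are all hypothetical choices made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def two_stage_train(
    train_epoch: Callable[[Sequence[T], float], float],
    originals: Sequence[T],
    augment: Callable[[T, random.Random], T],
    stage1_epochs: int = 3,      # assumed value, not from the paper
    stage2_epochs: int = 2,      # assumed value, not from the paper
    stage1_lr: float = 0.1,      # assumed value, not from the paper
    stage2_lr: float = 0.01,     # assumed: smaller steps while fine-tuning
    seed: int = 0,
) -> List[Tuple[str, float]]:
    """Run stage 1 on augmented samples, then stage 2 on the originals."""
    rng = random.Random(seed)
    history: List[Tuple[str, float]] = []

    # Stage 1: preliminary training sees only augmented copies,
    # regenerated every epoch so the exact inputs never repeat.
    for _ in range(stage1_epochs):
        augmented = [augment(sample, rng) for sample in originals]
        history.append(("augmented", train_epoch(augmented, stage1_lr)))

    # Stage 2: fine-tuning sees only the untouched original samples.
    for _ in range(stage2_epochs):
        history.append(("original", train_epoch(originals, stage2_lr)))

    return history


class ToyModel:
    """Hypothetical stand-in for a CNN: a single weight fitted by SGD."""

    def __init__(self) -> None:
        self.w = 0.0

    def train_epoch(self, data: Sequence[float], lr: float) -> float:
        total = 0.0
        for x in data:
            err = self.w - x
            total += err * err
            self.w -= lr * 2.0 * err   # gradient step on squared error
        return total / len(data)       # mean loss over the epoch


def jitter(x: float, rng: random.Random) -> float:
    """Hypothetical augmentation: add small uniform noise."""
    return x + rng.uniform(-0.5, 0.5)


model = ToyModel()
history = two_stage_train(model.train_epoch, [1.0, 2.0, 3.0], jitter)
for stage, loss in history:
    print(f"{stage:>9}: mean loss {loss:.4f}")
```

With a real network, `train_epoch` would wrap one pass of the chosen deep learning framework's training loop, and `augment` would be an image transformation.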
The proposed augmentation method for expanding the training set is distinguished by combining a cyclic shift of image pixels, color exclusion, and replacement of an image fragment with a reduced copy of another image. It yields a wide variety of training data, which increases the robustness of the CNN to occlusions, illumination changes, low image resolution, and dependence on feature location.
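The three operations named above can be illustrated with a minimal pure-Python sketch on images stored as nested lists of RGB tuples. Several details are assumptions made for the example and are not stated in the abstract: "color exclusion" is interpreted as conversion to grayscale, it is applied with probability 0.5, the pasted thumbnail is half the image size, and the operations are applied in the order shown. All function names are hypothetical.

```python
import random
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def cyclic_shift(img: Image, dy: int, dx: int) -> Image:
    """Shift pixels down by dy and right by dx, wrapping around the edges."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def drop_color(img: Image) -> Image:
    """Assumed reading of 'color exclusion': replace RGB by its gray mean."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            gray = (r + g + b) // 3
            new_row.append((gray, gray, gray))
        out.append(new_row)
    return out


def shrink(img: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbour downscale to out_h x out_w."""
    h, w = len(img), len(img[0])
    return [[img[y * h // out_h][x * w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def paste_thumbnail(img: Image, other: Image, top: int, left: int,
                    th: int, tw: int) -> Image:
    """Replace a th x tw fragment of img with a reduced copy of other."""
    thumb = shrink(other, th, tw)
    out = [list(row) for row in img]   # copy so the input stays unchanged
    for y in range(th):
        for x in range(tw):
            out[top + y][left + x] = thumb[y][x]
    return out


def augment(img: Image, other: Image, rng: random.Random) -> Image:
    """Combine the three operations; order and probabilities are assumed."""
    h, w = len(img), len(img[0])
    out = cyclic_shift(img, rng.randrange(h), rng.randrange(w))
    if rng.random() < 0.5:
        out = drop_color(out)
    th, tw = max(1, h // 2), max(1, w // 2)
    top = rng.randrange(h - th + 1)
    left = rng.randrange(w - tw + 1)
    return paste_thumbnail(out, other, top, left, th, tw)


# Tiny synthetic images: a red gradient and a flat blue image.
img_a: Image = [[(y * 10 + x, 0, 0) for x in range(4)] for y in range(4)]
img_b: Image = [[(0, 0, 255) for _ in range(4)] for _ in range(4)]
result = augment(img_a, img_b, random.Random(0))
```

A production pipeline would normally run the same operations on arrays or tensors; nested lists are used here only to keep the sketch free of dependencies.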
Conclusion. The two-stage training technology and the proposed data augmentation method increased person re-identification accuracy for different CNNs and datasets: by 4–21 % in the Rank1 metric, by 10–31 % in mAP, and by 39–60 % in mINP.
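For readers unfamiliar with the three metrics, the sketch below shows how they are commonly computed from ranked retrieval results. It is an illustration, not the authors' evaluation code: each query is represented by a hypothetical list of 0/1 flags marking which ranked gallery images show the same person, and the camera-based filtering of gallery images used in standard re-identification protocols is omitted.

```python
from typing import Dict, List, Sequence


def rank1(matches: Sequence[int]) -> float:
    """1.0 if the top-ranked gallery image is a correct match, else 0.0."""
    return 1.0 if matches and matches[0] else 0.0


def average_precision(matches: Sequence[int]) -> float:
    """Mean of the precision values taken at each correct match."""
    hits = 0
    total = 0.0
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def inverse_negative_penalty(matches: Sequence[int]) -> float:
    """Number of correct matches divided by the rank of the hardest one."""
    positions = [rank for rank, m in enumerate(matches, start=1) if m]
    if not positions:
        return 0.0
    return len(positions) / positions[-1]


def evaluate(queries: List[Sequence[int]]) -> Dict[str, float]:
    """Average the per-query scores to obtain Rank1, mAP and mINP."""
    n = len(queries)
    return {
        "Rank1": sum(rank1(q) for q in queries) / n,
        "mAP": sum(average_precision(q) for q in queries) / n,
        "mINP": sum(inverse_negative_penalty(q) for q in queries) / n,
    }


# Two hypothetical queries: flags are ordered from best to worst ranking.
scores = evaluate([[1, 0, 1, 0], [0, 1, 0, 0]])
print(scores)
```

mINP penalizes a model for ranking even one correct match far down the list, which is consistent with it showing the largest relative change of the three metrics.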
About the Authors
S. A. Ihnatsyeva
Belarus
Sviatlana A. Ihnatsyeva, M. Sc. (Eng.), Postgraduate Student of the Department of Computing Systems and Networks
st. Blokhina, 29, Novopolotsk, 211440
R. P. Bohush
Belarus
Rykhard P. Bohush, D. Sc. (Eng.), Assoc. Prof., Head of the Department of Computing Systems and Networks
st. Blokhina, 29, Novopolotsk, 211440
For citations:
Ihnatsyeva S.A., Bohush R.P. Improving person re-identification based on two-stage training of convolutional neural networks and augmentation. Informatics. 2023;20(1):40-54. (In Russ.) https://doi.org/10.37661/1816-0301-2023-20-1-40-54