1. Yoo I.-C., Lim H., Yook D. Formant-based robust voice activity detection. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2015, vol. 23, no. 12, pp. 2238-2245. https://doi.org/10.1109/TASLP.2015.2476762
2. Pang J. Spectrum energy based voice activity detection. The 7th IEEE Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, 9-11 January 2017. Las Vegas, 2017, pp. 1-5. https://doi.org/10.1109/CCWC.2017.7868454
3. Kinnunen T., Chernenko E., Tuononen M., Fränti P., Li H. Voice activity detection using MFCC features and support vector machine. The 12th International Conference on Speech and Computer (SPECOM07), Moscow, Russia, 15-18 October 2007. Moscow, 2007, vol. 2, pp. 556-561.
4. Zazo R., Sainath T. N., Simko G., Parada C. Feature learning with raw-waveform CLDNNs for voice activity detection. 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, 8-12 September 2016. San Francisco, 2016, pp. 3668-3672. https://doi.org/10.21437/Interspeech.2016-268
5. Zhang X., Wu J. Denoising deep neural networks based voice activity detection. International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26-31 May 2013. Vancouver, 2013, pp. 853-857. https://doi.org/10.1109/ICASSP.2013.6637769
6. Hughes T., Mierle K. Recurrent neural networks for voice activity detection. International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26-31 May 2013. Vancouver, 2013, pp. 7378-7382. https://doi.org/10.1109/ICASSP.2013.6639096
7. Eyben F., Weninger F., Squartini S., Schuller B. Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies. International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26-31 May 2013. Vancouver, 2013, pp. 483-487. https://doi.org/10.1109/ICASSP.2013.6637694
8. Wang Q., Du J., Bao X., Wang Z.-R., Dai L.-R., Lee C.-H. A universal VAD based on jointly trained deep neural networks. 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, 6-10 September 2015. Dresden, 2015, pp. 2282-2286.
9. Ryant N., Liberman M., Yuan J. Speech activity detection on YouTube using deep neural networks. 14th Annual Conference of the International Speech Communication Association, Lyon, France, 25-29 August 2013. Lyon, 2013, pp. 728-731.
10. Snyder D., Chen G., Povey D. MUSAN: a Music, Speech, and Noise Corpus, 2015. Available at: https://arxiv.org/abs/1510.08484 (accessed 20.10.2019).
11. Kasi K., Zahorian S. A. Yet another algorithm for pitch tracking. International Conference on Acoustics, Speech, and Signal Processing, Orlando, 13-17 May 2002. Orlando, 2002, vol. 1, pp. 361-364. https://doi.org/10.1109/ICASSP.2002.5743729
12. Kingma D. P., Ba J. Adam: a Method for Stochastic Optimization, 2014. Available at: https://arxiv.org/abs/1412.6980 (accessed 20.10.2019).