References

inform

Информатика

Informatics

1816-0301

2617-6963

UIIP NASB

inform-152

Research Article

SIGNAL, IMAGE, SPEECH, TEXT PROCESSING AND PATTERN RECOGNITION

ADAPTIVE LEARNING OF HIDDEN MARKOV MODELS FOR EMOTIONAL SPEECH

Ткаченя

А. В.

Tkachenia

A. V.

tkachenia@gmail.com

2014

06.10.2016

032127

2016

Ткаченя А.В.

Tkachenia A.V.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://inf.grid.by/jour/article/view/152

An on-line unsupervised algorithm for estimating the parameters of hidden Markov models (HMMs) is presented. The problem of adapting HMMs to emotional speech is addressed. To increase the reliability of the re-estimated HMM parameters, a forgetting and updating mechanism is proposed. A functional block diagram of the HMM adaptation algorithm is given, together with results showing improved recognition of emotional speech.
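The abstract does not spell out how the forgetting and updating mechanism works, so the following is only a minimal sketch of one common scheme, exponential forgetting of accumulated statistics, applied to the mean of a single Gaussian state. It is not the author's algorithm. The class name `ForgettingMean`, the forgetting factor `rho`, and the occupation probability `gamma` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ForgettingMean:
    """Running mean of one Gaussian state with exponential forgetting.

    Illustrative only: a generic recursive estimator, not the
    algorithm described in the article.
    """
    rho: float           # forgetting factor in (0, 1]; 1.0 means no forgetting
    weight: float = 0.0  # decayed sum of occupation probabilities
    total: float = 0.0   # decayed sum of gamma-weighted observations

    def update(self, x: float, gamma: float) -> float:
        """Decay the old statistics, then fold in observation x.

        gamma is the (assumed given) probability that the state
        produced x, e.g. from a forward-backward pass.
        """
        self.weight = self.rho * self.weight + gamma
        self.total = self.rho * self.total + gamma * x
        return self.mean

    @property
    def mean(self) -> float:
        """Current mean estimate; 0.0 before any data has been seen."""
        return self.total / self.weight if self.weight > 0 else 0.0


if __name__ == "__main__":
    plain = ForgettingMean(rho=1.0)    # ordinary running average
    fading = ForgettingMean(rho=0.5)   # older data counts half as much
    for x in (0.0, 4.0):
        plain.update(x, gamma=1.0)
        fading.update(x, gamma=1.0)
    print(plain.mean, fading.mean)
```

With `rho = 1.0` the estimator reduces to an ordinary weighted average; smaller values let recent observations dominate, which is the usual motivation for forgetting when the speech characteristics drift over time.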

References

Baum, L.E. An inequality and associated maximization techniques in statistical estimation for probabilistic functions of Markov processes / L.E. Baum // Inequalities. – 1972. – № 3. – P. 1–8.

A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains / L.E. Baum [et al.] // Ann. Math. Stat. – 1970. – № 41. – P. 164–171.

Juang, B.-H. Maximum likelihood estimation for multivariate mixture observations of Markov chains / B.-H. Juang, S.E. Levinson, M.M. Sondhi // IEEE Trans. Inform. Theory. – 1986. – № 2. – P. 307–309.

Liporace, L.R. Maximum likelihood estimation for multivariate observations of Markov sources / L.R. Liporace // IEEE Trans. Inform. Theory. – 1982. – № 28. – P. 729–734.

Gauvain, J.-L. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains / J.-L. Gauvain, C.-H. Lee // IEEE Trans. Speech Audio Processing. – 1994. – № 2. – P. 291–298.

Huo, Q. Bayesian adaptive learning of the parameters of hidden Markov model for speech recognition / Q. Huo, C. Chan, C.-H. Lee // IEEE Trans. Speech Audio Processing. – 1995. – № 5. – P. 334–345.

Lee, C.-H. A study on speaker adaptation of the parameters of continuous density hidden Markov models / C.-H. Lee, C.-H. Lin, B.-H. Juang // IEEE Trans. Signal Processing. – 1991. – № 39. – P. 806–814.

Matsuoka, T. A study of on-line Bayesian adaptation for HMM-based speech recognition / T. Matsuoka, C.-H. Lee // Proc. EUROSPEECH-93. – Berlin, Germany, 1993. – P. 815–818.

Huo, Q. On-Line Adaptive Learning of the Continuous Density Hidden Markov Model Based on Approximate Recursive Bayes Estimate / Q. Huo, C.-H. Lee // IEEE Trans. Speech Audio Processing. – 1997. – № 5. – P. 161–172.

Rylov, A.S. Speech Analysis in Recognition Systems / A.S. Rylov. – Minsk : Bestprint, 2003. – 264 p. (in Russian)

Bilmes, J. A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models / J. Bilmes // International Computer Science Institute. – 1998. – № 1. – P. 164–191.

The HTK Book (for HTK v. 3.4) / S. Young [et al.]. – Cambridge University Engineering Department, 2006. – 359 p.

Krishnamurthy, V. On-line estimation of hidden Markov model parameters based on the Kullback-Leibler information measure / V. Krishnamurthy, J.B. Moore // IEEE Trans. Signal Processing. – 1993. – № 41 (8). – P. 2557–2573.

Weinstein, E. Sequential algorithms for parameter estimation based on the Kullback-Leibler information measure / E. Weinstein, M. Feder, A.V. Oppenheim // IEEE Trans. Acoust., Speech, Signal Processing. – 1990. – № 38 (9). – P. 1652–1654.

MULTEXT-J. Japanese MULTEXT Prosodic Corpus [Electronic resource]. – Mode of access : http://research.nii.ac.jp/src/en/MULTEXT-J.html. – Date of access : 30.09.2013.

Bou-Ghazale, S.E. A Comparative Study of Traditional and Newly Proposed Features for Recognition of Speech Under Stress / S.E. Bou-Ghazale, J.H.L. Hansen // IEEE Trans. Speech Audio Processing. – 2000. – № 8. – P. 429–442.

K-fold cross-validation. Wikipedia [Electronic resource]. – Mode of access : http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29. – Date of access : 18.08.2014.

The authors declare that there are no conflicts of interest present.