PSYCHOACOUSTICALLY MOTIVATED TIME-FREQUENCY DICTIONARY BUILDING FOR UNIVERSAL SCALABLE AUDIOCODER BASED ON THE SPARSE APPROXIMATION
Abstract
This article studies the construction of a perceptually motivated dictionary of time-frequency functions, based on a wavelet packet transform optimized for the input signal frame, and its use in a universal scalable real-time audio coder. The relevance of the topic is discussed, with particular attention paid to psychoacoustic modelling. The following algorithms are described: sparse approximation, perceptual adaptation of the wavelet packet decomposition tree, and the input signal encoding and decoding schemes. Experimental results for the developed coding algorithm are presented, together with a comparison against modern coding schemes such as Opus and Vorbis, based on the PEAQ objective quality measure (ODG score).
About the Authors
V. Y. Herasimovich
Belarus
Al. A. Petrovsky
Belarus
For citations:
Herasimovich V.Y., Petrovsky A.A. PSYCHOACOUSTICALLY MOTIVATED TIME-FREQUENCY DICTIONARY BUILDING FOR UNIVERSAL SCALABLE AUDIOCODER BASED ON THE SPARSE APPROXIMATION. Informatics. 2017;(4(56)):89-103. (In Russ.)