Objective measurement of temporally localized distortions

Access & Terms of Use
open access
Copyright: Lu, Wenliang
Abstract
Evaluating speech quality objectively has been the “Holy Grail” of digital speech processing for the last 50 years. Assessing speech quality with objective measures computed by algorithms provides greater efficiency and reliability than subjective assessment. However, existing methods such as SNR, LSD, BSD, PESQ and their variants either lack sufficient accuracy or fail to handle a comprehensive range of scenarios. One intrinsic problem with these methods is their reliance on a uni-dimensional quality classification schema. Recent speech quality research has converged on the notion that speech quality is a multi-dimensional space. Research by Voiers, Sen and Hall has shown that speech quality can be adequately described using three orthogonal dimensions, whose axes correspond to temporally-localised distortions, frequency-localised distortions, and distortions not attributed to the first two. This thesis explores the prediction of temporally-localised distortions, which have been shown to account for 55% of the variance of the overall quality. Various features extracted from spectrograms, psychoacoustic masking models and non-linear cochlear models are explored to develop a robust representation of temporally-localised distortions. Features extracted from the non-linear cochlear model yield the best results, achieving correlation coefficients higher than 0.9 with respect to subjective scores.
Author(s)
Lu, Wenliang
Supervisor(s)
Sen, Deep
Publication Year
2012
Resource Type
Thesis
Degree Type
PhD Doctorate
Files
download whole.pdf 5.16 MB Adobe Portable Document Format