Automatic assessment of depression from speech: paralinguistic analysis, modelling and machine learning

Access & Terms of Use
open access
Copyright: Cummins, Nicholas
Abstract
Clinical depression is a prominent cause of disability and burden worldwide. Despite this prevalence, the diagnosis of depression, due to its complex clinical characterisation, is a difficult and time-consuming task. Currently, there is a real need for an automated and objective diagnostic aid for use in primary care settings and specialist clinics. Accordingly, this thesis investigates the use of paralinguistic cues for automatically assessing a speaker’s level of depression. Investigations are undertaken to establish the effects of depression in spectral representations of speech and their subsequent acoustic models. A novel Probabilistic Acoustic Volume (PAV) method for robustly estimating Acoustic Volume is presented, and a Monte Carlo approximation that enables the computation of this measure is outlined. Results indicate that reductions in spectral variations can quantitatively characterise speech affected by depression. Within the acoustic models the following statistically significant findings are made across two key datasets: reductions in localised acoustic variance, a flattening of the acoustic trajectory and reductions in three different Acoustic Volume measures. Further results gained using an array of PAV points give strong statistical evidence that the spectral feature space also becomes more concentrated. Together, these observations demonstrate that there is a reduction in the local and global spread of phonetic events in acoustic space in speech affected by depression. A range of novel approaches for performing depression prediction are also investigated. A comprehensive series of acoustic supervector experiments demonstrates the suitability of the Kullback-Leibler divergence based representation for the task and highlights the difficulties of performing nuisance mitigation within this paradigm. A further series of tests opens up the possibilities for using Relevance Vector Machines when predicting depression using a brute-forced feature space.
Of particular interest are tests performed using a novel two-stage rank regression framework designed specifically for regression analysis using ordinal depression scores. Three unique implementations are shown to match or outperform corresponding conventional regression systems. Further results highlight the benefits of using the framework; most notably, in contrast to conventional regressor fusion, score-level fusion of the two-stage systems consistently improves prediction performance.
Author(s)
Cummins, Nicholas
Supervisor(s)
Epps, Julien
Sethu, Vidhyasaharan
Publication Year
2016
Resource Type
Thesis
Degree Type
PhD Doctorate
Files
public version.pdf (6.46 MB, Adobe Portable Document Format)