Mixing it up: new methods for finite mixture modelling of multi-species data in ecology

Access & Terms of Use
open access
Copyright: Hui, Francis
Abstract
In this thesis, new methods are proposed for using finite mixture models to analyse multi-species data in ecology. Developments range from theoretical results to empirical studies, offering contributions to the literatures of finite mixture models, species distribution models, variable selection, cluster analysis, and ordination. To begin, a comparison on several real datasets demonstrates that mixture models predict how species communities respond to the environment better than models fitted to each species separately. This is achieved by borrowing strength across species: organisms with similar environmental responses are clustered together, forming a small number of archetypal responses. A major challenge in applying mixture models generally is model selection, and two important contributions are made on this front. On how to choose the number of mixture components, complete likelihood information criteria (despite being a popular approach) are shown to potentially underfit the true number of components. As an alternative, a new observed likelihood information criterion is proposed and proven to be order consistent. On how to choose the variables to enter into each component, two new penalties are proposed that exploit the grouped structure of covariates in mixture of regression models, leading to desirable asymptotic and finite sample properties. The performance of all penalised likelihood methods depends critically on the choice of tuning parameter. In the case of adaptive LASSO regression, a new information criterion is proposed for tuning parameter selection that, unlike previous criteria, explicitly accounts for the effect of penalisation on the bias-variance tradeoff. The proposed criterion is shown to outperform many currently used criteria in selecting the tuning parameter. Apart from prediction, multi-species data are commonly analysed using cluster analysis and unconstrained ordination, to study how species composition varies spatially. To this end, a model-based approach to unconstrained ordination is proposed using latent variable models. This approach is then integrated with finite mixture models to produce a unified framework for simultaneous clustering and ordination. Examples and simulations demonstrate the advantages of model-based approaches over distance-based methods. The thesis concludes by discussing several extensions to the proposed methods, with further applications to multi-species data.
Author(s)
Hui, Francis
Supervisor(s)
Warton, David
Foster, Scott
Publication Year
2014
Resource Type
Thesis
Degree Type
PhD Doctorate