Abstract
Species distribution models are useful tools for relating the locations of species in a given region to environmental
factors. This thesis focuses on the modelling of presence-only data, in which information is available about where
species are reported present but not where species are reported absent. The aim of this thesis is to use theoretical
tools from statistics to improve modern methods of presence-only analysis.
This thesis establishes that MAXENT, a popular method in ecology based on maximum entropy, is equivalent to
Poisson point process modelling, a widely used statistical method for analysing spatial point patterns only recently
applied to species distribution modelling. This equivalence result significantly unifies the presence-only analysis
literature and has important ramifications for MAXENT and point process models. Despite its good predictive
performance, MAXENT has shortcomings in interpretation and implementation that can now be overcome. In
particular, MAXENT users can inherit from point process models some well-developed tools for addressing model
adequacy and the ability to model point interactions.
MAXENT's use of a LASSO penalty is known to improve predictive performance. However, the default penalty
chosen by MAXENT software is ad hoc. Another focus of this thesis is implementing LASSO for point process
models, which has rarely been done previously.
This thesis provides an asymptotic result for applying a LASSO penalty to point process models such that consistent
estimates of model parameters and predictions can be achieved. A new consistent criterion for choosing the LASSO
penalty ("MSI") is consequently developed; it is an alternative to the default MAXENT penalty and has better
properties. MSI is found to be competitive with traditional methods of choosing the LASSO penalty and generally
superior to the MAXENT penalty in a broad comparison using real and simulated species data.
This extension of point process models regularised with a LASSO penalty ("PPM-LASSO") therefore represents a
significant advance on current species distribution modelling methods, combining the statistical foundations of point
process models with the strong predictive performance of MAXENT via LASSO penalisation. I have developed the
freely available ppmlasso package for R so that users may now fit PPM-LASSO models.