Robust, Efficient and Scalable Learning of Point Process

Access & Terms of Use
open access
Copyright: Chang, Yongzhe
Abstract
This thesis describes novel approaches to learning from time series and point processes, including latent pattern learning, classification, clustering and event prediction, with similarity learning as the common theme. In machine learning and data analysis, time series and point processes, which can be categorised as marked or unmarked sequences according to whether their events carry marks, are frequently used to model real-world data in applications such as financial services and healthcare. Learning latent patterns and predicting future events in these domains from current observations is an important task, and the choice of distance measure and of clustering or classification algorithm plays a critical role in it. However, analysis of real-world time series and point processes is easily distorted by noise points, and clustering and event prediction methods that work on both marked and unmarked processes are restricted by many conditions and are inefficient. In this thesis, novel models are proposed for point process analysis to achieve robustness, efficiency and scalability. The contributions of this thesis are threefold. Firstly, noise is eliminated in similarity learning on non-homogeneous Poisson processes (NHPPs), so that the true distance among NHPPs is recovered in the presence of noise. Secondly, an efficient similarity measure is defined for the NHPP clustering problem, and a more efficient similarity learning method and a more appropriate clustering model are explored. This leads to a more accurate and effective method for clustering NHPPs without knowing the number of clusters in advance. Thirdly, while NHPPs are one-dimensional, another type of point process, the marked point process (MPP), which has at least two dimensions, is considered.
Extending the analysis to MPPs, an advanced neural network model, the Graph Convolutional Network (GCN), is used to explore the relationships among different marks in an MPP, to cluster MPPs and to predict future events. The developed methods have been evaluated on both synthetic and real-world datasets. The empirical study shows that the developed methods are efficient and effective, and that they significantly outperform recent state-of-the-art methods.
Author(s)
Chang, Yongzhe
Publication Year
2019
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty