Engineering

Publication Search Results

  • (2022) Li, Bingnan
    Thesis
    With the rapid development of geospatial technologies, including remote sensing, mobile devices, and the Global Positioning System (GPS), spatio-temporal data are now abundantly available. Extracting valuable knowledge from spatio-temporal data is crucial for many real-world applications such as intelligent transportation, social services, and intelligent distribution. As the volume and resolution of spatio-temporal data grow rapidly, traditional data mining methods are becoming obsolete. In recent years, deep learning models such as the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) have achieved promising results in many fields thanks to their strong capacity for automated feature extraction, and they have been broadly used in spatio-temporal data mining tasks. Although many methods have been developed and more diverse data have been collected in recent decades, existing methods still face challenges in handling multi-source geospatial data. This thesis investigates four efficient techniques, each for a different scenario, that take advantage of multi-source geospatial data to overcome the limitations of traditional data mining methods. Firstly, a multi-elemental geolocation inference method is proposed to predict the location of tweets without geo-tags. Secondly, an optimization model is proposed to detect multiple Areas-of-Interest (AOIs) simultaneously, solving the multi-AOI detection problem. Thirdly, a multi-task Res-U-Net model with an attention mechanism is developed to extract building roofs and whole building shapes from remote sensing images; an offset vector method then detects the footprints of high-rise buildings from the boundaries of the corresponding roofs and shapes. Lastly, a novel decoder fusion model is introduced to extract interior road networks from remote sensing images and GPS trajectory data, and it proves effective for multi-source data mining. The four proposed methods use different techniques for spatio-temporal data mining to improve detection performance. Numerous experiments show that the techniques developed in this thesis detect ground features efficiently and effectively and overcome the limitations of conventional algorithms. The studies demonstrate that exploiting spatial information from multi-source geospatial data improves detection accuracy compared with single-source geospatial data.

  • (2022) Spooner, Annette
    Thesis
    Clinical data are highly complex and pose challenges to machine learning that can introduce bias or negatively affect performance. Clinical data are typically high-dimensional and of mixed types; they may contain correlated values and missing information, and a large proportion of the data is often irrelevant. Clinical measurements are often repeated over time, and the data may be censored, meaning the disease of interest has not yet been observed. Alzheimer’s disease (AD) is a progressive neurodegenerative disease that is ultimately fatal and has no cure. More than 55 million people worldwide are living with dementia today, with AD thought to account for 60-70% of those cases, and numbers are forecast to triple by 2050. The pathological processes leading to AD begin decades before overt symptoms appear, presenting an opportunity to determine early biomarkers that might help identify individuals at risk of developing AD. Traditionally, the Cox proportional hazards model has been used to analyse censored data. However, the Cox model does not scale well to high dimensions and is limited by some strict assumptions. Consequently, machine learning algorithms have been adapted to handle censored data. This thesis performs a thorough comparison of the performance and stability of the available machine learning and feature selection methods for survival analysis, identifying their strengths and weaknesses. Some of these methods can be unstable in the presence of high-dimensional or correlated data. This thesis examines the reasons for these instabilities and develops new ensemble feature selection frameworks to improve the stability of feature selection. Data-driven thresholds are also developed to automatically separate the important features from the redundant ones, and clustering is used to handle correlated features. Improvements in stability of up to 40% are achieved. Clinical data are often collected repeatedly over time. A novel temporal pattern mining algorithm is developed to analyse these temporal data and is combined with temporal abstraction to find patterns common to those who develop AD. Survival analysis shows that these patterns are predictive of AD, with a C-Index of up to 0.74, and a novel visualisation module displays the clinically relevant results in an easily interpretable way.
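
For readers unfamiliar with the C-Index quoted above, the following is a minimal sketch of how a concordance index can be computed on right-censored data. It is not taken from the thesis: the function name `concordance_index`, its arguments, and the toy values are assumptions made for illustration, and pairs with tied times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Illustrative Harrell-style C-index (hypothetical helper, not the thesis code).

    times       -- observed time for each subject (event or censoring time)
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Reorder the pair so that subject i has the shorter observed time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here for simplicity (an assumption).
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1.0   # earlier event was given the higher risk
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5   # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): the third subject is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

On this reading, a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a C-Index of 0.74 means that in roughly three out of four comparable pairs the subject who developed the disease earlier was assigned the higher risk.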