Publication:
Novel likelihood-based inference for symbolic data analysis

dc.contributor.author Lin, Huan en_US
dc.date.accessioned 2022-03-22T18:51:11Z
dc.date.available 2022-03-22T18:51:11Z
dc.date.issued 2018 en_US
dc.description.abstract Symbolic data analysis (SDA) is a relatively new branch of statistics. It has emerged from the need to consider data containing information that cannot be satisfactorily represented and modelled within classical data models. SDA is a new paradigm that extends the classical data models to take into account more complete and complex information, and it serves as an alternative way to tackle "big data" problems by reducing and summarising data of massive size into "classes" of interest. SDA organises multiple unstructured data tables into a single coherent data table containing symbolic-valued variables, often recorded in the form of intervals or histograms. There has been a considerable amount of research in this area, with many of the existing methods built on the assumption of uniformity within a symbol. It has been shown that this uniformity assumption is unrealistic in solving real-world problems. Likelihood functions are fundamental in statistical inference, and to date two likelihood-based methods for SDA have been introduced. However, while these methods have been shown to be beneficial, a number of methodological weaknesses currently limit their potential to become an invaluable tool in a modern statistician's toolkit. To this end, we propose new models for performing likelihood-based inference for SDA. Our approach removes the need to assume uniformity within a symbol (an interval or histogram bin), an assumption that is conventional in the SDA literature. Instead, our approach allows for a natural way of specifying the underlying distribution of the data from which symbolic variables are obtained. As a result, our approach enables statistical inference to be made at the level of the underlying data, which may be more desirable from the point of view of the statistical analyst. In addition, our approach offers an opportunity for statistical analysts to use higher-dimensional symbols to address complex real-world problems.
The new models are demonstrated by simulated case studies. In addition, the proposed symbolic likelihood function for histograms has been applied to improve the analytical results of an existing model in measuring aerosol particle number concentration. en_US
dc.identifier.uri http://hdl.handle.net/1959.4/60751
dc.language English
dc.language.iso EN en_US
dc.publisher UNSW, Sydney en_US
dc.rights CC BY-NC-ND 3.0 en_US
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/3.0/au/ en_US
dc.subject.other interval data en_US
dc.subject.other likelihoods en_US
dc.subject.other binned data en_US
dc.subject.other summary statistics en_US
dc.subject.other symbolic data analysis en_US
dc.subject.other Bayesian hierarchical modelling en_US
dc.title Novel likelihood-based inference for symbolic data analysis en_US
dc.type Thesis en_US
dcterms.accessRights open access
dcterms.rightsHolder Lin, Huan
dspace.entity.type Publication en_US
unsw.accessRights.uri https://purl.org/coar/access_right/c_abf2
unsw.identifier.doi https://doi.org/10.26190/unsworks/20863
unsw.relation.faculty Science
unsw.relation.originalPublicationAffiliation Lin, Huan, Mathematics & Statistics, Faculty of Science, UNSW en_US
unsw.relation.school School of Mathematics & Statistics
unsw.thesis.degreetype PhD Doctorate en_US
Files
Original bundle
Name: public version.pdf
Size: 3.72 MB
Format: application/pdf