Publication:
Topics in computational statistics

dc.contributor.advisor Kohn, Robert
dc.contributor.advisor Sisson, Scott
dc.contributor.author Yang, Yu
dc.date.accessioned 2022-10-24T04:28:23Z
dc.date.available 2022-10-24T04:28:23Z
dc.date.issued 2022
dc.date.submitted 2022-10-24T01:38:58Z
dc.description.abstract Research in computational statistics develops numerically efficient methods to estimate statistical models, of which Monte Carlo algorithms are a subset. This thesis develops novel Monte Carlo methods to solve three important problems in Bayesian statistics. For many complex models, it is prohibitively expensive to run simulation methods such as Markov chain Monte Carlo (MCMC) on the model directly when the likelihood function includes an intractable term or is computationally challenging in some other way. The first two topics investigate models having such likelihoods. The third topic proposes a novel model to address a popular question in causal inference, which requires solving a computationally challenging problem. The first application is to symbolic data analysis, where classical data are summarised and represented as symbolic objects. The likelihood function of such aggregate-level data is often intractable as it usually includes a high-dimensional integral with large exponents. Bayesian inference on symbolic data is carried out in the thesis by using a pseudo-marginal method, which replaces the likelihood function with its unbiased estimate. The second application is to doubly intractable models, where the likelihood includes an intractable normalising constant. The pseudo-marginal method is combined with the introduction of an auxiliary variable to obtain simulation-consistent inference. The proposed algorithm offers a generic solution to a wider range of problems for which existing methods are often impractical because the assumptions required for their application do not hold. The last application is to causal inference using Bayesian additive regression trees (BART), a non-parametric Bayesian regression technique. The likelihood function is complex as it is based on a sum of trees whose structures change dynamically with the MCMC iterates. An extension to BART is developed to estimate the heterogeneous treatment effect, aiming to overcome the regularisation-induced confounding issue which is often observed in the direct application of BART in causal inference.
dc.identifier.uri http://hdl.handle.net/1959.4/100721
dc.language English
dc.language.iso en
dc.publisher UNSW, Sydney
dc.rights CC BY 4.0
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject.other symbolic data analysis
dc.subject.other doubly intractable models
dc.subject.other causal inference
dc.subject.other pseudo-marginal methods
dc.title Topics in computational statistics
dc.type Thesis
dcterms.accessRights open access
dcterms.rightsHolder Yang, Yu
dspace.entity.type Publication
unsw.accessRights.uri https://purl.org/coar/access_right/c_abf2
unsw.contributor.advisorExternal Quiroz, Matias; Department of Statistics, Stockholm University
unsw.date.workflow 2022-10-24
unsw.identifier.doi https://doi.org/10.26190/unsworks/24428
unsw.relation.faculty Business
unsw.relation.faculty Science
unsw.relation.school School of Mathematics & Statistics
unsw.relation.school School of Economics
unsw.subject.fieldofresearchcode 490503 Computational statistics
unsw.subject.fieldofresearchcode 380202 Econometric and statistical methods
unsw.thesis.degreetype PhD Doctorate
Files
Original bundle
Name: public version.pdf
Size: 2.24 MB
Format: application/pdf
Resource type