Publication Search Results
(2022) Yang, Yu. Thesis.
Research in computational statistics develops numerically efficient methods to estimate statistical models, and Monte Carlo algorithms are a subset of such methods. This thesis develops novel Monte Carlo methods to solve three important problems in Bayesian statistics. For many complex models, it is prohibitively expensive to run simulation methods such as Markov chain Monte Carlo (MCMC) on the model directly when the likelihood function includes an intractable term or is computationally challenging in some other way. The first two topics investigate models having such likelihoods. The third topic proposes a novel model to address a popular question in causal inference, which requires solving a computationally challenging problem. The first application is to symbolic data analysis, where classical data are summarised and represented as symbolic objects. The likelihood function of such aggregated-level data is often intractable as it usually includes a high-dimensional integral with large exponents. Bayesian inference on symbolic data is carried out in the thesis by using a pseudo-marginal method, which replaces the likelihood function with an unbiased estimate of it. The second application is to doubly intractable models, where the likelihood includes an intractable normalising constant. The pseudo-marginal method is combined with the introduction of an auxiliary variable to obtain simulation-consistent inference. The proposed algorithm offers a generic solution to a wider range of problems, where the existing methods are often impractical as the assumptions required for their application do not hold. The last application is to causal inference using Bayesian additive regression trees (BART), a non-parametric Bayesian regression technique. The likelihood function is complex as it is based on a sum of trees whose structures change dynamically with the MCMC iterates.
An extension to BART is developed to estimate the heterogeneous treatment effect, aiming to overcome the regularisation-induced confounding issue which is often observed in the direct application of BART in causal inference.
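The pseudo-marginal idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the toy latent-variable model, the prior, and every name below (`likelihood_estimate`, `pseudo_marginal_mh`, `n_draws`, `step`) are assumptions chosen for illustration. The sketch runs a random-walk Metropolis-Hastings sampler in which the intractable likelihood is replaced by an unbiased Monte Carlo estimate, and the estimate for the current state is stored and reused rather than recomputed.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def likelihood_estimate(theta, data, n_draws, rng):
    """Unbiased Monte Carlo estimate of the likelihood of a toy model.

    Assumed toy model (hypothetical, for illustration only):
    y_i = theta + z_i + eps_i with z_i ~ N(0, 1) and eps_i ~ N(0, 1).
    The latent z_i is integrated out by averaging over fresh draws;
    independent draws per observation keep the product unbiased.
    """
    estimate = 1.0
    for y in data:
        total = 0.0
        for _ in range(n_draws):
            z = rng.gauss(0.0, 1.0)
            total += normal_pdf(y, theta + z, 1.0)
        estimate *= total / n_draws
    return estimate


def pseudo_marginal_mh(data, n_iter=5000, n_draws=20, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings using the estimated likelihood."""
    rng = random.Random(seed)
    theta = 0.0
    like_hat = likelihood_estimate(theta, data, n_draws, rng)
    samples = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        like_prop = likelihood_estimate(proposal, data, n_draws, rng)
        # Assumed prior: theta ~ N(0, 10^2). The symmetric proposal cancels.
        ratio = (like_prop * normal_pdf(proposal, 0.0, 100.0)) / (
            like_hat * normal_pdf(theta, 0.0, 100.0)
        )
        if rng.random() < ratio:
            # Keep the accepted estimate; re-estimating the current state
            # at every iteration would break the pseudo-marginal argument.
            theta, like_hat = proposal, like_prop
        samples.append(theta)
    return samples
```

In this toy model the latent variable can be integrated out analytically (each observation is marginally N(theta, 2)), so the sampler's output can be checked against the exact posterior; that check is the only reason this particular model was chosen.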
(2022) Balnozan, Igor. Thesis.
This thesis explores the development and novel application of linear panel data methods that use latent grouping variables in the modelling of time-varying unobservable heterogeneity. The methods are tailored for use in microeconomic applications with observational panel data by: a) controlling for individual-specific intercepts; b) focusing on the economic interpretability of the time-varying heterogeneity component of the models; c) addressing the problem of estimating unknown group memberships across an unknown number of latent groups. The most general model studied also allows for latent group structures in the partial effects of observed covariates, where groups in the covariate effects can be independent of groups in the unobservable heterogeneity. Classical and Bayesian statistical methodologies are considered, with the main methodological contributions being in the development of Bayesian approaches. For the kinds of applications studied, the Bayesian methods are shown to have more favourable properties, both in principle and in practice. Empirical applications to retirement decumulation and smoking policy in Australia demonstrate how the methods developed in this thesis may be used to learn about economically meaningful latent behavioural patterns across a range of applications.
Pairwise versus mutual independence: visualisation, actuarial applications and central limit theorems
(2023) Boglioni Beaulieu, Guillaume. Thesis.
Accurately capturing the dependence between risks, if it exists, is an increasingly relevant topic of actuarial research. In recent years, several authors have started to relax the traditional 'independence assumption' in a variety of actuarial settings. While it is known that 'mutual independence' between random variables is not equivalent to their 'pairwise independence', this thesis aims to provide a better understanding of the materiality of this difference. The distinction between mutual and pairwise independence matters because, in practice, dependence is often assessed via pairs only, e.g., through correlation matrices, rank-based measures of association, scatterplot matrices, heat-maps, etc. Using such pairwise methods, it is possible to miss some forms of dependence. In this thesis, we explore from several angles how material the difference between pairwise and mutual independence is. We provide relevant background and motivation for this thesis in Chapter 1, then conduct a literature review in Chapter 2. In Chapter 3, we focus on visualising the difference between pairwise and mutual independence. To do so, we propose a series of theoretical examples (some of them new) where random variables are pairwise independent but (mutually) dependent, in short, PIBD. We then develop new visualisation tools and use them to illustrate what PIBD variables can look like. We show that the dependence involved can be very strong. We also use our visualisation tools to identify subtle forms of dependence, which would otherwise be hard to detect. In Chapter 4, we review common dependence models (such as elliptical distributions and Archimedean copulas) used in actuarial science and show that they do not allow for the possibility of PIBD data.
We also investigate concrete consequences of the 'nonequivalence' between pairwise and mutual independence. We establish that many results which hold for mutually independent variables do not hold under pairwise independence alone. Those include results about finite sums of random variables, extreme value theory and bootstrap methods. This part thus illustrates what can potentially 'go wrong' if one assumes mutual independence where only pairwise independence holds. Lastly, in Chapters 5 and 6, we investigate the question of what happens for PIBD variables 'in the limit', i.e., when the sample size goes to infinity. We want to see if the 'problems' caused by dependence vanish for sufficiently large samples. This is a broad question, and we concentrate on the important classical Central Limit Theorem (CLT), for which we find that the answer is largely negative. In particular, we construct new sequences of PIBD variables (with arbitrary margins) for which a CLT does not hold. We derive explicitly the asymptotic distribution of the standardised mean of our sequences, which allows us to illustrate the extent of the 'failure' of a CLT for PIBD variables. We also propose a general methodology to construct dependent K-tuplewise independent (K an arbitrary integer) sequences of random variables with arbitrary margins. In the case K = 3, we use this methodology to derive explicit examples of triplewise independent sequences for which no CLT holds. Those results illustrate that mutual independence is a crucial assumption within CLTs, and that having larger samples is not always a viable solution to the problem of non-independent data.
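To make the notion of pairwise independent but dependent (PIBD) variables concrete, the sketch below checks a well-known textbook construction: two independent fair bits X and Y together with Z = X XOR Y. This example is an assumption chosen for illustration and is not claimed to be one of the thesis's own examples; the helper names (`prob_of`, `pairwise_independent`, `mutually_independent`) are hypothetical. Probabilities are computed exactly by enumerating the four equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# Sample space: X and Y are independent fair bits, Z = X XOR Y.
# Each of the four (x, y) pairs has probability 1/4.
OUTCOMES = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]
WEIGHT = Fraction(1, len(OUTCOMES))


def prob_of(event):
    """Exact probability of an event given as a predicate on (x, y, z)."""
    return sum(WEIGHT for outcome in OUTCOMES if event(outcome))


def pairwise_independent():
    """Check P(A_i = a, A_j = b) = P(A_i = a) P(A_j = b) for every pair."""
    for i, j in ((0, 1), (0, 2), (1, 2)):
        for a, b in product((0, 1), repeat=2):
            joint = prob_of(lambda o: o[i] == a and o[j] == b)
            marginals = prob_of(lambda o: o[i] == a) * prob_of(lambda o: o[j] == b)
            if joint != marginals:
                return False
    return True


def mutually_independent():
    """Check that the joint law of (X, Y, Z) factorises into its margins."""
    for a, b, c in product((0, 1), repeat=3):
        joint = prob_of(lambda o: o == (a, b, c))
        marginals = (
            prob_of(lambda o: o[0] == a)
            * prob_of(lambda o: o[1] == b)
            * prob_of(lambda o: o[2] == c)
        )
        if joint != marginals:
            return False
    return True
```

Every pair passes the factorisation check, yet the triple does not: for instance, the outcome (0, 0, 1) has probability 0 rather than 1/8, because Z is fully determined by X and Y. A correlation matrix or scatterplot matrix of these three variables would show no dependence at all.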