Bootstrapping linear models, and its applications.

dc.contributor.advisor Warton, David en_US
dc.contributor.advisor Galbraith, Sally en_US
dc.contributor.author Tsang, Lester Hing Fung en_US
dc.date.accessioned 2022-03-23T18:49:58Z
dc.date.available 2022-03-23T18:49:58Z
dc.date.issued 2011 en_US
dc.description.abstract The bootstrap is a computationally intensive data analysis technique. It is particularly useful for analysing small datasets, and for estimating the sampling distribution of a statistic when it is intractable. We focus on bootstrap hypothesis testing of linear models. In this context, at present, various versions of the bootstrap are available, and it is not entirely clear from the literature which method is optimal for each situation. The existing literature on bootstrapping linear models was reviewed, and three "rules" were found in the literature. We confirmed these via simulation. We also identified two outstanding issues. Firstly, which variance estimator should be used when constructing a bootstrap test statistic? Secondly, if resampling residuals, should this be done using the model that was fitted under the null hypothesis ("null model") or under the alternative hypothesis ("full model")? To our knowledge, these two questions have not been previously addressed. We provided theoretical results to answer these questions, and subsequently confirmed these via simulation. Our simulations were designed to evaluate both the size and (size-adjusted) power characteristics of the proposed bootstrap schemes. We proposed the use of a sandwich variance estimator for case and score resampling, rather than the naive statistic that is commonly used in practice. Via simulation, we showed that bootstrap test statistics using the sandwich estimator tend to have superior Type I error for case and score resampling, but there was still an issue of which estimator (naive or sandwich) to use for the observed test statistic (t). Best results were achieved when using t-naive for score resampling and t-sandwich for case resampling. One possible explanation for this result is that score resampling conditions on X whereas case resampling does not, and instead treats X as random. We also studied full versus null model residual resampling.
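The case-resampling scheme with a sandwich-based statistic that the abstract recommends can be sketched as follows. This is a minimal illustration, not the thesis's own simulation code: the data-generating process, variable names, and the choice of a simple heteroskedastic model are all assumptions made for the example.

```python
# Sketch: case (pairs) resampling with a sandwich (robust) variance t statistic.
# Illustrative only; not the thesis's simulation design.
import numpy as np

rng = np.random.default_rng(1)

def fit_ols(X, y):
    """Ordinary least squares fit; returns coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def sandwich_cov(X, e):
    """Heteroskedasticity-robust covariance (X'X)^-1 X' diag(e^2) X (X'X)^-1."""
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * (e ** 2)[:, None])
    return bread @ meat @ bread

# Simulated data with heteroskedastic errors (an assumption for illustration)
n = 60
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n) * (1 + np.abs(x))
X = np.column_stack([np.ones(n), x])

beta_hat, e_hat = fit_ols(X, y)
t_obs = beta_hat[1] / np.sqrt(sandwich_cov(X, e_hat)[1, 1])

# Case resampling: draw (x_i, y_i) pairs with replacement; centre each
# bootstrap statistic at the observed slope so it reflects the null.
B = 999
t_star = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    Xb, yb = X[idx], y[idx]
    bb, eb = fit_ols(Xb, yb)
    t_star[b] = (bb[1] - beta_hat[1]) / np.sqrt(sandwich_cov(Xb, eb)[1, 1])

# Two-sided bootstrap p-value
p = (1 + np.sum(np.abs(t_star) >= np.abs(t_obs))) / (B + 1)
print(round(p, 3))
```

Treating X as random, as case resampling does, is the feature the abstract suggests may explain why t-sandwich pairs well with it.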
We showed that null model resampling has better Type I error in theory, having an asymptotic correlation of one with a "true bootstrap" procedure, analogous to a result derived in the permutation testing case by Anderson and Robinson (2001). However, in practice this superiority holds only for non-pivotal statistics: for pivotal statistics, both null and full model resampling had accurate Type I error, a discrepancy we were able to explain theoretically. en_US
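The null-versus-full model comparison above can be sketched in code. This is a hedged illustration under assumed settings (a simple regression generated under H0, a classical pivotal t statistic); it reproduces the two resampling schemes, not the thesis's simulations.

```python
# Sketch: residual resampling under the null model (intercept only) versus
# the full model (intercept + slope) for testing H0: beta1 = 0.
# All names and settings are illustrative.
import numpy as np

rng = np.random.default_rng(2)

def slope_t(x, y):
    """Classical t statistic for the slope (a pivotal statistic)."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    cov = (e @ e / (n - 2)) * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

n = 50
x = rng.normal(size=n)
y = rng.normal(size=n)  # data generated under the null hypothesis
t_obs = slope_t(x, y)

def boot_p(fitted, resid, B=999):
    """Resample residuals around the given fitted values; two-sided p-value."""
    t_star = np.array([slope_t(x, fitted + rng.choice(resid, n, replace=True))
                       for _ in range(B)])
    return (1 + np.sum(np.abs(t_star) >= np.abs(t_obs))) / (B + 1)

# Null model: intercept only.
null_fit = np.full(n, y.mean())
# Full model: intercept + slope.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
full_fit = X @ beta

p_null = boot_p(null_fit, y - null_fit)
p_full = boot_p(full_fit, y - full_fit)
print(round(p_null, 3), round(p_full, 3))
```

Because the t statistic here is pivotal, the two p-values should behave similarly, consistent with the abstract's finding that the null model's advantage appears only for non-pivotal statistics.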
dc.language English
dc.language.iso EN en_US
dc.publisher UNSW, Sydney en_US
dc.rights CC BY-NC-ND 3.0 en_US
dc.rights.uri en_US
dc.subject.other Linear models en_US
dc.subject.other Bootstrap en_US
dc.subject.other Hypothesis testing en_US
dc.title Bootstrapping linear models, and its applications. en_US
dc.type Thesis en_US
dcterms.accessRights open access
dcterms.rightsHolder Tsang, Lester Hing Fung
dspace.entity.type Publication en_US
unsw.relation.faculty Science
unsw.relation.originalPublicationAffiliation Tsang, Lester Hing Fung, Mathematics & Statistics, Faculty of Science, UNSW en_US
unsw.relation.originalPublicationAffiliation Warton, David, Mathematics & Statistics, Faculty of Science, UNSW en_US
unsw.relation.originalPublicationAffiliation Galbraith, Sally, Mathematics & Statistics, Faculty of Science, UNSW en_US
unsw.thesis.degreetype Masters Thesis en_US