Bootstrapping linear models, and its applications.

Access & Terms of Use: open access
Copyright: Tsang, Lester Hing Fung
The bootstrap is a computationally intensive data analysis technique. It is particularly useful for analysing small datasets, and for estimating the sampling distribution of a statistic when it is intractable. We focus on bootstrap hypothesis testing of linear models. In this context, various versions of the bootstrap are currently available, and it is not entirely clear from the literature which method is optimal in each situation. We reviewed the existing literature on bootstrapping linear models, identified three "rules" in it, and confirmed these via simulation. We also identified two outstanding issues. Firstly, which variance estimator should be used when constructing a bootstrap test statistic? Secondly, if resampling residuals, should this be done using the model fitted under the null hypothesis (the "null model") or under the alternative hypothesis (the "full model")? To our knowledge, these two questions have not previously been addressed. We provided theoretical results to answer them, and subsequently confirmed these via simulation. Our simulations were designed to evaluate both the size and the (size-adjusted) power of the proposed bootstrap schemes.

We proposed using a sandwich variance estimator for case and score resampling, rather than the naive statistic commonly used in practice. Via simulation, we showed that bootstrap test statistics using the sandwich estimator tend to have superior Type I error for case and score resampling, but the question remains of which estimator (naive or sandwich) to use for the observed test statistic (t). Best results were achieved using t-naive for score resampling and t-sandwich for case resampling. One possible explanation is that score resampling conditions on X, whereas case resampling does not, instead treating X as random. We also studied full versus null model residual resampling.
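To make the case-resampling scheme concrete, the following is a minimal sketch (not the thesis's code) of a bootstrap test of a single slope coefficient, using an HC0 sandwich variance estimator for the bootstrap statistics and, as recommended above for case resampling, the sandwich-based statistic for the observed value as well. All names, the toy data, and the choice of B = 999 resamples are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_fit(X, y):
    """Least-squares fit; returns coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def var_naive(X, resid):
    """Naive OLS variance: sigma^2 * (X'X)^{-1}."""
    n, p = X.shape
    s2 = resid @ resid / (n - p)
    return s2 * np.linalg.inv(X.T @ X)

def var_sandwich(X, resid):
    """HC0 sandwich: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}."""
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (resid[:, None] ** 2 * X)
    return bread @ meat @ bread

# Toy data generated under H0: beta_1 = 0 in y = b0 + b1*x + error
n = 50
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + rng.normal(size=n)

beta, resid = ols_fit(X, y)
t_naive = beta[1] / np.sqrt(var_naive(X, resid)[1, 1])
t_sand = beta[1] / np.sqrt(var_sandwich(X, resid)[1, 1])

# Case resampling: resample (x_i, y_i) pairs, treating X as random
B = 999
t_star = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)
    Xb, yb = X[idx], y[idx]
    bb, rb = ols_fit(Xb, yb)
    # bootstrap statistic is centred at the observed estimate and
    # studentised with the sandwich estimator
    t_star[b] = (bb[1] - beta[1]) / np.sqrt(var_sandwich(Xb, rb)[1, 1])

# Two-sided bootstrap p-value, comparing against the observed t-sandwich
p = (1 + np.sum(np.abs(t_star) >= np.abs(t_sand))) / (B + 1)
print(round(p, 3))
```

For score resampling, the same skeleton applies, but X is held fixed and the scores (residual-weighted covariates) are perturbed; per the results above, the observed statistic would then use the naive estimator instead.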
We showed that null model resampling has better Type I error in theory, having an asymptotic correlation of one with a "true bootstrap" procedure, analogous to a result derived for permutation testing by Anderson and Robinson (2001). In practice, however, this superiority holds only for non-pivotal statistics: for pivotal statistics, both null and full model resampling had accurate Type I error, a discrepancy that we were able to explain theoretically.
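The null-versus-full distinction can be sketched as follows. In both variants the bootstrap responses are built from the null model's fitted values; the only difference is whether the resampled residuals come from the null or the full fit. This is an illustrative sketch under assumed toy data, not the thesis's implementation, and it uses a naive (non-pivotal-friendly) t statistic for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit(X, y):
    """Least-squares fit; returns coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def t_stat(X, y):
    """Naive t statistic for the last coefficient."""
    n, p = X.shape
    beta, resid = fit(X, y)
    s2 = resid @ resid / (n - p)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[-1, -1])
    return beta[-1] / se

# Toy data generated under H0: coefficient on x is zero
n = 40
x = rng.normal(size=n)
X0 = np.ones((n, 1))               # null model: intercept only
X1 = np.column_stack([X0, x])      # full model: intercept + x
y = 2.0 + rng.normal(size=n)

t_obs = t_stat(X1, y)

def resid_bootstrap(X_resid, B=999):
    """Residual bootstrap of the t statistic.
    X_resid selects which fitted model supplies the residuals."""
    beta0, _ = fit(X0, y)          # fitted values under the null
    _, resid = fit(X_resid, y)     # residuals from null or full model
    fitted0 = X0 @ beta0
    t_star = np.empty(B)
    for b in range(B):
        y_star = fitted0 + rng.choice(resid, size=n, replace=True)
        t_star[b] = t_stat(X1, y_star)
    return t_star

ps = {}
for label, Xr in [("null-model residuals", X0), ("full-model residuals", X1)]:
    t_star = resid_bootstrap(Xr)
    ps[label] = (1 + np.sum(np.abs(t_star) >= np.abs(t_obs))) / (len(t_star) + 1)
    print(label, round(ps[label], 3))
```

With a studentised (pivotal) statistic, as the results above note, the two variants give similarly accurate Type I error; the null-model version's theoretical advantage shows up for non-pivotal statistics such as the raw coefficient estimate.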
Tsang, Lester Hing Fung
Warton, David
Galbraith, Sally
Degree Type: Masters Thesis