Abstract
RNA splicing is a key regulatory mechanism required for correct processing of multi-exonic genes. Through alternative
splicing, it also enables diversification of information encoded by a single gene and acts as an additional layer of
regulatory control. Although technology and software now allow splicing to be quantified in concert with gene
expression, splicing is rarely investigated, due in part to the difficulty of interpreting differential splicing events. In
addition, splicing is rarely considered in clinical variant annotation pipelines, despite an estimated 15% of genetic
diseases being caused by alterations to splicing.
Chapter 2 addresses the current lack of annotation of a core splicing element, the branchpoint. We use experimental
annotations to develop a machine-learning model that expands annotation coverage from 17% to 85% of human
introns, and show that branchpoint identity and number are related to splicing patterns.
Chapter 3 addresses gene expression and splicing dynamics in multiple biological contexts. We show that splicing is
indeed a dynamically regulated process involved in the control of cellular responses, although it is more loosely controlled
than differential gene expression and affects a different subset of genes. This work highlighted the need for
interpretive tools to discern which events are capable of producing a functional change to gene products and to
characterise such changes. In Chapters 4 and 5 we developed methods to simulate the consequences of alternative
splicing events in silico and provide automated comparisons of transcript isoforms. Through application of these tools,
we showed that splicing affects different gene sets in different ways, and that these tools can aid in the interpretation of such
results.
Lastly, in Chapter 6, we use RNA-Sequencing to identify splicing variants of clinical relevance. Using methods developed
in the previous chapters, we identify genetic variants that fall at splice elements, quantify splicing to identify aberrant
splicing events, and characterise the effects these may have on transcripts, leading to the identification of a causal
variant in 1/5 cases.
Together, the work presented in this thesis represents a significant advance in the way that splicing is investigated, and
illustrates the importance of exploring splicing patterns to better understand development and disease.