Show simple item record


Item metadata

dc.contributor.advisor: Lynch, Andy G.
dc.contributor.advisor: Papathomas, Michail
dc.contributor.author: Velasco Pardo, Víctor
dc.coverage.spatial: 162 [en_US]
dc.date.accessioned: 2024-05-13T13:40:48Z
dc.date.available: 2024-05-13T13:40:48Z
dc.date.issued: 2024-06-11
dc.identifier.uri: https://hdl.handle.net/10023/29876
dc.description.abstract: Cancer is a disease driven and characterised by mutations in the DNA. Thanks to massively parallel sequencing technologies, it is now possible to obtain the sequence of a cancer genome. The advent of modern sequencing technologies has allowed researchers to study the mutations involved in tumour development. More recently, attention has been drawn to the 'passenger' mutations that are not involved in tumour development but bear fingerprints of the mutational processes that have been operative over a patient's lifetime. Those fingerprints, termed mutational signatures, appear consistently across cancer genomes that have been exposed to the underlying mutational processes. Computational analyses have identified over a hundred such signatures, and it is now possible to estimate the relative prevalence of mutational signatures in a cancer genome. Both types of analyses are perhaps unique in the medical literature, in that no confidence intervals or other representations of uncertainty are demanded when reporting the results. In this thesis, we address the problem of quantifying uncertainty around the reported mutational signatures and their relative prevalence in individual tumours. First, in Chapter 2, we review the available computational methods for mutational signature analyses, assessing the potential of existing approaches to characterise uncertainty. Then, in Chapter 3, we annotate ten statistical challenges. The remainder of the thesis is built on the aim of addressing some of those challenges. To estimate the relative prevalence of mutational signatures in individual tumours, a method that quantifies the uncertainty around the estimated solution is lacking. Moreover, those analyses assume that the true values for the signatures are 'known' as they are propagated from previous analyses. In Chapter 4, we suggest a setting where the signatures are 'partially known'. We propose a novel approach for this problem, in a Bayesian setting, providing credible intervals around the estimated solution, propagating prior uncertainty regarding 'partially known' signatures, and updating prior beliefs about them. Estimation of mutational signatures is often performed in a matrix factorisation setting that is not fully probabilistic. While an alternative fully probabilistic approach is available, a post-processing method is needed to characterise the uncertainty around the reported solution. In Chapter 5, we introduce a novel post-processing approach to quantify uncertainty around the mutational signatures estimated in a cohort of cancer patients, along with software that allows investigators to use the proposed method and visualise results. [en_US]
dc.language.iso: en [en_US]
dc.subject: Cancer genomics [en_US]
dc.subject: Mutational signatures [en_US]
dc.subject: Bioinformatics [en_US]
dc.subject: Biostatistics [en_US]
dc.subject: Bayesian statistics [en_US]
dc.title: Statistical underpinning of mutational signature analyses of cancer sequencing data [en_US]
dc.type: Thesis [en_US]
dc.contributor.sponsor: Melville Trust [en_US]
dc.type.qualificationlevel: Doctoral [en_US]
dc.type.qualificationname: PhD Doctor of Philosophy [en_US]
dc.publisher.institution: The University of St Andrews [en_US]
dc.rights.embargodate: 2025-05-08
dc.rights.embargoreason: Thesis restricted in accordance with University regulations. Restricted until 8 May 2025 [en]
dc.identifier.doi: https://doi.org/10.17630/sta/912


