Modelling and inferring neutral and non-neutral forces shaping genomic site frequencies
Date
28/11/2023
Grant ID
10.47379/MA16061 (WWTF)
Abstract
Single nucleotide polymorphisms in samples of DNA sequences from one or multiple populations can be summarised as site frequency spectra. Since polymorphic sites are known to be predominantly biallelic, models for the evolution of allele frequencies that assume low scaled mutation rates are justified. The biallelic boundary-mutation Moran model with reversible mutations (BMM) arises as an approximation to the classic Moran model under this consideration, and it underpins this PhD thesis.
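As an illustration of the summary described above, the following is a minimal sketch of how a site frequency spectrum might be tabulated from biallelic sample data. The function name `site_frequency_spectrum` and the 0/1 allele encoding are assumptions made for this example and are not taken from the thesis.

```python
def site_frequency_spectrum(sites):
    """Tabulate a site frequency spectrum from biallelic sites.

    `sites` is a list of sites; each site is a list of 0/1 alleles,
    one per sampled sequence (hypothetical encoding: 1 marks the focal allele).
    Returns a list `sfs` of length n + 1, where sfs[k] is the number of
    sites at which the focal allele appears in exactly k of the n sequences.
    Entries 0 and n count sites that are monomorphic in the sample.
    """
    if not sites:
        return []
    n = len(sites[0])
    sfs = [0] * (n + 1)
    for site in sites:
        if len(site) != n:
            raise ValueError("all sites must have the same sample size")
        if any(allele not in (0, 1) for allele in site):
            raise ValueError("sites must be biallelic with 0/1 encoding")
        sfs[sum(site)] += 1
    return sfs


# Five sites observed in a sample of four sequences.
example_sites = [
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
example_sfs = site_frequency_spectrum(example_sites)
print(example_sfs)
```

In this sketch the monomorphic classes are kept alongside the polymorphic ones, so the spectrum has one entry for each possible allele count from 0 to the sample size.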
In the introduction, the BMM is presented as a mathematically tractable model that is efficient in its use of site frequency data for inferring mutation and selection parameters.
Chapter 2 of this thesis extends the BMM to include balancing selection, in addition to biased mutations and a directional component (e.g., directional selection or biased gene conversion).
In Chapter 3, discrete and stochastic demographic changes are incorporated into the spectral representation of the neutral BMM. A Hidden Markov Model-inspired approach is used to simulate sample spectra under different scenarios and to propose a new inference method.
A novel class of Hidden Markov Models with ordered hidden states and emission densities (oHMMed) is introduced in Chapter 4 alongside the source code of a corresponding R-package.
In Chapter 5, oHMMed is used to annotate the genome of orangutans according to average levels of GC content and recombination rates. Site frequency spectra of similar regions are subjected to Markov Chain Monte Carlo analyses based on the BMM, and to demographic inference per Chapter 3. They are further characterised by structural genomic features. Overall, this provides a quantification of how biased gene conversion and recombination shape the background variation in hominid site frequency data.
Utilised conjointly, the methods developed in this thesis could help inform an extended null model of evolution, and improve genome scans.
Type
Thesis, PhD Doctor of Philosophy
Rights
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/
Embargo Date: 2025-06-22
Embargo Reason: Thesis restricted in accordance with University regulations. Restricted until 22nd June 2025
Except where otherwise noted within the work, this item's licence for re-use is described as Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.