Consistency and identifiability of the polymorphism-aware phylogenetic models
Polymorphism-aware phylogenetic models (PoMo) constitute an alternative approach for species tree estimation from genome-wide data. PoMo builds on the standard substitution models of DNA evolution but expands the classic alphabet of the four nucleotide bases to include polymorphic states. By doing so, PoMo accounts for ancestral and current intra-population variation, while also accommodating the population-level processes that govern the substitution process (e.g. genetic drift, mutations, allelic selection). PoMo has been shown to be a valuable tool in several phylogenetic applications, but a proof of statistical consistency (and of identifiability, a necessary condition for consistency) is lacking. Here, we prove that PoMo is identifiable and, using this result, we further show that the maximum a posteriori (MAP) tree estimator of PoMo is a consistent estimator of the species tree. We complement our theoretical results with a simulated data set mimicking the diversity observed in natural populations exhibiting incomplete lineage sorting. We implemented PoMo in a Bayesian framework and show that the MAP tree easily recovers the true tree for the numbers of sites typically sampled in genome-wide analyses.
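To make the expanded alphabet concrete, the following is a minimal sketch of how such a state space could be enumerated. It assumes the usual PoMo formulation with a virtual population of size `n` and at most two alleles per polymorphic state; the abstract itself does not give these details, and the function name `pomo_states` and the tuple encoding are hypothetical choices made for illustration only.

```python
from itertools import combinations

NUCLEOTIDES = ("A", "C", "G", "T")


def pomo_states(n):
    """Enumerate the states of a PoMo-style alphabet.

    Assumption (not stated in the abstract): a virtual population of
    size n with at most two alleles segregating at a site. Each state
    is a tuple of (allele, count) pairs whose counts sum to n.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to allow polymorphic states")

    # Fixed (monomorphic) states: all n individuals carry the same base.
    fixed = [((a, n),) for a in NUCLEOTIDES]

    # Polymorphic states: i copies of allele a and n - i copies of allele b,
    # for every unordered pair of bases and every i in 1..n-1.
    polymorphic = [
        ((a, i), (b, n - i))
        for a, b in combinations(NUCLEOTIDES, 2)
        for i in range(1, n)
    ]
    return fixed + polymorphic


if __name__ == "__main__":
    states = pomo_states(10)
    # 4 fixed states plus 6 allele pairs times (n - 1) frequency classes.
    print(len(states))
```

Under these assumptions the alphabet has 4 + 6(n - 1) states, so it grows linearly with the virtual population size instead of staying fixed at the four nucleotide bases.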
Borges, R & Kosiol, C 2020, 'Consistency and identifiability of the polymorphism-aware phylogenetic models', Journal of Theoretical Biology, vol. 486, 110074, pp. 1-6. https://doi.org/10.1016/j.jtbi.2019.110074
Journal of Theoretical Biology
Copyright © 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license. (http://creativecommons.org/licenses/by/4.0/)
Funding: Vienna Science and Technology Fund (WWTF) [MA16-061].