Consistency and identifiability of the polymorphism-aware phylogenetic models
Abstract
Polymorphism-aware phylogenetic models (PoMo) constitute an alternative approach for species tree estimation from genome-wide data. PoMo builds on the standard substitution models of DNA evolution but expands the classic alphabet of the four nucleotide bases to include polymorphic states. By doing so, PoMo accounts for ancestral and current intra-population variation, while also accommodating the population-level processes that govern the substitution process (e.g. genetic drift, mutations, allelic selection). PoMo has proven to be a valuable tool in several phylogenetic applications, but a proof of statistical consistency (and of identifiability, a necessary condition for consistency) has been lacking. Here, we prove that PoMo is identifiable and, using this result, we further show that the maximum a posteriori (MAP) tree estimator of PoMo is a consistent estimator of the species tree. We complement our theoretical results with a simulated data set mimicking the diversity observed in natural populations exhibiting incomplete lineage sorting. We implemented PoMo in a Bayesian framework and show that the MAP tree easily recovers the true tree for the numbers of sites typically sampled in genome-wide analyses.
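To make the expanded alphabet concrete, the following minimal sketch enumerates a PoMo-style state space. It assumes a virtual population size n (a parameter not stated in the abstract) and biallelic polymorphic states, which gives 4 fixed states plus 6(n - 1) polymorphic ones. The function name pomo_states and the tuple encoding are hypothetical choices made for illustration, not the authors' implementation.

```python
from itertools import combinations

NUCLEOTIDES = ("A", "C", "G", "T")


def pomo_states(n):
    """Enumerate PoMo-style states for an assumed virtual population size n.

    Each state is a tuple of (allele, count) pairs whose counts sum to n.
    Fixed states carry one allele; polymorphic states carry two.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to allow polymorphic states")

    # Fixed (monomorphic) states: all n individuals carry the same allele.
    states = [((a, n),) for a in NUCLEOTIDES]

    # Polymorphic states: i copies of allele a and n - i copies of allele b,
    # for each of the 6 unordered allele pairs and each i in 1..n-1.
    for a, b in combinations(NUCLEOTIDES, 2):
        for i in range(1, n):
            states.append(((a, i), (b, n - i)))

    return states


states = pomo_states(10)
print(len(states))  # 4 + 6 * 9 = 58
```

With n = 10 the alphabet grows from 4 to 58 states, which is the sense in which the classic nucleotide alphabet is expanded to include polymorphic states.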
Citation
Borges, R & Kosiol, C 2020, 'Consistency and identifiability of the polymorphism-aware phylogenetic models', Journal of Theoretical Biology, vol. 486, 110074, pp. 1-6. https://doi.org/10.1016/j.jtbi.2019.110074
Publication
Journal of Theoretical Biology
Status
Peer reviewed
ISSN
0022-5193
Type
Journal article
Rights
Copyright © 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license. (http://creativecommons.org/licenses/by/4.0/)
Description
Funding: Vienna Science and Technology Fund (WWTF) [MA16-061].