
Item metadata

dc.contributor.author: De Maio, Nicola
dc.contributor.author: Holmes, Ian
dc.contributor.author: Schlötterer, Christian
dc.contributor.author: Kosiol, Carolin
dc.date.accessioned: 2017-02-09T15:30:11Z
dc.date.available: 2017-02-09T15:30:11Z
dc.date.issued: 2013-03
dc.identifier.citation: De Maio, N, Holmes, I, Schlötterer, C & Kosiol, C 2013, 'Estimating empirical codon hidden Markov models', Molecular Biology and Evolution, vol. 30, no. 3, pp. 725-36. https://doi.org/10.1093/molbev/mss266 [en]
dc.identifier.issn: 0737-4038
dc.identifier.other: PURE: 249098811
dc.identifier.other: PURE UUID: 8ed5edc4-9c44-4fb4-922a-76f428c8bf74
dc.identifier.other: PubMed: 23188590
dc.identifier.other: PubMedCentral: PMC3563974
dc.identifier.other: Scopus: 84873573348
dc.identifier.uri: https://hdl.handle.net/10023/10262
dc.description.abstract: Empirical codon models (ECMs) estimated from a large number of globular protein families outperformed mechanistic codon models in their description of the general process of protein evolution. Among other factors, ECMs implicitly model the influence of amino acid properties and multiple nucleotide substitutions (MNS). However, the estimation of ECMs requires large quantities of data, and until recently, only few suitable data sets were available. Here, we take advantage of several new Drosophila species genomes to estimate codon models from genome-wide data. The availability of large numbers of genomes over varying phylogenetic depths in the Drosophila genus allows us to explore various divergence levels. In consequence, we can use these data to determine the appropriate level of divergence for the estimation of ECMs, avoiding overestimation of MNS rates caused by saturation. To account for variation in evolutionary rates along the genome, we develop new empirical codon hidden Markov models (ecHMMs). These models significantly outperform previous ones with respect to maximum likelihood values, suggesting that they provide a better fit to the evolutionary process. Using ECMs and ecHMMs derived from genome-wide data sets, we devise new likelihood ratio tests (LRTs) of positive selection. We found classical LRTs very sensitive to the presence of MNSs, showing high false-positive rates, especially with small phylogenies. The new LRTs are more conservative than the classical ones, having acceptable false-positive rates and reduced power.
dc.format.extent: 12
dc.language.iso: eng
dc.relation.ispartof: Molecular Biology and Evolution [en]
dc.rights: © The Author(s) 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. [en]
dc.subject: Empirical codon model [en]
dc.subject: Rate heterogeneity [en]
dc.subject: Hidden Markov models [en]
dc.subject: Positive selection [en]
dc.subject: Drosophila substitution patterns [en]
dc.subject: QH301 Biology [en]
dc.subject: QH426 Genetics [en]
dc.subject.lcc: QH301 [en]
dc.subject.lcc: QH426 [en]
dc.title: Estimating empirical codon hidden Markov models [en]
dc.type: Journal article [en]
dc.description.version: Publisher PDF [en]
dc.contributor.institution: University of St Andrews. School of Biology [en]
dc.contributor.institution: University of St Andrews. Centre for Biological Diversity [en]
dc.identifier.doi: https://doi.org/10.1093/molbev/mss266
dc.description.status: Peer reviewed [en]

