Item metadata

dc.contributor.author: De Ferrari, L.
dc.contributor.author: Mitchell, J. B. O.
dc.date.accessioned: 2014-06-10T11:01:01Z
dc.date.available: 2014-06-10T11:01:01Z
dc.date.issued: 2014-05-19
dc.identifier.citation: De Ferrari, L & Mitchell, J B O 2014, 'From sequence to enzyme mechanism using multi-label machine learning', BMC Bioinformatics. https://doi.org/10.1186/1471-2105-15-150 [en]
dc.identifier.issn: 1471-2105
dc.identifier.other: PURE: 126448048
dc.identifier.other: PURE UUID: 0ab8c2aa-bebd-4c6a-87da-02cdc624c8a5
dc.identifier.other: Scopus: 84902078914
dc.identifier.other: ORCID: /0000-0002-0379-6097/work/34033391
dc.identifier.other: WOS: 000336938100001
dc.identifier.uri: https://hdl.handle.net/10023/4868
dc.description.abstract: Background: In this work we predict enzyme function at the level of chemical mechanism, providing a finer granularity of annotation than traditional Enzyme Commission (EC) classes. Hence we can predict not only whether a putative enzyme in a newly sequenced organism has the potential to perform a certain reaction, but how the reaction is performed, using which cofactors and with susceptibility to which drugs or inhibitors, details with important consequences for drug and enzyme design. Work that predicts enzyme catalytic activity based on 3D protein structure features limits the prediction of mechanism to proteins already having either a solved structure or a close relative suitable for homology modelling. Results: In this study, we evaluate whether sequence identity, InterPro or Catalytic Site Atlas sequence signatures provide enough information for bulk prediction of enzyme mechanism. By splitting MACiE (Mechanism, Annotation and Classification in Enzymes database) mechanism labels to a finer granularity, which includes the role of the protein chain in the overall enzyme complex, the method can predict at 96% accuracy (and 96% micro-averaged precision, 99.9% macro-averaged recall) the MACiE mechanism definitions of 248 proteins available in the MACiE, EzCatDb (Database of Enzyme Catalytic Mechanisms) and SFLD (Structure Function Linkage Database) databases using an off-the-shelf K-Nearest Neighbours multi-label algorithm. Conclusion: We find that InterPro signatures are critical for accurate prediction of enzyme mechanism. We also find that incorporating Catalytic Site Atlas attributes does not seem to provide additional accuracy. The software code (ml2db), data and results are available online at http://sourceforge.net/projects/ml2db/ and as supplementary files.
dc.format.extent: 13
dc.language.iso: eng
dc.relation.ispartof: BMC Bioinformatics [en]
dc.rights: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. [en]
dc.subject: QH301 Biology [en]
dc.subject: DAS [en]
dc.subject.lcc: QH301 [en]
dc.title: From sequence to enzyme mechanism using multi-label machine learning [en]
dc.type: Journal article [en]
dc.contributor.sponsor: BBSRC [en]
dc.description.version: Publisher PDF [en]
dc.contributor.institution: University of St Andrews. School of Chemistry [en]
dc.contributor.institution: University of St Andrews. Biomedical Sciences Research Complex [en]
dc.contributor.institution: University of St Andrews. EaSTCHEM [en]
dc.identifier.doi: https://doi.org/10.1186/1471-2105-15-150
dc.description.status: Peer reviewed [en]
dc.identifier.grantnumber: BB/I00596X/1 [en]
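
Illustrative note on the abstract: the record reports results from a K-Nearest Neighbours multi-label classifier, scored with micro-averaged precision and macro-averaged recall. The sketch below is not the authors' ml2db code. It is a minimal pure-Python illustration under several assumptions: proteins are described by sets of sequence signatures, similarity is Jaccard overlap, and a label is assigned when more than half of the k nearest neighbours carry it. All signature and mechanism names (sigA, mech1, and so on) are hypothetical placeholders, not real InterPro or MACiE identifiers.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets of signature identifiers."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_multilabel(train, query, k=3):
    """Predict a label set by majority vote among the k most similar items.

    train is a list of (signature_set, label_set) pairs.
    """
    neighbours = sorted(
        train, key=lambda item: jaccard(item[0], query), reverse=True
    )[:k]
    votes = Counter(label for _, labels in neighbours for label in labels)
    # Keep a label only if more than half of the neighbours carry it.
    return {label for label, n in votes.items() if n * 2 > len(neighbours)}


def micro_precision(true_sets, pred_sets):
    """Pool all (item, label) predictions, then compute one precision value."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    predicted = sum(len(p) for p in pred_sets)
    return tp / predicted if predicted else 0.0


def macro_recall(true_sets, pred_sets):
    """Compute recall separately for each label, then take the plain mean."""
    labels = set().union(*true_sets)
    recalls = []
    for label in sorted(labels):
        relevant = sum(1 for t in true_sets if label in t)
        hit = sum(
            1 for t, p in zip(true_sets, pred_sets) if label in t and label in p
        )
        recalls.append(hit / relevant)
    return sum(recalls) / len(recalls) if recalls else 0.0


# Hypothetical toy data: (signature set, mechanism label set).
train = [
    ({"sigA", "sigB"}, {"mech1"}),
    ({"sigA", "sigB", "sigC"}, {"mech1", "mech2"}),
    ({"sigA", "sigC"}, {"mech1", "mech2"}),
    ({"sigD"}, {"mech3"}),
    ({"sigD", "sigE"}, {"mech3"}),
    ({"sigE"}, {"mech3"}),
]
queries = [{"sigA", "sigB", "sigC"}, {"sigD", "sigE"}]
truth = [{"mech1", "mech2"}, {"mech3"}]

predictions = [knn_multilabel(train, q, k=3) for q in queries]
print(predictions)
print(micro_precision(truth, predictions), macro_recall(truth, predictions))
```

Micro-averaging pools every predicted label before dividing, so frequent labels dominate the score. Macro-averaging gives each label equal weight, which is why the two figures quoted in the abstract can differ.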

