Item metadata

dc.contributor.author: Nath, Neetika
dc.contributor.author: Mitchell, John B. O.
dc.date.accessioned: 2012-07-02T12:01:01Z
dc.date.available: 2012-07-02T12:01:01Z
dc.date.issued: 2012-04-24
dc.identifier.citation: Nath, N & Mitchell, J B O 2012, 'Is EC class predictable from reaction mechanism?', BMC Bioinformatics, vol. 13, 60. https://doi.org/10.1186/1471-2105-13-60 [en]
dc.identifier.issn: 1471-2105
dc.identifier.other: PURE: 23410427
dc.identifier.other: PURE UUID: 7afeb5a7-50d5-4c7b-8d30-fedc251e946f
dc.identifier.other: WOS: 000304912700001
dc.identifier.other: Scopus: 84859931913
dc.identifier.other: ORCID: /0000-0002-0379-6097/work/34033400
dc.identifier.uri: https://hdl.handle.net/10023/2883
dc.description: We thank the Scottish Universities Life Sciences Alliance (SULSA) and the Scottish Overseas Research Student Awards Scheme of the Scottish Funding Council (SFC) for financial support. [en]
dc.description.abstract: Background: We investigate the relationships between the EC (Enzyme Commission) class, the associated chemical reaction, and the reaction mechanism by building predictive models using Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbours (kNN). We consider two ways of encoding the reaction mechanism in descriptors, and also three approaches that encode only the overall chemical reaction. Both cross-validation and also an external test set are used. Results: The three descriptor sets encoding overall chemical transformation perform better than the two descriptions of mechanism. SVM and RF models perform comparably well; kNN is less successful. Oxidoreductases and hydrolases are relatively well predicted by all types of descriptor; isomerases are well predicted by overall reaction descriptors but not by mechanistic ones. Conclusions: Our results suggest that pairs of similar enzyme reactions tend to proceed by different mechanisms. Oxidoreductases, hydrolases, and to some extent isomerases and ligases, have clear chemical signatures, making them easier to predict than transferases and lyases. We find evidence that isomerases as a class are notably mechanistically diverse and that their one shared property, of substrate and product being isomers, can arise in various unrelated ways. The performance of the different machine learning algorithms is in line with many cheminformatics applications, with SVM and RF being roughly equally effective. kNN is less successful, given the role that non-local information plays in successful classification. We note also that, despite a lack of clarity in the literature, EC number prediction is not a single problem; the challenge of predicting protein function from available sequence data is quite different from assigning an EC classification from a cheminformatics representation of a reaction.
dc.format.extent: 13
dc.language.iso: eng
dc.relation.ispartof: BMC Bioinformatics [en]
dc.rights: © 2012 Nath and Mitchell; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. [en]
dc.subject: QD Chemistry [en]
dc.subject.lcc: QD [en]
dc.title: Is EC class predictable from reaction mechanism? [en]
dc.type: Journal article [en]
dc.description.version: Publisher PDF [en]
dc.contributor.institution: University of St Andrews. School of Chemistry [en]
dc.contributor.institution: University of St Andrews. Biomedical Sciences Research Complex [en]
dc.contributor.institution: University of St Andrews. EaSTCHEM [en]
dc.identifier.doi: https://doi.org/10.1186/1471-2105-13-60
dc.description.status: Peer reviewed [en]
dc.identifier.url: http://www.biomedcentral.com/1471-2105/13/60/ [en]

