Item metadata

dc.contributor.advisor: Mitchell, John B. O.
dc.contributor.author: Nath, Neetika
dc.coverage.spatial: 181 [en_US]
dc.date.accessioned: 2015-07-03T08:35:58Z
dc.date.available: 2015-07-03T08:35:58Z
dc.date.issued: 2015-06-24
dc.identifier.uri: https://hdl.handle.net/10023/6899
dc.description.abstract: The most widely used classification system describing enzyme-catalysed reactions is the Enzyme Commission (EC) number. Understanding enzyme function is important for both fundamental scientific and pharmaceutical reasons. The EC classification is essentially unrelated to the reaction mechanism. In this work we address two important questions related to the diversity of enzyme function. First, we investigate the relationship between the reaction mechanisms described in the MACiE (Mechanism, Annotation, and Classification in Enzymes) database and the top-level class of the EC classification. Second, we ask how well the biocatalysis of these enzymes is adapted in nature. In this thesis, we have retrieved 335 enzyme reactions from the MACiE database. We consider two ways of encoding the reaction mechanism in descriptors, and three approaches that encode only the overall chemical reaction. We first develop a basic model to cluster the enzymatic reactions. A global study of enzyme reaction mechanisms may provide important insights into the diversity of the chemical reactions that enzymes catalyse. Clustering analysis is very common practice in such research. Clustering algorithms suffer from various issues, such as requiring the determination of input parameters and stopping criteria, and very often the need to specify the number of clusters in advance. Using several well-known metrics, we tried to optimize the clustering outputs for each of the algorithms, with equivocal results that suggested the existence of between two and over a hundred clusters. This motivated us to design and implement our own algorithm, PFClust (Parameter-Free Clustering), which requires no prior information to determine the number of clusters. The analysis highlights the structure of the overall and mechanistic enzyme reactions.
This suggests that mechanistic similarity can influence approaches for function prediction and automatic annotation of newly discovered protein and gene sequences. We then develop and evaluate a method for enzyme function prediction using machine learning. Our results suggest that pairs of similar enzyme reactions tend to proceed by different mechanisms. The machine learning method needs only chemoinformatics descriptors as input and is also applicable to regression analysis. The last phase of this work tests the evolution of chemical mechanisms mapped onto ancestral enzymes. Domain occurrence and abundance in modern proteins have shown that the α/β architecture is probably the oldest fold design. These observations have important implications for the origins of biochemistry and for exploring structure-function relationships. Over half of the known mechanisms are introduced before architectural diversification in evolutionary time. The remaining mechanisms are invented gradually over the evolutionary timeline, just after organismal diversification. Moreover, many common mechanisms, including fundamental building blocks of enzyme chemistry, were found to be associated with the ancestral fold. [en_US]
dc.language.iso: en [en_US]
dc.publisher: University of St Andrews
dc.relation: Nath N, Mitchell JBO: Is EC class predictable from reaction mechanism? BMC Bioinformatics 2012 [en_US]
dc.relation: Mavridis L, Nath N, Mitchell JBO: PFClust: a novel parameter free clustering algorithm. 2013 [en_US]
dc.relation: McDonagh JL, Nath N, De Ferrari L, van Mourik T, Mitchell JBO: Uniting cheminformatics and chemical theory to predict the intrinsic aqueous solubility of crystalline druglike molecules. J Chem Inf Model 2014, 54:844–56. [en_US]
dc.relation: Nath N, Mitchell JBO, Caetano-Anollés G: The Natural History of Biocatalytic Mechanisms. PLoS Comput Biol 2014, 10:e1003642. [en_US]
dc.relation: Alderson RG, De Ferrari L, Mavridis L, McDonagh JL, Mitchell JBO, Nath N: Enzyme Informatics. Curr Top Med Chem 2012, 12:1911–1923. [en_US]
dc.subject: Enzyme [en_US]
dc.subject: Machine learning [en_US]
dc.subject: EC number [en_US]
dc.subject: Enzyme evolution [en_US]
dc.subject: PFClust [en_US]
dc.subject: Clustering analysis [en_US]
dc.subject: R [en_US]
dc.subject: Statistics [en_US]
dc.subject.lcsh: QP601.N2
dc.subject.lcsh: Enzymes [en_US]
dc.subject.lcsh: Enzymes--Classification [en_US]
dc.subject.lcsh: Enzymes--Evolution [en_US]
dc.title: Quantitative and evolutionary global analysis of enzyme reaction mechanisms [en_US]
dc.type: Thesis [en_US]
dc.contributor.sponsor: Scottish Universities Life Sciences Alliance (SULSA) [en_US]
dc.contributor.sponsor: Scottish Overseas Research Student Awards Scheme (SORSAS) [en_US]
dc.type.qualificationlevel: Doctoral [en_US]
dc.type.qualificationname: PhD Doctor of Philosophy [en_US]
dc.publisher.institution: The University of St Andrews [en_US]
dc.publisher.department: School of Chemistry [en_US]

