Item metadata

dc.contributor.advisor: Mitchell, John B. O.
dc.contributor.advisor: Westwood, Nicholas James
dc.contributor.author: Chen, Sih-Yu
dc.coverage.spatial: 162 p. [en_US]
dc.date.accessioned: 2017-06-22T13:45:10Z
dc.date.available: 2017-06-22T13:45:10Z
dc.date.issued: 2017-06-21
dc.identifier.uri: http://hdl.handle.net/10023/11064
dc.description.abstract: In modern drug discovery, lead discovery describes the overall process from hit discovery to lead optimisation, with the goal of identifying drug candidates. This can be greatly facilitated by computer-aided (in silico) techniques, which can reduce experimentation costs along the drug discovery pipeline. The relevant techniques include: molecular modelling to obtain structural information; molecular dynamics (covered in Chapter 2); activity or property prediction by means of quantitative structure-activity/property models (QSAR/QSPR), where machine learning techniques are introduced (covered in Chapter 1); and quantum chemistry, used to explain chemical structure, properties and reactivity. This thesis is divided into five parts. Chapter 1 starts with an outline of the early stages of drug discovery, introducing the use of virtual screening for hit and lead identification. Such approaches may roughly be divided into structure-based (of which docking is by far the most often referred to) and ligand-based, leading to a set of promising compounds for further evaluation. The chapter then introduces machine learning techniques, which will be encountered frequently, followed by a brief review of the "no free lunch" theorem, which states that no learning algorithm can perform optimally on all problems. This implies that the predictive accuracy of multiple models must be validated for optimal model selection. As the dimensionality of the feature space increases, the issue referred to as "the curse of dimensionality" becomes a challenge. The last sections focus on supervised classification with Random Forests. Computer-based analyses are an integral part of drug discovery.
Chapter 2 begins with a discussion of molecular docking, including strategies for incorporating protein flexibility at global and local levels, and then focuses on an automated docking program, AutoDock, which uses a Lamarckian genetic algorithm and an empirical binding free energy function. The second part of the chapter gives a brief introduction to molecular dynamics. Chapter 3 describes how we constructed a dataset of known binding sites with co-crystallised ligands, which was used to extract features characterising the structural and chemical properties of the binding pocket. A machine learning algorithm was adopted to create a three-way predictive model capable of assigning each case to one of the classes (regular, orthosteric and allosteric) for in silico selection of allosteric sites, and a feature selection algorithm (Gini) was used to rationalise the selection of the descriptors most influential in classifying the binding pockets. In Chapter 4, we made use of structure-based virtual screening, focusing on docking a fluorescent sensor to a non-canonical DNA quadruplex structure. The preferred binding poses, binding site and interactions are scored, followed by application of an ONIOM model to re-score the binding poses of some DNA-ligand complexes, considering only the best pose (with the lowest binding energy) from AutoDock. Approaches that use a conformational ensemble pre-generated by MD to account for the receptor's flexibility, followed by docking, are termed "relaxed complex" schemes. Chapter 5 concerns the BLUF domain photocycle. We focus on the conformational preferences of some critical residues in the flavin binding site after a charge redistribution has been introduced. This work provides another activation model to address controversial features of the BLUF domain. [en_US]
dc.language.iso: en [en_US]
dc.publisher: University of St Andrews
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Random forest [en_US]
dc.subject: Classification [en_US]
dc.subject: Allosteric site [en_US]
dc.subject: CHES [en_US]
dc.subject: Docking [en_US]
dc.subject: AutoDock [en_US]
dc.subject: Relaxed complex method [en_US]
dc.subject: Molecular dynamics [en_US]
dc.subject: G-quadruplex [en_US]
dc.subject: HPIP-b [en_US]
dc.subject: BLUF [en_US]
dc.subject.lcc: QH324.2C5
dc.subject.lcsh: Biomolecules--Computer simulation [en]
dc.subject.lcsh: Computational biology [en]
dc.subject.lcsh: Molecular dynamics [en]
dc.title: Computational studies of biomolecules [en_US]
dc.type: Thesis [en_US]
dc.type.qualificationlevel: Doctoral [en_US]
dc.type.qualificationname: PhD Doctor of Philosophy [en_US]
dc.publisher.institution: The University of St Andrews [en_US]
dc.publisher.department: School of Chemistry [en_US]


    Except where otherwise noted within the work, this item's license for re-use is described as Attribution-NonCommercial-NoDerivatives 4.0 International