Computational studies of biomolecules

Chen, Sih-Yu


Files in this item

Name: Sih-YuChenPhDThesis.pdf
Size: 7.123Mb
Format: PDF


Item metadata

dc.contributor.advisor	Mitchell, John B. O.
dc.contributor.advisor	Westwood, Nicholas James
dc.contributor.author	Chen, Sih-Yu
dc.coverage.spatial	162 p.	en_US
dc.date.accessioned	2017-06-22T13:45:10Z
dc.date.available	2017-06-22T13:45:10Z
dc.date.issued	2017-06-21
dc.identifier.uri	https://hdl.handle.net/10023/11064
dc.description.abstract	In modern drug discovery, lead discovery describes the overall process from hit discovery to lead optimisation, with the goal of identifying drug candidates. This process can be greatly facilitated by computer-aided (or in silico) techniques, which can reduce experimentation costs along the drug discovery pipeline. The relevant techniques include molecular modelling to obtain structural information; molecular dynamics (covered in Chapter 2); activity or property prediction by means of quantitative structure-activity/property relationship models (QSAR/QSPR), where machine learning techniques are introduced (covered in Chapter 1); and quantum chemistry, used to explain chemical structure, properties and reactivity. This thesis is divided into five parts. Chapter 1 starts with an outline of the early stages of drug discovery, introducing the use of virtual screening for hit and lead identification. Such approaches may roughly be divided into structure-based (docking being by far the most often referred to) and ligand-based methods, each leading to a set of promising compounds for further evaluation. The chapter then introduces machine learning techniques, which are encountered frequently in this work, followed by a brief review of the "no free lunch" theorem, which states that no learning algorithm can perform optimally on all problems. This implies that the predictive accuracy of multiple models must be validated in order to select the best model. As the dimensionality of the feature space increases, the issue referred to as "the curse of dimensionality" becomes a challenge. The closing sections focus on supervised classification with Random Forests. Computer-based analyses are an integral part of drug discovery.
Chapter 2 begins with a discussion of molecular docking, including strategies for incorporating protein flexibility at global and local levels, and then focuses on an automated docking program, AutoDock, which uses a Lamarckian genetic algorithm and an empirical binding free energy function. The second part of the chapter gives a brief introduction to molecular dynamics. Chapter 3 describes how we constructed a dataset of known binding sites with co-crystallised ligands, which was used to extract features characterising the structural and chemical properties of the binding pocket. A machine learning algorithm was adopted to create a three-way predictive model capable of assigning each case to one of three classes (regular, orthosteric and allosteric) for in silico selection of allosteric sites, and a feature selection algorithm (Gini) was used to rationalise the selection of the descriptors most influential in classifying the binding pockets. In Chapter 4, we made use of structure-based virtual screening, focusing on docking a fluorescent sensor to a non-canonical DNA quadruplex structure. The preferred binding poses, binding site and interactions are scored, and an ONIOM model is then applied to re-score the binding poses of some DNA-ligand complexes, considering only the best pose (the one with the lowest binding energy) from AutoDock. Approaches that use a conformational ensemble pre-generated by MD to account for receptor flexibility, followed by docking, are termed “relaxed complex” schemes. Chapter 5 concerns the BLUF domain photocycle. We focus on the conformational preferences of some critical residues in the flavin binding site after a charge redistribution has been introduced. This work provides another activation model to address controversial features of the BLUF domain.	en_US
dc.language.iso	en	en_US
dc.publisher	University of St Andrews
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Random forest	en_US
dc.subject	Classification	en_US
dc.subject	Allosteric site	en_US
dc.subject	CHES	en_US
dc.subject	Docking	en_US
dc.subject	AutoDock	en_US
dc.subject	Relaxed complex method	en_US
dc.subject	Molecular dynamics	en_US
dc.subject	G-quadruplex	en_US
dc.subject	HPIP-b	en_US
dc.subject	BLUF	en_US
dc.subject.lcc	QH324.2C5
dc.subject.lcsh	Biomolecules--Computer simulation	en
dc.subject.lcsh	Computational biology	en
dc.subject.lcsh	Molecular dynamics	en
dc.title	Computational studies of biomolecules	en_US
dc.type	Thesis	en_US
dc.type.qualificationlevel	Doctoral	en_US
dc.type.qualificationname	PhD Doctor of Philosophy	en_US
dc.publisher.institution	The University of St Andrews	en_US
dc.publisher.department	School of Chemistry	en_US


This item appears in the following Collection(s)

Chemistry Theses


Attribution-NonCommercial-NoDerivatives 4.0 International

Except where otherwise noted within the work, this item's licence for re-use is described as Attribution-NonCommercial-NoDerivatives 4.0 International