A computational approach to discovering p53 binding sites in the human genome

Lim, Ji-Hyun

Files in this item

Name: Ji-HyunLimPhDThesis.pdf
Size: 3.831Mb
Format: PDF

Item metadata

dc.contributor.advisor	Barker, Daniel
dc.contributor.advisor	Iggo, Richard
dc.contributor.author	Lim, Ji-Hyun
dc.coverage.spatial	154	en_US
dc.date.accessioned	2013-03-13T14:43:30Z
dc.date.available	2013-03-13T14:43:30Z
dc.date.issued	2013-06
dc.identifier	uk.bl.ethos.569026
dc.identifier.uri	https://hdl.handle.net/10023/3388
dc.description.abstract	The tumour suppressor protein p53 plays a central role in the DNA damage response and checkpoint pathways that lead to DNA repair, cell cycle arrest, apoptosis and senescence. Activation of p53-mediated pathways depends primarily on the binding of tetrameric p53 to two 'half-sites', each consisting of a decameric p53 response element (RE). In functional binding sites, the two REs are either directly adjacent or separated by a 'spacer' of 1-13 base pairs (bp). The p53 RE is detected by exact or inexact matches to the palindromic sequence represented by the regular expression [AG][AG][AG]C[AT][TA]G[TC][TC][TC] or by a position weight matrix (PWM). Matrix-based and regular expression pattern-matching techniques, however, produce an overwhelming number of false positives. A more specific model, which combines multiple factors known to influence p53-dependent transcription, is required for accurate detection of the binding sites. In this thesis, we present a logistic regression-based model that integrates sequence information and epigenetic information to predict human p53 binding sites. Sequence information includes the PWM score and the spacer length between the two half-sites of the observed binding site. To integrate epigenetic information, we analyzed the region surrounding the binding site for the presence of mono- and trimethylation patterns of histone H3 lysine 4 (H3K4). Our model showed a high level of performance on both a high-resolution data set of functional p53 binding sites from the experimental literature (ChIP data) and the whole human genome. Comparing our model with a simpler sequence-only model, we demonstrated that the prediction accuracy of the sequence-only model could be improved by incorporating epigenetic information, such as the two histone modification marks H3K4me1 and H3K4me3.	en_US
dc.language.iso	en	en_US
dc.publisher	University of St Andrews
dc.subject	p53	en_US
dc.subject	Regulatory regions	en_US
dc.subject	Bioinformatics	en_US
dc.subject	Logistic regression	en_US
dc.subject	Epigenetics	en_US
dc.subject.lcc	QP552.P25L5
dc.subject.lcsh	p53 protein	en_US
dc.subject.lcsh	Binding sites (Biochemistry)	en_US
dc.subject.lcsh	Gene regulatory networks	en_US
dc.subject.lcsh	Bioinformatics	en_US
dc.subject.lcsh	Logistic regression analysis	en_US
dc.subject.lcsh	Epigenetics	en_US
dc.title	A computational approach to discovering p53 binding sites in the human genome	en_US
dc.type	Thesis	en_US
dc.contributor.sponsor	Biotechnology and Biological Sciences Research Council (BBSRC)	en_US
dc.type.qualificationlevel	Doctoral	en_US
dc.type.qualificationname	PhD Doctor of Philosophy	en_US
dc.publisher.institution	The University of St Andrews	en_US
dc.publisher.department	School of Medicine	en_US
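
The regular expression matching described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `find_candidate_sites`, the lazy spacer matching and the choice to report only the shortest spacer per start position are assumptions made for illustration. It scans the forward strand only and requires exact half-site matches, so it shows the naive baseline that the abstract says produces many false positives, not the logistic regression model.

```python
import re

# Half-site consensus quoted in the abstract (RRRCWWGYYY).
HALF_SITE = "[AG][AG][AG]C[AT][TA]G[TC][TC][TC]"

# Two half-sites, directly adjacent or separated by 1-13 spacer bases
# (written here as 0-13). The lookahead lets overlapping candidates be
# reported; the lazy quantifier keeps the shortest spacer per start position.
FULL_SITE = re.compile(rf"(?=({HALF_SITE})([ACGT]{{0,13}}?)({HALF_SITE}))")


def find_candidate_sites(sequence):
    """Return (start, spacer_length, matched_text) for each candidate site.

    Hypothetical helper: forward strand only, exact half-site matches only.
    """
    sequence = sequence.upper()
    hits = []
    for match in FULL_SITE.finditer(sequence):
        left, spacer, right = match.groups()
        hits.append((match.start(), len(spacer), left + spacer + right))
    return hits
```

Calling `find_candidate_sites` on a DNA string returns one tuple per start position at which two half-sites occur within the allowed spacer range. The spacer length in each tuple corresponds to one of the sequence features the abstract names as input to the model.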

This item appears in the following Collection(s)

Biology Theses