A computational approach to discovering p53 binding sites in the human genome
Item metadata
dc.contributor.advisor | Barker, Daniel | |
dc.contributor.advisor | Iggo, Richard | |
dc.contributor.author | Lim, Ji-Hyun | |
dc.coverage.spatial | 154 | en_US |
dc.date.accessioned | 2013-03-13T14:43:30Z | |
dc.date.available | 2013-03-13T14:43:30Z | |
dc.date.issued | 2013-06 | |
dc.identifier | uk.bl.ethos.569026 | |
dc.identifier.uri | https://hdl.handle.net/10023/3388 | |
dc.description.abstract | The tumour suppressor protein p53 plays a central role in the DNA damage response/checkpoint pathways leading to DNA repair, cell cycle arrest, apoptosis and senescence. The activation of p53-mediated pathways is primarily facilitated by the binding of tetrameric p53 to two 'half-sites', each consisting of a decameric p53 response element (RE). In functional binding sites, the two REs are either directly adjacent or separated by a 'spacer' of 1-13 base pairs (bp). The p53 RE is detected by exact or inexact matches to the palindromic sequence represented by the regular expression [AG][AG][AG]C[AT][TA]G[TC][TC][TC] or by a position weight matrix (PWM). The use of matrix-based and regular expression pattern-matching techniques, however, leads to an overwhelming number of false positives. A more specific model, which combines multiple factors known to influence p53-dependent transcription, is required for accurate detection of the binding sites. In this thesis, we present a logistic-regression-based model which integrates sequence information and epigenetic information to predict human p53 binding sites. Sequence information includes the PWM score and the spacer length between the two half-sites of the observed binding site. To integrate epigenetic information, we analysed the region surrounding the binding site for the presence of mono- and trimethylation patterns of histone H3 lysine 4 (H3K4). Our model showed a high level of performance on both a high-resolution data set of functional p53 binding sites from the experimental literature (ChIP data) and the whole human genome. Comparing our model with a simpler sequence-only model, we demonstrated that the prediction accuracy of the sequence-only model could be improved by incorporating epigenetic information, such as the two histone modification marks H3K4me1 and H3K4me3. | en_US |
dc.language.iso | en | en_US |
dc.publisher | University of St Andrews | |
dc.subject | p53 | en_US |
dc.subject | Regulatory regions | en_US |
dc.subject | Bioinformatics | en_US |
dc.subject | Logistic regression | en_US |
dc.subject | Epigenetics | en_US |
dc.subject.lcc | QP552.P25L5 | |
dc.subject.lcsh | p53 protein | en_US |
dc.subject.lcsh | Binding sites (Biochemistry) | en_US |
dc.subject.lcsh | Gene regulatory networks | en_US |
dc.subject.lcsh | Bioinformatics | en_US |
dc.subject.lcsh | Logistic regression analysis | en_US |
dc.subject.lcsh | Epigenetics | en_US |
dc.title | A computational approach to discovering p53 binding sites in the human genome | en_US |
dc.type | Thesis | en_US |
dc.contributor.sponsor | Biotechnology and Biological Sciences Research Council (BBSRC) | en_US |
dc.type.qualificationlevel | Doctoral | en_US |
dc.type.qualificationname | PhD Doctor of Philosophy | en_US |
dc.publisher.institution | The University of St Andrews | en_US |
dc.publisher.department | School of Medicine | en_US |
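
The exact-match search described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `find_candidate_sites` and the test sequences are hypothetical, and only exact matches to the quoted regular expression are considered (no PWM scoring, no epigenetic features). It assumes a full site is two half-sites separated by 0 to 13 arbitrary bases, as the abstract states.

```python
import re

# Half-site pattern quoted in the abstract (a decameric response element).
HALF_SITE = "[AG][AG][AG]C[AT][TA]G[TC][TC][TC]"

# Two half-sites separated by a spacer of 0-13 bases (assumption drawn from
# the abstract). The lookahead allows overlapping candidates to be reported;
# the lazy spacer means only the shortest spacer is reported per start position.
FULL_SITE = re.compile(
    rf"(?=(({HALF_SITE})([ACGT]{{0,13}}?)({HALF_SITE})))"
)


def find_candidate_sites(sequence):
    """Return (start, spacer_length, matched_text) for each exact match.

    Hypothetical helper for illustration only; positions are 0-based.
    """
    sequence = sequence.upper()
    hits = []
    for match in FULL_SITE.finditer(sequence):
        hits.append((match.start(), len(match.group(3)), match.group(1)))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing two adjacent half-sites.
    for hit in find_candidate_sites("TTGGGCATGTCCGGGCATGTCCAA"):
        print(hit)
```

As the abstract notes, pattern matching of this kind on its own yields a very large number of false positives on a whole genome, which is the motivation for combining sequence scores with further evidence.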
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.