An investigation of human protein interactions using the comparative method
The speed at which DNA sequence data are produced is increasing rapidly as next-generation sequencing technologies become more widespread. There is therefore a need for rapid computational techniques that functionally annotate data as they are generated. One computational method for the functional annotation of protein-coding genes is the detection of interaction partners: if the putative partner has a functional annotation, that annotation can be extended to the initial protein via the established principle of “guilt by association”. This work presents a method for the rapid detection of functional interaction partners for proteins using the comparative method. Functional links, which may be direct or indirect protein interactions, are sought between proteins by analysing their patterns of presence and absence across a set of 54 eukaryotic organisms in the context of a phylogenetic tree. The method is a heuristic combination of two approaches: an established, accurate methodology that compares models of evolution whose parameters are estimated by maximum likelihood, and a novel technique that reconstructs ancestral states using Dollo parsimony and analyses these reconstructions using logistic regression. The methodology achieves specificity comparable to that of gene coexpression as a means of predicting functional linkage between proteins. Applying this method permitted a genome-wide analysis of the human genome that would otherwise have demanded a potentially prohibitive amount of computational resource. Proteins within the human genome were clustered into orthologous groups. Ten of these proteins, which were ubiquitous across all 54 eukaryotes, were used to reconstruct a phylogeny. Applying the heuristic predicted 1,142 functional protein interactions in human cells, of which 1,131 were not present in current protein-protein interaction databases.
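As a rough illustration of the Dollo parsimony step mentioned in the abstract, the following minimal Python sketch reconstructs ancestral presence/absence states on a toy tree and lists the branches on which a protein is inferred to have been lost. It is not the thesis's implementation: the tree, the species names, the gene profiles, and the function names (`leaves`, `dollo_states`, `loss_branches`) are all hypothetical, and the maximum-likelihood and logistic regression stages are not shown. Under the Dollo assumption a protein is gained once, at the most recent common ancestor of the species that carry it, and can only be lost afterwards.

```python
from typing import Dict, FrozenSet, Set, Tuple, Union

# A tree is either a leaf name or a tuple of subtrees (toy representation).
Tree = Union[str, Tuple["Tree", ...]]
Clade = FrozenSet[str]


def leaves(tree: Tree) -> Clade:
    """Return the set of leaf names below a node; used as the node's key."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def dollo_states(tree: Tree, present: Set[str]) -> Dict[Clade, int]:
    """Dollo reconstruction: one gain at the MRCA of all carriers, losses only.

    A node is scored 1 if it has at least one carrier below it and is not a
    strict ancestor of the gain node; otherwise it is scored 0.
    """
    total = len(leaves(tree) & present)
    states: Dict[Clade, int] = {}

    def visit(node: Tree) -> None:
        clade = leaves(node)
        count = len(clade & present)
        children = () if isinstance(node, str) else node
        # If one child already contains every carrier, the gain happened lower down.
        above_gain = any(len(leaves(c) & present) == total for c in children)
        states[clade] = int(count > 0 and not above_gain)
        for child in children:
            visit(child)

    visit(tree)
    return states


def loss_branches(tree: Tree, present: Set[str]) -> Set[Clade]:
    """Return the child clades of branches where the state changes from 1 to 0."""
    states = dollo_states(tree, present)
    losses: Set[Clade] = set()

    def visit(node: Tree) -> None:
        if isinstance(node, str):
            return
        for child in node:
            if states[leaves(node)] == 1 and states[leaves(child)] == 0:
                losses.add(leaves(child))
            visit(child)

    visit(tree)
    return losses


# Hypothetical six-species tree and two hypothetical presence profiles.
toy_tree: Tree = ((("human", "mouse"), "fly"), ("yeast", "plant"))
gene_a = {"human", "mouse", "yeast"}   # gained at the root, lost twice
gene_b = {"human", "mouse"}            # gained in the human/mouse ancestor

print(loss_branches(toy_tree, gene_a))
print(loss_branches(toy_tree, gene_b))
```

In this toy case the first profile implies a gain at the root followed by independent losses in the fly and plant lineages, while the second implies a single gain in the human/mouse ancestor and no losses. Reconstructed gain and loss events of this kind are the sort of features that could then be compared between two proteins, for example as inputs to a regression model.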
Thesis, PhD Doctor of Philosophy
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.