We describe a computational framework that models spatial structure along the genomic sequence simultaneously with the temporal structure of the evolutionary path, and we show how such models can be used to discover new functional molecules through cross-genomic sequence comparisons. The framework incorporates a priori high-level knowledge of structural and evolutionary constraints as a hierarchical grammar of evolutionary probabilistic models. In particular, we demonstrate a computational method for identifying novel prohormones and their processed peptide sites by producing sequence alignments across many species at the functional-element level. We present experimental results from an initial implementation of the algorithm, used to identify potential prohormones by comparing human and mouse proteins, achieving high-accuracy identification on a known set of proteins and yielding a putative novel hormone from an unknown set. Finally, to validate the computational methodology, we present the basic molecular biological characterization of the putative novel peptide hormone, including its identification in the brain and its regional localization. Success of this approach could substantially advance our understanding of GPCRs and their associated pathways and help identify new targets for drug development.