DESCRIPTION: Protein sequence homology (i.e., descent from a common ancestral sequence) is perhaps the most widely used tool for annotating the putative functions of genes. Homologous proteins often share functions inherited from the common ancestor, so if the function of one protein has been experimentally determined, the function of its homologues can often, but not always, be inferred to be the same. Homology-based inference allows functional data from experimentally tractable model organisms (such as E. coli, yeast, Drosophila, C. elegans and the mouse) to be applied to other organisms, most notably humans. The past several years have seen a dramatic increase in the amount of structured, computationally accessible data available on the functions of proteins and the genes that encode them, primarily using the Gene Ontology (GO). The most useful of these data have been manually entered ("curated") by a biologist after reading papers in the scientific literature. The goal of this proposal is to leverage these literature-derived ontology annotations by using them, in a carefully curated and structured manner, as the basis for inferred annotations in other organisms. We will utilize and extend existing software developed in our groups to develop a web-accessible environment for curation of GO terms in the context of evolutionary relationships, and link the data to biological pathway data and data standards. We will integrate the software into current GO term annotation projects, and support a broad data exchange and dissemination plan across GO and pathway ontology curation efforts and the communities of biomedical researchers they serve.
PUBLIC HEALTH RELEVANCE: The current research project provides a cost-effective, accurate methodology for taking experimentally based information about genes in a wide range of well-studied species, and applying this information to understanding human biology, genetics and disease.
The results from this methodology will be made broadly available to both researchers and the public, in formats accessible to both people and computers.