DESCRIPTION: (Applicant's abstract) The long-term aim of this work is to develop evolution-based technologies for the study of biological function in open reading frames (ORFs) identified by genome projects. The specific aims of this proposal are to:
1. Implement structure prediction tools to identify long-distance homologs in genomic databases. This is the first step towards assigning function to an ORF of unknown function. Prediction of secondary structure is a powerful tool for identifying long-distance homologies that cannot be detected by simple comparison of sequence data. Because structure prediction requires sequence data as its only input, it is a low-cost way of expanding the value of existing genome sequence databases.
2. Implement tools for reconstructing the evolutionary history of nuclear families of proteins. Within that history, the tools will identify episodes of divergent evolution of a type that indicates conserved function, and other episodes of divergent evolution of a type that characterizes divergence without function. These tools will permit the user to assess the likelihood that the function of a homologous protein is the same as that of an ORF obtained from the genome database. They also require only sequence data as input, making them another low-cost way of expanding the value of genome sequence databases.
3. Develop tools for correlating episodes of functional evolution in protein families with geological periods in which new physiology emerged in metazoans. These tools will allow a user to generate hypotheses relating protein evolution to possible functions of ORFs. They draw on existing and developing paleontological databases, and couple sequence data directly to biological function.
These tools will be implemented within Darwin, a comprehensive platform for performing theoretical and practical genomic analysis.
Further, they will directly interface with the new field of experimental paleobiochemistry, which permits the experimental testing of hypotheses relating function and behavior derived from the tools described above.