This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. In an era of comparative genomics, large-scale cDNA sequencing has led to the discovery of transcripts that have no detectable similarity to existing sequences in the public databases. These novel transcripts have been shown to play important regulatory roles as noncoding RNAs or novel proteins. This project will create a resource of novel transcripts within the cattle genome. Such a resource would be of interest to scientists in evolutionary biology, bioinformatics, proteomics, and agriculture. Importantly, the characterization of a few predicted novel proteins would provide a glimpse into the molecular basis for the unique physical and metabolic adaptations of ruminants. The objectives of this proposal are: 1) identification and bioinformatic characterization of novel transcripts across the entire cattle genome, and determination of their distribution by in silico mapping; and 2) structure elucidation of up to five novel proteins using ab initio structure prediction. A bioinformatics scheme for the identification of novel transcripts will be designed and implemented. A MySQL database will be set up for importing and querying genome data from cattle and other mammalian genomes for comparative mapping. The total data size may well exceed 0.5 TB. Genome-wide sequence comparisons of cow to other mammalian genomes will be carried out, as well as querying of datasets on a genome-wide scale. The ab initio structure prediction will be carried out on an external server [http://www.bioinfo.rpi.edu/~bystrc/hmmstr/server.ph], as well as with public-domain software that will be installed locally.
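The abstract does not specify how "no detectable similarity" will be decided, so the following is only a minimal sketch of one possible filtering step, under stated assumptions. It assumes the similarity search results are available as 12-column tab-separated BLAST tabular output (query ID in column 1, E-value in column 11), and it treats a transcript as a novel candidate when it has no hit at or below an E-value cutoff. The cutoff of 1e-5, the function names, and the example records are all hypothetical and are not taken from the proposal.

```python
from typing import Iterable, List, Set


def ids_with_hits(blast_lines: Iterable[str], max_evalue: float = 1e-5) -> Set[str]:
    """Collect query IDs that have at least one hit at or below max_evalue.

    Assumes 12-column tab-separated BLAST tabular rows:
    query ID is column 1 and E-value is column 11.
    """
    matched: Set[str] = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            matched.add(query_id)
    return matched


def novel_transcripts(
    all_ids: Iterable[str],
    blast_lines: Iterable[str],
    max_evalue: float = 1e-5,  # hypothetical cutoff, not from the proposal
) -> List[str]:
    """Return transcript IDs with no significant hit, preserving input order."""
    matched = ids_with_hits(blast_lines, max_evalue)
    return [tid for tid in all_ids if tid not in matched]


# Hypothetical example data: tx1 has a strong hit, tx2 only a weak one,
# and tx3 has no hit at all.
EXAMPLE_IDS = ["tx1", "tx2", "tx3"]
EXAMPLE_HITS = [
    "tx1\tsubjA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-50\t370",
    "tx2\tsubjB\t40.0\t30\t18\t0\t1\t30\t5\t34\t2.5\t22.1",
]

if __name__ == "__main__":
    print(novel_transcripts(EXAMPLE_IDS, EXAMPLE_HITS))
```

In a real pipeline the candidate list would presumably be loaded into the MySQL database mentioned above for comparative mapping; that step is left out here because the schema is not described.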