The goals of the Human Genome Initiatives are to develop genetic and physical maps, determine the DNA sequences, and identify all of the genes encoded by the genomes of humans and model organisms. As the genetic maps of both human and mouse approach closure, the development and implementation of an efficient gene identification strategy become more important for the genome project as well as for the positional cloning of disease-related genes. To address this problem, we propose a substantially improved version of the "whole genome approach," which is to randomly isolate and map a large number of cDNAs. The first step of this project is to sequence cDNA clones chosen randomly from a mouse embryo equalized cDNA library, which contains short 3'-untranslated regions (UTRs) of at least 15,000 cDNA species with little redundancy. This library has been made from cDNA pools derived from mouse embryos of all developmental stages by a special cDNA normalization method that we have invented. The use of this unique cDNA library will allow us to efficiently collect expressed sequence tags (ESTs) from the 3'-UTRs of about 1,200 new genes by avoiding redundant sampling of the same cDNA species. The second step of this project is to convert these sequences into PCR primer pairs that can specifically amplify each cDNA. The use of the 3'-UTR as the target for PCR amplification increases the likelihood of finding sequence polymorphisms between the genomic sequences of C57BL/6J and Mus spretus (a wild mouse strain). Because we have shown that about 60% of cDNAs can be classified as such "biallelic polymorphic ESTs (bESTs)," we expect to generate about 700 genetic markers from the collection of cDNAs. These bEST markers will be genetically mapped by typing two interspecific backcross mouse panels from the Jackson Laboratory. The BSB panel consists of 94 animals [(C57BL/6J x M. spretus) x C57BL/6J] and the BSS panel consists of 94 animals [(C57BL/6J x SPRET/Ei) x SPRET/Ei].
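As an illustration of how typing a backcross panel yields map information, the following minimal sketch estimates the recombination fraction between two markers as the share of animals whose genotypes at the two loci disagree. It is not the project's analysis pipeline. The function name, the genotype coding ("B" for homozygous, "H" for heterozygous, None for untyped), and the toy genotype strings are assumptions made for this example only, not data from the panels.

```python
from math import sqrt

def recombination_fraction(genotypes_a, genotypes_b):
    """Estimate the recombination fraction between two markers typed on the
    same backcross animals (hypothetical helper, for illustration only).

    Each list holds one genotype per animal: "B" (homozygous), "H"
    (heterozygous), or None if the animal was not typed at that marker.
    Returns (fraction, standard_error, animals_used).
    """
    # Keep only animals typed at both markers.
    pairs = [(a, b) for a, b in zip(genotypes_a, genotypes_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no animals typed at both markers")
    # In a backcross, discordant genotypes mark a recombinant chromosome.
    recombinants = sum(1 for a, b in pairs if a != b)
    r = recombinants / n
    # Binomial standard error of the estimated fraction.
    se = sqrt(r * (1 - r) / n)
    return r, se, n

# Toy data for a 94-animal panel (invented values, not real typings).
marker_1 = list("B" * 47 + "H" * 47)
marker_2 = list(marker_1)
for i in (0, 50, 93):  # flip three animals to simulate recombinants
    marker_2[i] = "H" if marker_2[i] == "B" else "B"

r, se, n = recombination_fraction(marker_1, marker_2)
print(f"{r:.3f} +/- {se:.3f} from {n} animals")
```

With 94 animals per panel, a single recombinant corresponds to a recombination fraction of roughly 1%, which bounds the resolution that one panel can provide.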
To achieve high throughput, we will mainly use bEST markers showing restriction fragment length polymorphisms (RFLPs), resulting in the mapping of about 600 bEST markers. This will represent about 1% of all mouse genes. We also propose to develop PCR-based methods and resources to obtain sequence information from the protein-coding regions of these cDNAs, because coding-region sequences are more informative than 3'-UTR sequences for deducing the putative function of cDNAs. By achieving these goals, this project will provide a solid evaluation of the whole genome approach to gene identification as well as useful markers for mouse biologists and genome researchers. In addition, genetic mapping of about 600 cDNAs will have immediate outcomes, such as the identification of mouse mutant genes.
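The marker yields quoted above follow from simple arithmetic, sketched here for transparency. The variable and function names are illustrative. The total gene count is not stated in the proposal; it is only the value implied by equating 600 mapped markers with 1% of all mouse genes.

```python
def expected_count(total, percent):
    """Integer count corresponding to `percent` of `total` (illustrative helper)."""
    return total * percent // 100

new_ests = 1200          # 3'-UTR ESTs from new genes (from the proposal)
polymorphic_percent = 60  # share classified as bESTs (from the proposal)
mapped_rflp_markers = 600  # bESTs to be mapped via RFLPs (from the proposal)
mapped_share_percent = 1   # stated share of all mouse genes

# 60% of 1,200 ESTs gives 720, quoted in the text as "about 700".
expected_bests = expected_count(new_ests, polymorphic_percent)

# Inference only: the gene total implied by 600 markers being 1% of all genes.
implied_total_genes = mapped_rflp_markers * 100 // mapped_share_percent

print(expected_bests, implied_total_genes)
```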