Although determining the entire sequence of the human genome is a long-term goal of the Human Genome Project (HGP), genomic DNA sequencing is not currently efficient enough to allow this goal to be realized. For this reason, two of the current objectives of the HGP are to sequence several megabase-sized regions of DNA with high biological interest and to develop more efficient technologies for sequencing. In Project 5, we propose to determine the sequence of about 8 Mb in 4 different regions of the genome during the 5-year funding period. We plan to achieve this goal using the transposon-mediated (TM) sequencing strategy, as we believe the low levels of sequence redundancy and the ease of sequence assembly that it offers will make this an efficient way to perform megabase sequencing. We propose to establish a megabase-sequencing group at the Center and will test the transferability of the current TM-sequencing procedure, using a well-characterized cosmid clone as a model system. We also propose to develop several improvements in this strategy to eliminate some rate-limiting steps. Specifically, we will develop a method to perform the transposition and sequencing steps directly in cosmids and an efficient PCR approach to map the positions of transposons in these cosmids. We will then use this improved strategy to determine the DNA sequence of 4 regions of the genome, selected primarily on the basis of their biological relevance, but also because many of the required reagents are readily available to us. These regions are: a 2 Mb region of chromosome 21q22 implicated in Down syndrome; a gene-rich 2 Mb segment in the pseudoautosomal region of the sex chromosomes; a 2 Mb region around the EGF gene on chromosome 4q25-4q26, implicated in hepatocellular carcinoma and Rieger's syndrome; and a 2 Mb region on chromosome 4q11-4q12 that includes a cluster of receptor tyrosine kinase genes.
At the end of the proposed 5-year funding period, we will not only have provided biologically relevant sequence information for several interesting regions of the genome, but will also have tested the efficacy of, and improved upon, a sequencing strategy that is an alternative to the mainstream shotgun sequencing approach. We believe several such approaches should be attempted to improve sequencing technologies so that the ultimate goal of determining the sequence of the whole genome can be realized.