Shotgun clone libraries are essential resources for molecular analysis of the human genome and of other genomes important to our well-being. Complete genomic sequencing and comparative genomics reveal new information about metabolic pathways in normal tissue and in conditions such as cancer, diabetes, aging, and AIDS. Genomic sequencing also provides critical information about horizontal gene transfer, evolution of protein families, and the genetic repertoire and evolution of species. The potential benefits of genomics have increased the demand for additional libraries with higher fidelity and more complete coverage. However, current methods of library construction are challenging, time-consuming, and costly, and for larger genomes they leave many uncloned gaps. These obstacles obscure important sequence information and limit the range of genomes that can be studied. The objectives of this proposal are to provide new vectors and streamlined methods for creating libraries that surpass current standards of fidelity and complexity, reduce cloning bias, and simplify analysis. These techniques will provide access to existing clone gaps in the human genome and allow cloning of other recalcitrant genomes. Specific aims include development of a novel linear cloning vector and demonstration of its capacity for stable maintenance of otherwise "unclonable" sequences, such as inverted repeats, microsatellite repeats, and large clones of genomic DNA (>8 kb) from AT-rich pathogens, such as Pneumocystis carinii. The aims also include developing methods for rapid template preparation from single bacterial colonies for bench-scale and high-throughput sequencing. Long-term goals include cloning existing gaps in the human genomic sequence, creating complete libraries from AT-rich and other difficult genomes, and in vitro amplification of 100-kb regions of DNA.
The advances proposed by this research will allow cloning and sequencing of additional genomic and cDNA libraries with greater accuracy and at lower cost. Improved sequence information from these libraries will accelerate progress in individual and comparative genomic analyses and extend their potential.