Two widely used next-generation sequencing (NGS) technologies are 454 sequencing and Illumina sequencing. We propose to determine the best sequencing strategy, that is, the mix of 454 and Illumina read and mate-pair data that produces the best possible assembly at the lowest cost. We propose to continue developing our software for closing gaps and fixing mis-assemblies with our shooting method. We can extend the method to use additional NGS reads and mate pairs to close gaps in existing assemblies, which increases contiguity, and to find and correct mis-assemblies. This method can serve as a cheaper alternative to traditional finishing techniques. The final product of any assembly project is a set of chromosome sequence files. We propose to develop improved software that produces chromosome sequences from assembled contigs using mate-pair and marker data. Our preliminary version works for assemblies with large contigs (N50 size >100 Kb), whereas genomes assembled from NGS data typically have small contigs (N50 size of 10-20 Kb). We propose to extend the software so that it is applicable to genome assemblies built from NGS data. We propose to apply the experience gained in the previous project period to re-assemble the genomes of chicken, rat, and possibly other organisms of public health interest from the existing Trace Archive data, combined with additional NGS data where available. As NGS data becomes cheaper, many groups are interested in sequencing a variety of genomes. We therefore propose to produce de novo assemblies of insect genomes, plant genomes, and genomes of other organisms of public health interest in collaboration with the centers that generate the data. Our goal is to serve as an expert genome assembly group that provides its services and techniques to the community.
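The N50 figures quoted above can be made concrete with a short sketch. It assumes the common definition of N50: the contig length L such that contigs of length L or longer account for at least half of the total assembled bases. The function name `n50` and the example lengths are illustrative only and are not taken from the project's software.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases).

    Assumed definition: the length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down, accumulating bases until
    # half of the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, chosen only to illustrate the calculation.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

Under this definition, an assembly with an N50 above 100 Kb has at least half of its bases in contigs longer than 100 Kb, while an N50 of 10-20 Kb indicates a much more fragmented assembly.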
PUBLIC HEALTH RELEVANCE: Advances in sequencing technologies have made it possible to obtain large amounts of sequence data quickly and at low cost compared with Sanger sequencing. Our goal is to contribute our techniques, software, and expertise in the assembly of short-read data to the community. We will continuously improve our methods to obtain the best possible assemblies of new genomes sequenced with the latest technologies. The ultimate goal of this project is to improve public health through a better understanding of the human genome and the genomes of other species.