The composition of the complex microbial communities inhabiting the human body has a tremendous influence on human health and disease. New DNA sequencing technologies offer the opportunity to study these communities by directly identifying the genomes of individual microbial strains. Ideally, new sequences from metagenomic sampling can be compared to the known, full sequences of individual strains. It is estimated that more than 1,000 such reference strain sequences are required, and that ~600 are either already determined or in the process of being sequenced. The primary aim of this proposal is to generate reference DNA sequences of the remaining ~400 strains to complete this reference catalogue. First, all 400 strains will be sequenced and assembled by shotgun methods. At least 60 will be finished to high quality using established methods, and new approaches will be developed to convert the majority of the remainder to similarly high quality. The sequencing will use next-generation methods, based upon platforms developed by 454 Life Sciences, Illumina/Solexa and Applied Biosystems. All of the sequences will be annotated by an automated pipeline, and the 60 that are 'finished' will also be manually curated. The sequencing methods will also be applied to a selection of viral and fungal targets. Metagenomic sampling approaches will be tested using existing samples and methods, and these data will be refined by development of new technologies for selective DNA isolation and for 16S and WGS sequencing of metagenomic samples, including microarray DNA chip capturing, electrophoretic techniques and cDNA tests. When all technical advances are combined, the cost of sequencing an individual bacterial genome will be less than $1,000.