Helicos BioSciences Corporation has developed a fully automated instrument capable of sequencing single molecules of DNA on a planar surface. Helicos is now developing a high-throughput version of this capability for the re-sequencing of whole human genomes. The sequencing strategy involves obtaining short reads (about 25 bases) from billions of strands of DNA immobilized on a surface inside a reagent flow cell. The research plan aims to address certain limitations of this strategy for the high-accuracy re-sequencing of highly variable genomes and for the eventual de novo assembly of previously unsequenced genomes. First, high-accuracy sequencing will be achieved by enabling a two-pass sequencing strategy (Specific Aim 1). This entails obtaining two reads from the same position on the same strand. A two-pass strategy will require covalent attachment of template strands to the surface in a stable and biocompatible fashion. Second, the ability to sequence highly variable or aberrant genomes will be addressed by developing a paired-end read approach (Specific Aim 2). This involves acquiring two reads of about 25 bases each from distal positions on the same strand of DNA. The distance between the two reads will be kept within a defined range by restricting the addition of natural nucleotides. The bioinformatics required to benefit from this supplemental information will be developed in parallel with the sample preparation and sequencing efforts. Lastly, de novo genome assembly will be enabled by combining the two-pass sequencing strategy with the paired-end read strategy (Specific Aim 3). This will generate virtual long reads, each composed of multiple 25-base reads from the same strand of DNA on the surface. The advances made in the two previous specific aims will contribute greatly to the success of this third aim. All experiments will be carried out using the prototype single-molecule sequencing instruments.
Finally, we will demonstrate the utility of these strategies by sequencing and assembling a 4 Mb bacterial genome de novo with an average depth of 10X, a coverage of >90%, and an error rate of <0.5%. The long-term objective of this research plan (within 10 years of funding) is to develop a robust single-molecule sequencing system capable of sequencing a human genome, and detecting all genetic variations and aberrations in that genome, for a fully loaded cost of $1,000 at a throughput of 8 genomes per instrument per day. The Helicos system will thereby enable applications that are too costly or difficult to carry out with current technologies, including the sequencing of whole human genomes from normal or tumor tissues, the genome-wide assessment of epigenetic changes, and the digital expression profiling of thousands of normal and diseased tissue types. Ultimately, these methods will yield advances in the fields of cancer and complex disease genetics/genomics, and will result in the use of genomic information in the diagnosis, treatment, and prevention of disease.
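As a rough illustration of the scale implied by the de novo demonstration target stated above (a 4 Mb genome, 10X average depth, reads of about 25 bases), the following sketch computes the number of reads required. It also gives an idealized estimate of the fraction of the genome touched by at least one read. The function names are hypothetical, and the coverage estimate assumes reads land uniformly at random (the standard Lander-Waterman/Poisson idealization). That assumption is not stated in the plan and ignores sequence bias, alignment failures, and assembly effects.

```python
import math


def reads_for_depth(genome_bp: int, read_len: int, depth: float) -> int:
    """Reads needed so that total sequenced bases equal depth * genome size."""
    return math.ceil(depth * genome_bp / read_len)


def expected_coverage_fraction(depth: float) -> float:
    """Idealized fraction of bases covered at least once.

    Assumes uniformly random read placement (Poisson model); this is an
    illustrative assumption, not a property of any real instrument.
    """
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    genome_bp = 4_000_000   # 4 Mb bacterial genome (from the text)
    read_len = 25           # approximate read length (from the text)
    depth = 10.0            # target average depth (from the text)

    print(reads_for_depth(genome_bp, read_len, depth))   # 1600000
    print(expected_coverage_fraction(depth) > 0.90)      # True
```

Under these assumptions, about 1.6 million 25-base reads supply 10X average depth. The idealized model also suggests that sampling depth alone would not be the limiting factor for the >90% coverage goal.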