Despite substantial efforts in developing sequencing technologies and computational software, spanning over 30 years, the full genome of any but the simplest organisms is still unable to automatically reconstructed. The length of the DNA sequences that can be 'read' by modern sequencing systems is substantially smaller than the length of most genomes (1000s of base-pairs versus millions to billions), making it virtually impossible to use the fragmented information generated by the shotgun sequencing process to reconstruct the long-range information linking together genomic segments belonging to a same chromosome. The main reason why genome assembly is difficult is genomic repeats - segments of DNA that occur in multiple identical or near-identical copies throughout a genome. Any repeats longer than the length of a sequencing read introduce ambiguity in the possible reconstructions of a genome - an exponential (in the number of repeats) number of different genomes can be constructed from the same set of reads, among which only one is the true reconstruction of the genome being assembled. Finding this one correct genome from among the many possible alternatives is impossible without the use of additional information, such as mate-pair information constraining the relative placement of pairs of shotgun reads along the genome. Mate-pair information is routinely generated in sequencing experiments and has been critical to scientists' ability to reconstruct genomes from shotgun data (e.g., mate-pair information was crucial to the success of the first prokaryotic genome project - Haemophilus influenza). Given these outstanding issues, a series of interlocking aims is proposed that center on enhanced optical and electronic detection of specially-decorated, genomic DNA molecules. The aims are designed for enabling new technologies that will provide sufficient physical map information to intimately mix with modern sequencing data for comprehensive assembly of complex genomes. These proposed advancements will be cradled within a new generation of nanofluidic devices engendering novel means for molecular control and detection. Such efforts will be directed by state-of-the art computer simulations that will model novel aspects of the new platforms for allowing rapid loops of design/implementation/testing. The main thrust of these technological developments will be carefully guided and serve a broad-based bioinformatics framework that will be developed for this work while laying the basis for highly integrated approaches to genome assembly and analysis. PUBLIC HEALTH RELEVANCE: Development of new machines and software is proposed, which will rapidly analyze a person's genome and reveal new types of information that doctors will be able to use for treating patients. The machines that will be developed are actually very small devices that may one day be sufficiently miniaturized to fit in a person's hand.