We aim to provide a comprehensive foundation for the development of an ultra-low-cost, ultra-fast nucleic acid sequencing technology based on single-atom-resolution transmission electron microscopy (TEM) of heavy atom-labeled nucleic acid polymers. Our approach relies on TEM imaging of ultra-dense (3 nm strand-to-strand spacing) parallel arrays of high molecular weight ssDNA molecules labeled base-selectively with heavy atoms. This will allow read lengths of at least ~150 kb and potentially as much as 2-4 Mb or more, with no special difficulties posed by highly repetitive DNA. With appropriate optimization, automation, and scaling, and with further funding beyond the scope of this proposal, this technology ("TEM sequencing") could enable human genome sequencing at significantly lower cost and with much greater speed, consensus accuracy, and completeness than other proposed third-generation sequencing approaches. The project will involve optimizing our novel ssDNA array deposition protocol, improving imaging conditions and substrate quality, and then designing and building a prototype TEM sequencing system. With this prototype we hope to demonstrate the approach's potential by delivering a human reference genome assembly that we believe may possess unprecedented consensus accuracy and completeness, owing to the extreme read lengths and high coverage inherent to the approach.
PUBLIC HEALTH RELEVANCE: The pace and impact of biomedical research on improving human health may be greatly increased by the development of ultra-low-cost, ultra-high-quality genome sequencing.
Our electron microscopy-based approach uses preparation and readout that are unbiased by sequence content and yields extremely long reads (at least 150,000 bases and potentially as much as 2-4 Mb). This suggests that nearly gapless assemblies will be achievable, shedding light on previously unassembled long repetitive regions and structural variations with potentially important roles in complex disease. Furthermore, our models indicate that TEM sequencing may enable sequencing of whole human genomes to >99.9999% consensus accuracy and completeness in <10 minutes/genome, at a cost of <$100, and thus its realization may have a broad, near-term, and lasting impact on biomedical research.