SUMMARY This grant renewal proposal is about developing innovative software that will allow health researchers to take advantage of recent advances in DNA sequencing. Over the last decade, technological advances have made DNA sequencing a routine and cost-effective method in many fields of life sciences research. The dominant technology today generates millions of short sequences, each consisting of 75-300 base pairs (the "letters" that make up the DNA sequence). These short "reads" have to be assembled in the right order to make sense of the data. Dr. Birol and his team are world leaders in genome assembly, and the award-winning software they have developed (with support from their existing NIH grant and other funding) has been used in diverse DNA sequencing projects, including The Cancer Genome Atlas project. Newer technologies are now becoming available that generate information on much longer stretches of the input DNA, in the form of long reads or linked reads. Long read platforms can sequence over 100,000 base pairs per read, though with a very high error rate and low throughput. Linked read platforms can associate multiple reads spanning similar lengths, although the data contains many gaps. Still, if coupled with bioinformatics tools that can leverage the rich information they provide, these new sequencing platforms will open new frontiers in health research. Dr. Birol is seeking to renew his NIH funding so that he can develop specialized software that will quickly, accurately, and efficiently assemble and analyse long and linked sequence reads. These tools would provide advanced capabilities in a range of projects, such as tracking infectious disease outbreaks and using genetic information to select the best drugs to treat an individual patient's cancer. The new tools will be made freely available online to other non-profit researchers for use in their own sequencing projects, allowing teams around the world to make faster progress in health research.