The investigator will develop a software tool for assembling DNA fragments generated in megabase-scale shotgun sequencing projects. The software will be tested first on simulated DNA fragments generated computationally from megabase DNA sequences and then on real DNA fragments from large-scale sequencing projects. The software will be freely distributed to nonprofit organizations. The investigator will assist with the integration of the software into sequencing environments at genome centers. The objective of this project will be achieved by making two major improvements to a previously developed DNA sequence assembly program. The first improvement is to develop a strategy for solving the problems caused by repetitive sequences. In this strategy, all the fragments from a repetitive sequence are identified, and the uncertainties in the assembly of those fragments are resolved using additional information on the fragments that flank copies of the repetitive sequence. The second improvement is to increase the capacity of the assembly program by developing a parallel version of the program in the PVM parallel programming environment on a local network of computers. The investigator will parallelize the two most time-consuming parts of the sequential program: the detection of overlaps among fragments and the construction of fragment alignments for contigs. The parallel sequence assembly program will be able to use the computational power of many computers to assemble tens of thousands of DNA fragments into sequences with low error rates. The investigator will also improve the multiple sequence alignment program by addressing reading frame shifts in comparisons of protein, cDNA, and genomic DNA sequences.
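
As an illustration of the parallel overlap-detection step described above, the following minimal Python sketch splits the set of fragment pairs among several workers and collects the overlaps each worker finds. It is not the proposed program: the function names (`suffix_prefix_overlap`, `partition`, `detect_overlaps`), the `min_len` threshold, and the use of exact suffix-prefix matching are all assumptions made for illustration, and a Python thread pool stands in for PVM tasks running on separate computers. A real assembler would also have to tolerate sequencing errors and consider reverse-complemented fragments.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations


def suffix_prefix_overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that exactly equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    Exact matching is a simplification (hypothetical, for illustration only).
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def partition(items, n_parts):
    """Deal `items` round-robin into `n_parts` lists of near-equal size."""
    return [items[i::n_parts] for i in range(n_parts)]


def overlaps_for_chunk(chunk, fragments, min_len):
    """Work done by one worker: test every ordered pair in its chunk."""
    found = []
    for i, j in chunk:
        k = suffix_prefix_overlap(fragments[i], fragments[j], min_len)
        if k:
            found.append((i, j, k))
    return found


def detect_overlaps(fragments, n_workers=4, min_len=3):
    """Return sorted (i, j, length) triples for all overlapping ordered pairs.

    The pair list is divided among `n_workers` workers; each worker scans
    only its own share, and the partial results are merged at the end.
    """
    pairs = list(permutations(range(len(fragments)), 2))
    chunks = partition(pairs, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = pool.map(
            lambda chunk: overlaps_for_chunk(chunk, fragments, min_len), chunks
        )
        return sorted(hit for part in partial for hit in part)


if __name__ == "__main__":
    reads = ["ACGTAC", "TACGGA", "GGATTT"]
    for i, j, k in detect_overlaps(reads, n_workers=2):
        print(f"fragment {i} -> fragment {j}: overlap of {k} bases")
```

Because each pair is tested independently, the work divides cleanly among workers and the merged result does not depend on how many workers are used. Comparing all pairs grows quadratically with the number of fragments, which is one reason this step is a natural candidate for distribution across many computers.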