The goal of this project is to determine the complete DNA sequence of the chromosome of Mycobacterium tuberculosis, the causative agent of tuberculosis. About one-third of the world's population is believed to harbor latent tuberculosis. Although little was known about the biology of M. tuberculosis, tuberculosis was widely thought to be under control from the 1950s through the 1980s because of multiple-drug therapy and better living conditions. Since the mid-1980s, however, there has been an alarming increase in the rate of tuberculosis infection in developed as well as developing nations, primarily attributable to the expanding HIV epidemic, decaying public health services, and the emergence of multidrug-resistant tuberculosis. Today, tuberculosis remains the leading cause of death due to infection worldwide. The complete genomic DNA sequence and a set of recombinant clones would provide a tremendous resource for the study of M. tuberculosis. With the recent demonstration that complete genome sequences can be obtained from microbial organisms rapidly and cost-effectively, the M. tuberculosis genome project is now feasible. The approach will be a whole-genome random sequencing strategy: a random small-insert plasmid library will be constructed from M. tuberculosis strain H37Rv, and both ends of approximately 35,000 clones will be sequenced (about 70,000 sequence fragments). Sequencing the ends of a set of available, minimally overlapping cosmid clones will provide a scaffold that minimizes the effort associated with gap filling and confirms the underlying assembled structure. The assembled genome will then be annotated by identifying a variety of structural features and by assigning genes and functional roles to open reading frames based on database similarity searches. This approach will accelerate studies of the biology of M. tuberculosis and will have an impact on, for example, vaccine development, identification of virulence genes, understanding of the mechanisms of multiple-drug resistance, and rational drug therapy. The data developed from this study will be deposited in MycDB, which will give researchers access to the large amount of information on sequences, functions, clones, and other physical map features that will be generated, linked to other features such as antigens, antibodies, gene loci, and MEDLINE references.
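As a rough illustration of why about 70,000 random sequence fragments may be adequate, the following sketch applies the standard Lander-Waterman approximation for random shotgun sequencing. Only the fragment count comes from the text above; the average read length (450 bp) and the genome size (about 4.4 Mb) are assumptions chosen for illustration, and the function name is hypothetical. The simple form of the model used here ignores the minimum overlap needed to join two reads, so the results should be read as optimistic estimates, not as project figures.

```python
import math


def shotgun_coverage(num_reads, read_length, genome_size):
    """Lander-Waterman style estimates for random shotgun sequencing.

    Returns (coverage, fraction_unsequenced, expected_contigs).
    Assumes reads land uniformly at random and ignores the minimum
    overlap required to merge two reads.
    """
    # Average number of times each base is sequenced.
    coverage = num_reads * read_length / genome_size
    # Probability that a given base is covered by no read (Poisson model).
    fraction_unsequenced = math.exp(-coverage)
    # Expected number of contigs, roughly the number of gaps to close.
    expected_contigs = num_reads * math.exp(-coverage)
    return coverage, fraction_unsequenced, expected_contigs


NUM_READS = 70_000       # from the text: both ends of ~35,000 clones
READ_LENGTH = 450        # ASSUMPTION: average usable read length, in bp
GENOME_SIZE = 4_400_000  # ASSUMPTION: approximate genome size, in bp

coverage, unseq, gaps = shotgun_coverage(NUM_READS, READ_LENGTH, GENOME_SIZE)
print(f"coverage: {coverage:.1f}x")
print(f"fraction of genome unsequenced: {unseq:.5f}")
print(f"expected contigs (gaps to close): {gaps:.0f}")
```

Under these assumptions the random phase alone leaves a small number of gaps, which is where the cosmid end sequences described above would help order contigs and direct gap closure.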