We propose to sequence the genome of Trypanosoma cruzi, a human pathogen of global importance, as part of an interactive research project. This project, and the other components of the interactive project, are designed to exploit ongoing genome sequencing projects on three related pathogens: Leishmania major at the Seattle Biomedical Research Institute (SBRI), Trypanosoma brucei at The Institute for Genomic Research (TIGR), and T. cruzi at Uppsala University (UU). The project will also be conducted in coordination with the genome networks that were organized by the World Health Organization to sequence the genomes of these three pathogens. The first phase of the project entails end-sequencing of BAC clones from the CL-Brener reference strain to generate sequence covering about 16 percent of the genome in this lab (55 percent of the genome in the three labs combined). This phase, which will be completed in the first year, will generate data and materials for karyotype and contig mapping, complement data in the existing T. cruzi EST database, and identify many of the genes in T. cruzi. A sequence backbone for selected chromosomes will then be generated by shotgun (4x) sequencing of a minimum tiling path of BAC clones, using a "map-as-you-go" strategy, to yield approximately 96 percent sequence coverage of each chromosome. Chromosomes will be assigned to the interactive labs on the basis of interest, which will largely reflect SBRI's activity in L. major sequencing and TIGR's activity in T. brucei sequencing. Comparison of these sequences with those of the corresponding completed chromosomes from L. major or T. brucei will extend our knowledge of the degree of synteny and will provide a basis for determining which regions will be sequenced to contiguity. BAC clones will be sequenced to approximately 8x coverage, with highest priority given to those containing genes that have high biological significance, are unique to T. cruzi, or are uncharacterized in the related species.
Clones with substantial repetitive sequence will not be sequenced to high coverage without a compelling biological basis. In this way, the genomic regions of the highest biological relevance will be sequenced to contiguity, resulting in approximately 10-15 Mb of highly accurate sequence for each lab. An additional approximately 2-5 Mb of moderately accurate non-contiguous sequence will also be obtained from each lab. The sequence will be analyzed and annotated, and these data will be made available to the research community. This project will accelerate the acquisition of the complete genomic sequence of these three organisms and reduce the cost of each project. It will determine the relative genomic organization, gene repertoire, and sequence conservation among three important pathogens. This information will aid the development of diagnostic, therapeutic, and preventative measures, as well as further our understanding of fundamental molecular phenomena.
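
The relationship between the shotgun redundancy levels quoted above (4x and 8x) and the fraction of a clone expected to be covered can be sketched with the standard Poisson (Lander-Waterman) idealization, in which the expected covered fraction at redundancy c is 1 - exp(-c). This is an illustrative calculation only, not the project's own planning method; the function names are hypothetical, and the model assumes uniformly random reads, so it ignores cloning bias and repeats. Under this idealization 4x gives roughly 98 percent, slightly above the approximately 96 percent planned here, which is consistent with real libraries typically falling short of the ideal.

```python
import math


def expected_coverage(fold: float) -> float:
    """Expected fraction of a target covered by at least one read.

    Assumption: reads land uniformly at random (Poisson / Lander-Waterman
    idealization), so the uncovered fraction is exp(-fold).
    """
    if fold < 0:
        raise ValueError("fold redundancy must be non-negative")
    return 1.0 - math.exp(-fold)


def fold_for_coverage(fraction: float) -> float:
    """Fold redundancy at which the idealized model reaches `fraction`."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -math.log(1.0 - fraction)


if __name__ == "__main__":
    # Redundancy levels taken from the text: 4x backbone, 8x finishing.
    for fold in (4, 8):
        print(f"{fold}x -> {expected_coverage(fold):.4f} expected covered fraction")
    # Idealized redundancy corresponding to the 96 percent figure in the text.
    print(f"96% -> {fold_for_coverage(0.96):.2f}x in the idealized model")
```

In this model the step from 4x to 8x closes most of the remaining gaps, which matches the plan to reserve the higher redundancy for clones that are to be sequenced to contiguity.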