The protozoan parasite Toxoplasma gondii is a major cause of congenital birth defects, a prominent opportunistic infection associated with AIDS, and an important veterinary pathogen. Beyond its direct clinical importance, the experimental accessibility of this parasite makes it an outstanding model for studying related species, including Plasmodium (the causative agent of malaria), Cryptosporidium (a devastating AIDS pathogen for which no effective therapy is available), and numerous veterinary pathogens. In addition, T. gondii has proved useful for studies on the origin, diversity, and evolution of eukaryotic cells and organelles. Because the 80 Mb size of the haploid T. gondii genome makes generating finished sequence for the entire genome prohibitively expensive, this application proposes to jump-start gene discovery for drug and vaccine design by shotgun sequencing of the entire genome to more than 3X average coverage. Previous experience with T. gondii genomic (and cDNA) sequences has revealed no difficulties in sequencing this organism's DNA, and reconstructions suggest that the proposed level of coverage should be sufficient to identify more than 95 percent of parasite genes. The B7 strain (a recent clonal isolate of the type II parasites most commonly associated with AIDS) will be used for this study. This isolate has been shown to be capable of traversing the entire parasite life cycle. Plasmid and BAC libraries with various insert sizes (2 kb, 10 kb, and 80-100 kb) will be employed to facilitate contig linking in the final sequence assembly.
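As a rough sanity check on the coverage figure, and not a reproduction of the reconstructions mentioned above, the idealized Lander-Waterman model can be used. It assumes reads start at uniformly random positions, so the expected fraction of bases covered by at least one read at average coverage c is 1 - e^(-c). The sketch below applies this to the stated 80 Mb genome at 3X. The 500 bp read length and the function names are hypothetical, chosen only for illustration.

```python
import math

def fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read
    under the idealized Lander-Waterman (Poisson) model."""
    return 1.0 - math.exp(-coverage)

def reads_needed(genome_size_bp: int, coverage: float, read_length_bp: int) -> int:
    """Number of reads required to reach the given average coverage."""
    return math.ceil(genome_size_bp * coverage / read_length_bp)

GENOME_SIZE_BP = 80_000_000  # haploid genome size stated in the abstract
COVERAGE = 3.0               # proposed average coverage (lower bound)
READ_LENGTH_BP = 500         # ASSUMPTION: hypothetical read length, not from the abstract

frac = fraction_covered(COVERAGE)
n_reads = reads_needed(GENOME_SIZE_BP, COVERAGE, READ_LENGTH_BP)

print(f"Expected fraction of bases covered: {frac:.4f}")
print(f"Reads needed at {READ_LENGTH_BP} bp per read: {n_reads}")
```

Under these assumptions about 95 percent of bases are expected to be sampled at least once. This is a per-base estimate. The fraction of genes identified depends additionally on gene length and on how much of a gene must be sequenced to recognize it, so the two figures are related but not equivalent.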
DNA and predicted gene and protein sequences will be made accessible to the community for web-based data download, BLAST and motif queries, genome browsing, and relational searches. These searches will be based on predicted gene structure, protein motifs, and metabolic role; chromosomal location (when known); text-based queries of pre-computed similarity searches against GenBank and other databases; and relationships to additional apicomplexan parasites and other taxa. Gene models will be generated using algorithms trained on validated T. gondii splice junctions, determined by comparison with existing EST sequences and by RT-PCR analysis. The worldwide T. gondii research community is solidly behind this effort, and the history of this group (in supporting and successfully exploiting the T. gondii EST project, for example) bodes well for successful utilization of the genomic sequence.