Clostridium difficile infection (CDI) is frequently attributed to healthcare exposure and is associated with significant morbidity and mortality. While it is known that certain C. difficile strains and environmental factors such as a compromised gut microbiota are associated with greater virulence and poor prognosis, the full spectrum of CDI risk factors remains elusive. It is estimated that 20-30% of CDI cases result from transmission between infected patients, but given the paucity of data on asymptomatic colonized C. difficile carriers, it is unlikely that these cases represent all transmission events. For patients with prior colonization, the extent to which colonization predisposes them to develop CDI is not fully understood. Our objective is to address these and other knowledge deficits in CDI transmission and pathogenesis in a prospective study of two high-risk patient cohorts at the Mount Sinai Hospital: patients admitted to intensive care units and liver/kidney transplant recipients. For patients in each cohort, we will obtain fecal samples on admission and at regular intervals during hospitalization. Pre- and post-infection samples from 325 patients who develop CDI, as well as from 650 time-matched controls, will be analyzed in unprecedented detail using high-throughput sequencing and screening approaches to characterize colonizing and infectious C. difficile isolates and the associated gut microbiome. Key to our approach are novel long-read genome sequencing technologies that enable rapid, cost-efficient whole-genome assembly of C. difficile strains from fecal samples of CDI patients. Based on the variants identified between these genomes, we will construct a low-cost screening assay that detects distinct C. difficile strains, even when present at very low abundance, across our entire set of fecal samples. This will allow us to differentiate community-acquired from hospital-acquired infections and to comprehensively map C. difficile transmission networks.
Concomitantly, we will evaluate changes in the microbiome by performing deep metagenomic sequencing of fecal samples from CDI patients at the time of admission, as well as 16S sequencing of samples from multiple time points prior to infection. The resulting data constitute a large longitudinal sample of the microbiota with which to evaluate CDI progression in response to treatments such as antibiotics. We will integrate our high-resolution map of colonizing and infectious C. difficile isolates, and their corresponding microbiota background, with data from patient electronic health records to identify CDI risk factors and probe the impact of C. difficile genomic variation on disease. In addition to addressing major outstanding questions regarding the onset of CDI in healthcare settings, our project will deliver the most accurate predictive models of CDI to date. Moreover, the resulting dataset will be a substantial genomic resource for the community in understanding the dynamics between C. difficile and the overall microbiome within patients. We anticipate that our findings will have a major impact on treatment and infection control practices, which will ultimately reduce CDI rates.