The unexpectedly high diversity and rate of evolution of UPEC contribute to the difficulty of understanding the molecular mechanisms underlying recurrent UTI. Advances in sequencing, algorithm development, and computing power provide an opportunity to elucidate the genetic and molecular basis of recurrent UTI using comparative genomics. The work proposed in Project 3 for the next funding period is designed to capitalize on the complementary expertise and biospecimen resources available in this SCCOR program. Gut- and urine-associated E. coli strains recovered from three patients with recurrent UTI, enrolled in the clinical trial described in Project 2, will be selected and sequenced using the highly parallel 454 pyrosequencer. The sequence data will be used to examine E. coli evolution within a given individual, in two host habitats, over the course of three separate episodes of infection, and between different individuals. Follow-up studies will involve (i) quantitation of gene prevalence in isolates recovered from the two ecosystems (using new comparative pan-genomic methods and PCR-based sequence surveys); (ii) resequencing of genes statistically enriched in urine isolates to determine, based on maximum likelihood and parsimony algorithms, whether they are under positive selection in UPEC strains; and (iii) functional annotation of these genes. Genes that (i) are more prevalent in urine isolates than in fecal isolates within individual patients; (ii) are more prevalent in UPEC strains than in non-UPEC strains in a broad panel of clinical isolates; (iii) are under positive selection in UPEC strains; and (iv) have functional annotations suggestive of a role in pathogenesis will be tested in a mouse model of UTI in collaboration with Project 1.
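
To illustrate what "statistically enriched in urine isolates" could mean in practice, the following is a minimal sketch that assumes a one-sided Fisher's exact test on presence/absence counts of a single gene. The proposal does not name the statistic to be used, so the choice of test, the function name `enrichment_p_value`, and the isolate counts are all hypothetical.

```python
from math import comb


def enrichment_p_value(urine_with, urine_total, fecal_with, fecal_total):
    """One-sided Fisher's exact test for enrichment of a gene in urine isolates.

    Hypothetical helper (not from the proposal). Returns the probability of
    seeing at least `urine_with` gene-positive urine isolates if gene carriage
    were independent of habitat (hypergeometric null).
    """
    total = urine_total + fecal_total        # all isolates
    positives = urine_with + fecal_with      # isolates carrying the gene
    upper = min(urine_total, positives)      # largest possible urine count
    denom = comb(total, urine_total)
    tail = sum(
        comb(positives, k) * comb(total - positives, urine_total - k)
        for k in range(urine_with, upper + 1)
    )
    return tail / denom


# Illustrative counts only: gene found in 8 of 10 urine and 1 of 10 fecal isolates.
p = enrichment_p_value(8, 10, 1, 10)
print(f"one-sided p = {p:.4g}")
```

A small p-value would flag the gene as a candidate for the resequencing step. In a genome-wide screen the p-values would also need correction for multiple testing, which this sketch omits.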