ABSTRACT Despite substantial progress in education, outreach, diagnosis, ART and PrEP use, HIV incidence among MSM in the U.S. has not decreased substantially in the last decade, with increases in new infections seen in younger MSM. There is an urgent need for novel approaches to identify the main sources of ongoing transmission. Our project leverages the existence of multiple decentralized HIV databases in Seattle that contain HIV pol genotypes from, in total, >4,000 individuals (>50% of the HIV-infected population), each linked to a range of behavioral, clinical, and epidemiological data (the databases are from Public Health–Seattle and King County, PHSKC; University of Washington HIV Information System, UWHIS; UW Primary Infection Cohort, PIC; and Seattle Children's Hospital, SCH). From these databases, we will create a single, combined pol dataset, which will facilitate novel molecular epidemiological analyses with diverse HIV data types. Our creation of this combined dataset (via processes of de-identification, anonymization, record linkage and genetic analysis) will serve as a model for communities that have comparable decentralized HIV data sources and available HIV genotypes. Subsequently, we will use this dataset for innovative "second generation" phylogenetic studies. First, we will describe HIV transmission patterns among key affected populations, including foreign-born Seattle residents and adolescent MSM. We hypothesize that phylogenetic linkage analyses will indicate that more than 80% of African-born cases, but less than 50% of Latino cases, are imported, with important public health implications. We also hypothesize that phylogenetic analysis of our combined dataset will reveal that adolescent MSM are frequently found in clusters with older MSM, at rates elevated above baseline expectations, indicating that age-discrepant pairs may play a role in driving transmission in this population in Seattle.
Second, in collaboration with Christophe Fraser and Oliver Ratmann (Imperial College London), we will estimate the proportion of transmissions in Seattle arising from each stage of the infection and care continuum. For this cutting-edge analysis of the Seattle dataset, we will extend a data-driven clinical and phylogenetic methodology that was originally used to study MSM transmission in the Netherlands. We hypothesize that >50% of transmissions in Seattle arise from undiagnosed individuals, while transmission from patients not retained in care is less than previously estimated. Seattle is an ideal location to perform this project because the epidemic persists there despite abundant public health resources, active and strong collaborations between academic and governmental communities, and great successes in the HIV care cascade.