Type 2 diabetes (T2D) and cardiovascular disease (CVD) are among the leading causes of morbidity and mortality in US Veterans, as well as in the US population at large. T2D is a widely recognized risk factor for CVD, and T2D leads to worse CVD outcomes. However, there remains considerable clinical heterogeneity among individuals with T2D. Even among individuals with apparently similar glycemic control, there is significant variability with respect to who will develop CVD. To develop more effective strategies to prevent CVD in this high-risk population, better approaches for quantifying CVD risk are needed. Using novel computational approaches, we will consider dense phenotype and genotype data to identify the subpopulations of individuals with T2D who are at the highest risk of heart and vascular disease. In Aim 1, we will test the relationship between traditional CVD risk factors, such as cholesterol, blood pressure, and smoking, and three heart and vascular disease phenotypes: peripheral artery disease (PAD), coronary heart disease (CHD), and cerebrovascular disease. To account for the fact that these outcomes frequently occur in the same individuals, statistical models that treat the traits as correlated within-person outcomes will be used. To determine whether the addition of genetic information improves the prediction of CVD outcomes, genetic risk scores based on preliminary studies from the VA Million Veteran Program and other published work will be added to the models and their impact assessed. In Aim 2, dense phenotype data will be extracted from the electronic health record, and novel artificial intelligence-based biclustering algorithms will be used to identify hidden subtypes of T2D. The association of these subtypes with CVD outcomes will then be assessed. In Aim 3, a similar approach will be taken to identify T2D subtypes based on DNA variants known to associate with T2D, CVD, and their risk factors. Finally, the genetic and phenotypic data will be jointly considered.
These approaches will be applied to data from both US Veterans (the Veterans Aging Cohort Study and the VA population at large, via the Corporate Data Warehouse) and non-Veterans (the PennMedicine BioBank, Penn Data Store, and UK Biobank). Successful completion of this project will help to elucidate the phenotype structure of T2D and identify individuals with T2D who are at the highest risk of CVD. These results will lay the groundwork for developing tailored strategies for CVD prevention in T2D and help realize the promise of precision medicine for heart and vascular disease.