The central goal of patient-centered medical care is to tailor treatment agreements toward individual patient risks, benefits, and preferences. This, however, is not possible without the ability to accurately predict an individual patient's risk of developing major medical illness. While the use of multivariable risk/benefit prediction tools to individually tailor treatments could greatly increase treatment precision, tools to facilitate such improvements in VA are lacking. For cardio- and cerebrovascular (CCV) disease, the leading cause of morbidity and mortality in the US, current risk prediction tools have substantial shortcomings: they require manual entry of risk factor information, were developed on and calibrated to patient populations quite different from those served by VHA, do not use new data-mining techniques, and do not use the full spectrum of clinical data available in VA's electronic medical record (VA EMR). This project will focus on primary cardiovascular prevention in Veterans. We will develop a VA-based risk prediction score and, using that score, novel clinical algorithms to tailor clinical decision-making and risk/benefit communication to individual Veterans. We propose a 3.5-year project using 10 years of centrally available VA EMR data (2001 through 2010), supplemented by the National Death Index, chart review, Office of Quality and Performance data, and non-VA cohort data, to develop and validate the Veterans Affairs Risk Score (VARS). Our study has two specific aims: 1. To develop and assess two competing approaches to developing VA EMR-derived CCV risk prediction tools, using standard regression (REG) models and machine learning (ML) models. 2. To compare the accuracy and clinical impact of these VA EMR-derived CCV risk prediction tools (the REG and ML models) to each other and to commonly used risk prediction models developed outside of VA, such as the Framingham and Euro SCORE risk tools.
In addition to traditional measures of discrimination (such as the C statistic), potential improvements in clinical decision-making and patient outcomes using VARS will be assessed using reclassification analysis and the development of patient-based clinical decision analyses. Our methods will use national VA data to create a 10-year longitudinal cohort to develop the VA-specific CCV risk tool. We will extract laboratory and pharmacy data from the DSS National Data Extracts; data about outpatient visits, inpatient use, ICD-9 codes, and CPT codes from the SAS Medical Datasets; clinical measures from the Corporate Data Warehouse; and cause of death from the National Death Index. The REG models will be developed using Weibull survival analysis, and the ML models will primarily use random forest methods (Aim 1). The models will be validated with VA-CMS datasets and National Death Index data. The data will be augmented with results from natural language processing tools. Data quality will be assessed with chart review and data from the Survey on Health Care Experiences of Patients (SHEP) and the Atherosclerosis Risk in Communities (ARIC) study. Aim 2 will use modern validation techniques, including risk reclassification analysis, to assess the reliability and accuracy of the new scores. We will also develop patient-based clinical decision analyses that will assess the risks and benefits of decisions. This work is the prerequisite research needed to develop automated tools, guidelines, and quality assessments that can be integrated into the VA EMR or a web-based interface (such as MyHealtheVet), helping clinicians and patients to optimize and personalize CCV risk reduction treatment decisions in the outpatient setting.
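
To make the two evaluation measures named above concrete, the following is a minimal Python sketch of a C statistic and a categorical net reclassification index for comparing an existing risk score with a new one. It is an illustration only and not the project's analysis code: it treats the outcome as a simple binary event and ignores censoring, and the function names, the example risks, and the 10% and 20% category cutoffs are all hypothetical assumptions introduced here.

```python
from bisect import bisect_right


def c_statistic(risk, event):
    """Share of (event, non-event) patient pairs in which the patient who
    had the event was given the higher predicted risk; ties count as 0.5.
    Simplified binary-outcome version that ignores censoring."""
    cases = [r for r, e in zip(risk, event) if e]
    controls = [r for r, e in zip(risk, event) if not e]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                score += 1.0
            elif c == n:
                score += 0.5
    return score / (len(cases) * len(controls))


def risk_category(p, cutoffs):
    """Index of the risk category containing p, for ascending cutoffs."""
    return bisect_right(cutoffs, p)


def net_reclassification(old_risk, new_risk, event, cutoffs):
    """Categorical net reclassification index of new_risk versus old_risk.
    Events should move to higher categories, non-events to lower ones."""
    up_e = down_e = up_n = down_n = n_e = n_n = 0
    for old, new, e in zip(old_risk, new_risk, event):
        move = risk_category(new, cutoffs) - risk_category(old, cutoffs)
        if e:
            n_e += 1
            up_e += move > 0
            down_e += move < 0
        else:
            n_n += 1
            up_n += move > 0
            down_n += move < 0
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Hypothetical toy data: 1 = CCV event observed, 0 = no event.
event = [1, 1, 1, 0, 0, 0]
old_risk = [0.25, 0.08, 0.15, 0.12, 0.05, 0.22]  # e.g. an existing score
new_risk = [0.30, 0.12, 0.15, 0.08, 0.04, 0.15]  # e.g. a candidate score
cutoffs = [0.10, 0.20]  # assumed low / intermediate / high boundaries

print("C statistic, old score:", c_statistic(old_risk, event))
print("C statistic, new score:", c_statistic(new_risk, event))
print("Net reclassification:",
      net_reclassification(old_risk, new_risk, event, cutoffs))
```

A positive net reclassification index means that, on balance, the new score moves patients who had events into higher risk categories and patients who did not into lower ones. A survival-based analysis would replace both functions with censoring-aware versions.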