Project Summary
Cardiovascular disease negatively affects millions of people worldwide. Globally, it accounts for approximately thirty percent of all deaths. Furthermore, a significant fraction of deaths caused by cardiovascular disease occur in a non-geriatric population; fifteen percent of all worldwide deaths are attributed to cardiovascular disease for people under the age of seventy. Treatment to prevent cardiovascular events should be based on highly individualized risk prediction. High-risk patients should receive more aggressive treatment because the risk of disease outweighs the burden of treatment, while low-risk patients should be managed more conservatively. For example, anti-thrombotic therapy for coronary heart disease may increase bleeding risk and may not be appropriate for low-risk patients. Two primary kinds of cardiovascular disease are stroke and coronary heart disease, and there have been a number of developments in risk scores for both. However, these risk scores use only a small fraction of the available measurements about a patient, and they treat risk as a collection of independent factors rather than considering how interactions among those factors amplify or attenuate risk. Moreover, a majority of the popular coronary heart disease and stroke risk scores are designed to be computed manually by a busy physician at the point of care, which further limits their scope and fidelity. Next-generation risk scores for stroke and cardiovascular disease should take into account all of the available information in the electronic health record, without the constraints imposed by the parametric assumptions of traditional risk modeling. More accurate risk assessment of coronary heart disease and stroke will lead to better care and reduce the cardiovascular disease burden.
Our vision is to capitalize on large collections of electronic health records, along with recent advances in deep learning, to build risk scores that use more of the available health information while making minimal mathematical assumptions about the nature of clinical risk. Our proposal moves the field from human-computable calculations over independent risk factors, necessitated by earlier technological limitations, to calculations that use deep learning to capture highly nonlinear risks and risk-factor interactions. We will additionally demonstrate how deep learning can be used to deal with the ever-present issue of missing values in medicine. Our proposal also targets an area underexplored by previous work on risk scores: fairness. Treatment quality depends on the quality of risk estimation, so populations for which estimated risk is less accurate may receive worse care. Risk scores developed with simple models may capture risk accurately only for the majority population, because simple models are not flexible enough to cover multiple populations. We seek to identify potential differences in risk calculation with respect to race and ethnicity. We will construct and evaluate deep learning methods for coronary heart disease and stroke risk assessment from electronic health records. We will develop techniques to incorporate clinical text, handle missing data, and evaluate the fairness of deep learning for cardiovascular risk scores. Finally, we will disseminate our work as open-source code written in deep learning frameworks, in presentations at clinical conferences, and in publications.