Summary. Suicide in youths is a growing health concern, yet current clinical practice falls short of identifying youths at risk for suicide attempt (SA) in a timely manner. The overarching aim of this research is to use data-driven machine learning methods to facilitate primary prevention of youth SAs in primary care pediatric settings. Clinical guidelines recommend screening for depression, considered a proxy for suicide risk, from age 12 in pediatric settings. The proposed study aims to identify variables (features) that can be collected by early adolescence and that contribute to prediction of SA in later adolescence. This study will leverage the effort invested in two previous projects: a study using electronic health records (EHR) to predict SAs and deaths in University of Pittsburgh Medical Center (UPMC) hospitals; and the Philadelphia Neurodevelopmental Cohort (PNC), which included comprehensive phenotyping of ~9,500 youths. These previous efforts will be integrated to develop and optimize SA prediction in youths from the Children's Hospital of Philadelphia (CHOP) network, from which we have data on ~40,000 youths who were screened for a history of SA between 2014 and 2018 (n~1,500). First, we will generate predictive models based on UPMC data, test their predictive validity in the CHOP youth population, and then develop, optimize, and cross-validate these predictive models using CHOP EHR data as a training set (Aim 1). Second, in the PNC dataset, we will use multiple data types (demographic, behavioral, cognitive, imaging) to classify youths with suicide ideation (SI, n~750) and identify features (potentially modifiable) that are indicative of SI and may also point to potential mechanisms underlying youth SI (Aim 2).
Lastly, in a subset of 936 youths (49 with SAs) who have both CHOP EHR data and a research PNC evaluation conducted at mean age 11 (T1), ~5 years before SA screening (T2), we will test the validity of the models from Aims 1 and 2 and aim to identify data features collected at T1 that can improve on the prediction of SAs that relies solely on EHR data (Aim 3). The proposed study relies on the expertise of a highly capable multidisciplinary team composed of Dr. Barzilay (PI), a child and adolescent psychiatrist experienced in suicide research and in the analysis of suicide-related phenotypes in PNC data; Dr. Tsui (PI), an expert in machine learning who has developed predictive algorithms of SA and deaths using UPMC data; and collaborators critical for meeting the study aims: Dr. Raquel Gur, the lead researcher who established the PNC; Dr. Ruben Gur, who developed the PNC neurocognitive assessment tools; and Dr. Oquendo, who will provide expertise in suicide prediction research. The team's access to and familiarity with CHOP EHR and PNC data resources, coupled with its interdisciplinary expertise, create a unique opportunity to identify childhood features that can optimize later adolescent SA prediction. Expected findings can ultimately translate to real-world clinical practice, be integrated into the EHR, and help flag youths at risk for an SA in a pediatric setting, allowing timely identification and intervention and contributing to the mission of reducing suicide in youth.