Program Director/Principal Investigator (Last, First, Middle): Salleb-Aouissi, Ansaf

Summary

Prediction of Preterm Birth (PTB) has been an exceedingly challenging problem, predominantly due to the inherent complexity of its multifactorial etiology and the lack of approaches capable of integrating and interpreting large multidisciplinary data. It is a major long-lasting public health problem with heavy emotional and financial consequences to families and society [7, 20]. PTB is the leading cause of mortality and long-term disabilities among neonates. Most studies to date have examined individual risk factors through univariate analyses of their coincidence with PTB. Our previous work [NSF EAGER 1454855, 1454814] developed predictive models for PTB based on non-genetic maternal attributes [30, 29]. A particularly challenging population for determining PTB risk is first-time mothers (nulliparous women), due to the lack of prior pregnancy history. An important question is whether factors other than a history of PTB can be used to identify a nulliparous patient at risk.

Specific aims of the original project

Our basic specific aims are as follows:

(1) Longitudinal Preterm Birth Prediction: We will first build a series of accurate prediction models for PTB using the nuMoM2b dataset. Such models will handle the challenges common to medical datasets, including (a) imbalance in the classes, (b) missing data, and (c) disparity in data collection. We will achieve this by designing an objective function for Support Vector Machines that captures and corrects for these issues. Second, by leveraging the availability of patient future data, our Learning Under Privileged Information (LUPI)-based approach [22] will significantly increase the rate of convergence of the algorithms and improve prediction with less data.
Our transformative approach is well suited to medical datasets that are both limited in the number of patients and inherently subject to the challenges mentioned above.

(2) Combining clinical and genetic features for risk prediction: In this aim we tackle questions of causality between the genetic information and its various forms of phenotypic implications by leveraging the phenotypically rich nuMoM2b dataset. We will first apply standard GWAS analysis, both to gain new insight into the changing patterns of genetic association as additional phenotypic data are accumulated and to serve as a baseline. We will then seek to develop improved analysis of the involvement of genetic contributions in PTB.

(3) Clinical and social impact: We plan to assess the effectiveness of the methods in clinical practice by (a) testing the effectiveness of the longitudinal models produced in objectives 1 and 2 on existing clinical data at the New York Presbyterian Hospital, and (b) building a sequential decision-making model; this includes optimizing the scheduling of patient visits and diagnostic testing tailored for different classes of patients.

Specific aims for this NIH Supplement

We will extend the original aims to the following:

(1) Interpreting PTB prediction models: We propose to explore the interpretability of our best-performing models in an effort to understand "PTB mechanisms." Our approach is to probe the most accurate models to assess which risk factors or features are the most important and how they combine to lead to the PTB outcome. We will explore characteristic rules, along with influence functions, to identify the instances that are driving prediction.

(2) NetWAS Analysis for PTB: We will integrate NetWAS analysis into the study of PTB mechanisms and genetics.
We will focus on the uterine expression levels available for automated NetWAS analysis from the uterus and uterine cervix/endometrium, and integrate NetWAS-computed expression levels as features into the prediction models.

(3) Studying dynamic treatment regimes for PTB: We will investigate models that use reinforcement learning and causal reasoning to learn appropriate dynamic treatment policies in the context of confounding bias in the nuMoM2b data. This will in turn affect the scheduling of tests and visits based on the individualized patient profile.

Broader Impact

Over 26 billion dollars are spent annually on the delivery and care of the 12% of infants who are born preterm in the United States. A crucial challenge is to identify women who are at the highest risk for very early preterm birth and to develop interventions. Equally important would be the ability to identify women at the lowest risk, to avoid unnecessary and costly interventions. Our project has the potential to advance knowledge about this long-lasting public health problem. We will recruit female and minority students to this project. Research results will be disseminated through courses, conference and journal papers, and undergraduate research and graduate students' theses.

OMB No. 0925-0001/0002 (Rev. 01/18 Approved Through 03/31/2020) Page 1 Continuation Format Page