PROJECT SUMMARY: This proposal aims to characterize the associations between prenatal exposure to interpretable combinations of air toxics and children's cognitive health through the efficient use of big public health data. With guidance from multidisciplinary advisors, the candidate will develop skills in data science, machine learning and advanced biostatistics to supplement her training in epidemiologic methods. This will allow her to progress in her career and advance research on combined environmental exposures and children's health. Previous research has found associations between prenatal exposure to single air pollutants and children's cognitive health but has lacked the ability to investigate combined impacts of multiple pollutants, including the synergistic/antagonistic interactions between pollutants that have been observed in experimental studies. Understanding the effects of combined exposures is a strategic goal of the National Institute of Environmental Health Sciences, and the field of environmental health is transitioning from single-pollutant approaches to more holistic paradigms, such as the exposome. Identifying associations and interactions within the context of high-dimensional exposure data presents a computational challenge. Methods from domains such as data science, including machine learning methods, can be incorporated into the epidemiologic toolbox for addressing environmental mixtures and multiple exposures. The goal of this Career Development Award is to advance the candidate into an independent research career at the intersection of big data science and children's environmental health. Through formal coursework, directed learning and field rotations, the candidate will gain skills in data science, machine learning and advanced biostatistics. Mentors, advisors and consultants have been selected for their complementary expertise, relevant research experience and mentoring abilities.
The proposed research will leverage the skills gained from the training plan and apply them to characterize associations between prenatal exposure to interpretable combinations of air toxics and third-grade standardized test scores, a school-based measure of cognitive outcomes. Residence at birth will be used to link data on air toxics, a subset of air pollutants, to an administrative data linkage of public health registries and education data for approximately 220,000 children born in New York City from 1994 to 1998. The candidate will develop and validate a two-stage approach of hypothesis generation followed by targeted analyses in order to identify combinations of air toxics associated with children's test scores within the context of high-dimensional exposure data (Aim 1). Targeted analyses using well-established epidemiologic methods for effect estimation and assessment of interaction between air toxics will be performed (Aim 2). Potential mediators of the relationship between air toxics and test scores can then be identified using statistical mediation and data science approaches (Aim 3). Completion of these aims will uniquely position the candidate to conduct future research on combined environmental exposures and children's health.