Data Science Core ABSTRACT
The objective of the Texas A&M Superfund Research Center is to develop descriptive models and tools that predict the hazardous outcomes of chemical exposure during environmental emergencies, and to produce solutions that mitigate the negative effects on human health. The ultimate goal of the Center is to strengthen decision-making capabilities for planning and control in emergency environmental contamination events. The Data Science Core is an essential component of the Center and will contribute to these goals by supporting the work of four Research Projects. The projects will produce high-dimensional data that require comprehensive analysis and expertise in state-of-the-art data science methodologies to translate raw experimental data into actionable insights and predictive models. Directed by Dr. Christodoulos A. Floudas, in collaboration with Co-investigator Dr. Fred A. Wright, the Data Science Core will provide methods and services to Center researchers under three specific aims: (i) sharing expertise and providing support through advanced methodologies in data science and statistics; (ii) developing novel, high-performance methods for simultaneous regression or classification with dimensionality reduction and data integration; and (iii) constructing and maintaining a computational platform that enables collaboration across the Center and facilitates dissemination of knowledge to the wider community and key stakeholders. Research Project 1 will characterize exposure pathways of contaminated sediments that are vulnerable to movement and redeposition due to storm activity; the Data Science Core will provide services for experimental design, hypothesis testing, and regression for contaminated sediment binding experiments.
Project 2 will study the mitigation of adverse health effects of chemicals through broad-acting sorption materials; the Data Science Core will model sorption activity predictively, using advanced regression with simultaneous dimensionality reduction and nonlinear kernels, to guide experimental design and material property identification. Project 3 will investigate the inter-tissue and inter-individual variability in response to complex environmental mixtures; the Data Science Core will apply composite classification and clustering strategies to characterize chemical mixtures. Project 4 will develop single-cell, high-throughput platforms to quantify the endocrine disruptor potential of environmental contaminants and mixtures; the Data Science Core will aid in predicting the activity of multiple endocrine receptors through the construction and reduction of predictive models. Furthermore, the Data Science Core will maximize productivity within the Center by establishing an environment for data sharing and collaboration through a computational platform service. The platform will also disseminate the results of the Center, including access to the final high-performance predictive models and tools, by providing interactive interfaces suitable for use by the scientific community.