Computational toxicology has become a critical area of research because of the growing need to evaluate thousands of pharmaceutical and environmental chemicals with unknown toxicity profiles, the high cost in time and resources of current experimental toxicity testing, and growing ethical concerns over animal use in toxicity studies. Despite tremendous efforts, little success has been attained thus far in developing predictive computational models for toxicity, primarily because of the complexity of toxicity mechanisms and the lack of high-quality experimental data for model development. A critical challenge in toxicity testing of chemicals is that toxic effects are dose-dependent: truly toxic compounds may show no toxicity at all at low doses. Therefore, traditional high-throughput screening (HTS), which tests chemicals at only a single concentration, is not suitable for toxicity screening. In contrast, the recently developed quantitative high-throughput screening (qHTS) platforms can evaluate each chemical across a broad range of concentrations and are gaining popularity as tools for in vitro toxicity profiling. The concentration-response information generated by qHTS is expected to provide more accurate and comprehensive information on the toxic effects of chemicals, offering promising data that can be mined to estimate their in vivo toxicities. However, our previous studies showed that, if processed inappropriately, such concentration-response information contributes little to improving toxicity prediction. This is especially true when multiple types of qHTS data are used together. Therefore, in this study, we will extend our previous approaches to develop novel statistical and computational tools that can curate, preprocess, and normalize the concentration-response information from multiple qHTS databases.
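To illustrate why a single-concentration screen can miss a dose-dependent toxicant, the following minimal sketch compares a single-point activity call with a call made across a dilution series. It assumes a Hill-type concentration-response curve, which is a common choice for qHTS data but is not specified in this proposal; the function names, the 30% activity threshold, and the curve parameters are all hypothetical values chosen for illustration.

```python
def hill_response(conc, top, ac50, hill_slope):
    """Percent response at a given concentration under an assumed Hill model."""
    if conc <= 0:
        return 0.0
    return top / (1.0 + (ac50 / conc) ** hill_slope)


def single_point_call(conc, threshold=30.0, **curve):
    """Activity call from one tested concentration (single-concentration HTS)."""
    return hill_response(conc, **curve) >= threshold


def titration_call(concs, threshold=30.0, **curve):
    """Activity call from a concentration series (qHTS-style titration)."""
    return any(hill_response(c, **curve) >= threshold for c in concs)


# Hypothetical compound: fully efficacious, half-maximal activity at 20 uM.
curve = dict(top=100.0, ac50=20.0, hill_slope=1.5)

# Illustrative eight-point, three-fold dilution series starting at 0.1 uM.
concs = [0.1 * 3 ** i for i in range(8)]

missed_at_low_dose = not single_point_call(1.0, **curve)
found_by_titration = titration_call(concs, **curve)
```

In this toy setting the compound is called inactive when tested only at 1 uM but active once the higher concentrations of the series are included, which is the behavior the paragraph above describes.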
Traditionally, toxicity models are based on either chemical data (as in quantitative structure-activity relationship analysis) or in vitro toxicity profiling data (as in in vitro-in vivo extrapolation). Our previous experience suggested that integrating biological descriptors, such as in vitro cytotoxicity profiles or short-term toxicogenomic data, with chemical structural features can predict rodent acute liver toxicity with reasonable accuracy. Therefore, the second part of this proposal will be devoted to developing novel computational models for hepatotoxicity prediction by integrating qHTS toxicity profiles and chemical structural information. In Aim 1, we will curate, preprocess, and normalize publicly available liver toxicity datasets. In this study, we will model toxicity effects using multiple large public datasets, such as HTS and qHTS bioassay data (Tox21[1] and ToxCast[2]), hepatotoxicity side effect reports on marketed and failed drugs[3], the Liver Toxicity Knowledge Base Benchmark Dataset (LTKB-BD[4]), etc. Statistical methods for cross-study validation and quality control will be applied to the collected datasets to ensure computational compatibility and to select the appropriate datasets for analysis. In Aim 2, we will develop predictive models of chemicals' liver toxicity based on an integrative modeling workflow that makes use of both the structural features and the in vitro toxicity profiles of a chemical. Our previous studies [5] showed that models using both in vitro toxicity profiles and chemical structural data have better accuracy for rodent acute liver toxicity than models using either data type alone. Here, we will develop a novel modeling workflow that starts by defining functional clusters of chemicals from curated qHTS toxicity profiles and then develops computational models that correlate chemical and biological data with overall toxicity risks in humans.
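As a rough illustration of what combining the two data types can look like in practice, the sketch below scales chemical descriptors and in vitro assay readouts to a common range and concatenates them into one hybrid descriptor vector per compound. This is only one simple way to merge descriptor blocks, assumed here for illustration; it is not the workflow proposed in Aim 2, and the function names and example values are hypothetical.

```python
def scale_columns(rows):
    """Min-max scale each column of a list of rows to the range [0, 1]."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_cols.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def hybrid_descriptors(chem_rows, bio_rows):
    """Concatenate scaled chemical and biological descriptors per compound."""
    if len(chem_rows) != len(bio_rows):
        raise ValueError("chemical and biological blocks must cover the same compounds")
    chem_scaled = scale_columns(chem_rows)
    bio_scaled = scale_columns(bio_rows)
    return [c + b for c, b in zip(chem_scaled, bio_scaled)]


# Hypothetical data for three compounds.
# Chemical block: two substructure flags and a molecular weight.
chem = [[0, 1, 250.0], [1, 0, 310.0], [1, 1, 180.0]]
# Biological block: percent activity in two in vitro assays.
bio = [[5.0, 40.0], [60.0, 80.0], [15.0, 0.0]]

hybrid = hybrid_descriptors(chem, bio)
```

Scaling each block before concatenation keeps a descriptor with a large numeric range, such as molecular weight, from dominating distance-based methods applied to the combined vectors.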
The predictive models will be validated using independent datasets comprising over 800 compounds. In Aim 3, we propose to prioritize the qHTS profiling assays used in the model for future toxicity testing. We will evaluate all the in vitro assays as biological descriptors from three perspectives: descriptor importance in the integrative toxicity model, correlation with in vivo drug-induced liver injury (DILI) outcomes, and level of information content, estimated by a novel approach based on network analysis.
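One of the three perspectives, correlation with in vivo outcomes, can be sketched as a simple ranking. The example below scores each assay by the absolute Pearson correlation between its readout and a binary DILI label. The choice of Pearson correlation, the assay names, and the data are assumptions made for illustration and do not represent the prioritization method or results of this proposal.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_assays(assay_profiles, outcomes):
    """Sort assays by absolute correlation with the outcome, strongest first."""
    scores = {
        name: abs(pearson(values, outcomes))
        for name, values in assay_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical binary DILI labels for six compounds (1 = DILI positive).
outcomes = [1, 1, 0, 0, 1, 0]

# Hypothetical percent-activity readouts for three assays.
assay_profiles = {
    "assay_A": [80, 70, 10, 5, 90, 20],   # tracks the outcome closely
    "assay_B": [30, 35, 32, 31, 29, 33],  # weakly related
    "assay_C": [50, 50, 50, 50, 50, 50],  # constant, uninformative
}

ranking = rank_assays(assay_profiles, outcomes)
```

A ranking of this kind covers only one perspective; descriptor importance in the model and the network-based information content would need to be computed separately and combined with it.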