High-throughput omics analysis platforms provide important new technical capabilities for analyzing complex biological fluids for disease markers. A particularly important platform for metabolomics is two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS), which provides both high precision and extended capacity for molecular detection. However, data analysis methods for this platform have not been developed for high sample numbers or for the very large numbers of analytes that this technology detects (more than 1500 per biofluid sample). We will develop an informatics platform for accurate and efficient analysis of GCxGC/TOF-MS data through the following specific aims: (1) Mass informatics for differential metabolomics, (2) Interactive visual analysis of metabolite correlation networks, and (3) Informatics analysis tools integration. The biological information that our system will deliver is the regulation of altered metabolites that relate to phenotypic differences between groups of samples. This information will enable assessment of human health and wellness, and will have a direct impact on public health. In the context of existing collaborations, our initial large-scale application focus will be analyses of metabolomic profiling data from plasma samples of patients with breast and prostate cancer compared with control subjects. Broadly speaking, our informatics tools will facilitate efforts in the metabolomics community to perform comparative metabolite profiling with high precision and high volume on the powerful GCxGC/TOF-MS platform. The project consists of the development of algorithms for assessment of metabolite identification accuracy, alignment of spectra, and normalization of datasets.
Separate algorithms will be developed to identify specific molecules that differ between sample cohorts and to provide a statistical significance context for each differentially expressed molecule. An additional group of algorithms will be developed to enable interactive visual analysis of metabolite correlation networks. Deliverables from this project will include independent software modules that can be remotely invoked via web services and an integrated web-based data analysis pipeline for GCxGC/TOF-MS data.
PUBLIC HEALTH RELEVANCE: High-throughput molecular omics analysis platforms provide important new technical capabilities for analyzing complex biological fluids for disease markers. A particularly important platform for metabolomics is two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS), which provides both high precision and extended capacity for molecular detection. This approach usually uses a short polar column after the main analytical column. Typically, the second column is operated at a higher temperature than the first column. The metabolites that co-elute from the first GC column are further separated in the second column because of the differences in column temperature and chromatography matrix. The further separated metabolites are directed to a high-capacity time-of-flight mass spectrometry system for detection. Two-dimensional gas chromatography offers significant advantages for analysis of complex samples, including an order-of-magnitude increase in separation capacity, a significant increase in signal-to-noise ratio and dynamic range, and improved mass spectral deconvolution and similarity matches. Because GCxGC/TOF-MS can provide more, and more accurate, information, it represents a powerful tool for the analysis of small-molecule metabolites in complex biological systems.
However, data analysis methods for this platform have not been developed for high sample numbers or for the very large numbers of analytes that this technology detects (more than 1500 per biofluid sample). We will develop a set of computational algorithms for accurate and efficient analysis of GCxGC/TOF-MS data through the following specific aims: (1) Mass informatics for differential metabolomics, (2) Interactive visual analysis of metabolite correlation networks, and (3) Informatics analysis tools integration. The biological information that our system will deliver is the regulation of altered metabolites that relate to phenotypic differences between groups of samples. This information will enable assessment of human health and wellness, and will have a direct impact on public health. A direct application of this type of information is metabolite biomarker discovery in clinical biomedical research. In the context of existing collaborations, the initial large-scale application of the developed mass informatics system will be the analysis of metabolomic profiling data from human plasma samples of patients with breast cancer and multiple myeloma, compared with corresponding control subjects. With the raw instrument data as input, our informatics tools will perform a series of data mining steps to discover the metabolites present in each sample and to provide their regulation information. These informatics tools will facilitate efforts in the metabolomics community to perform comparative metabolite profiling with high precision and high volume on the powerful GCxGC/TOF-MS platform. The project consists of the development of algorithms for assessment of metabolite identification accuracy, alignment of spectra, and normalization of datasets.
Separate algorithms will be developed to identify specific molecules that differ between sample cohorts and to provide a statistical significance context for each differentially expressed molecule. An additional group of algorithms will be developed to enable interactive visual analysis of metabolite correlation networks. Deliverables from this project will include independent software modules that can be remotely invoked via web services and an integrated web-based data analysis pipeline for GCxGC/TOF-MS data.
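
To make the two analysis steps named above more concrete, the following is a minimal sketch, not the project's actual algorithms, which the abstract does not specify. It assumes that peak areas have already been aligned and normalized into one abundance profile per metabolite, uses a permutation test as a stand-in for the unspecified significance test, and uses a thresholded Pearson correlation to build network edges. All function names, the threshold, and the example values are hypothetical. With more than 1500 analytes per sample, a multiple-testing correction would also be needed; it is omitted here.

```python
import math
import random
from itertools import combinations


def permutation_p_value(cases, controls, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Assumption: stands in for whatever significance test the project adopts.
    """
    rng = random.Random(seed)
    observed = abs(sum(cases) / len(cases) - sum(controls) / len(controls))
    pooled = list(cases) + list(controls)
    k = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of samples to groups
        diff = abs(sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sx * sy)


def correlation_network(profiles, threshold=0.8):
    """Return edges (a, b, r) for metabolite pairs with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Hypothetical normalized abundances across six samples.
profiles = {
    "alanine": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "glycine": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "lactate": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}
edges = correlation_network(profiles, threshold=0.8)
p_value = permutation_p_value([5.1, 5.3, 4.9, 5.2, 5.0],
                              [3.0, 3.2, 2.9, 3.1, 3.3])
```

In this sketch each edge list could feed an interactive network view, and the per-metabolite p-values could supply the statistical significance context for molecules that differ between cohorts.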