Computational Core - Project Summary

Compound identification, particularly for LC-MS-based metabolomics, has mostly been viewed as a challenge that must be tackled empirically. An alternative to exclusively experimental identification strategies is to use computational approaches to aid interpretation of MS and MS/MS data. Computational interpretation of metabolite fragmentation under collision-induced dissociation (CID) conditions is improving, both through progress in generating in-silico MS/MS libraries and through library search strategies that better contend with the less predictable nature and lower information content of small-molecule fragmentation. To make headway on identifying unknown compounds in existing and future metabolomics data sets, and to improve the throughput and reduce the cost of future compound identification efforts, it is essential to use these tools alongside experimental approaches.

A second challenge contributing to the high proportion of unidentified features in untargeted metabolomics data is the abundance of redundant (degenerate) features in electrospray ionization mass spectrometry data, which include isotopes, in-source fragments, and adducts. Existing tools are insufficient to contend with the complexity of fragments and adducts, both predicted and unknown, that have been demonstrated to occur in electrospray ionization mass spectrometry. A more useful tool would also facilitate analyst-aided interpretation of feature redundancy, an important step in the careful, systematic compound identification workflow to be performed by MCIDC.
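As an illustration of the kind of feature redundancy described above, the following minimal Python sketch flags co-eluting features whose m/z differences match a few well-known isotope, adduct, and neutral-loss spacings. It is not the Binner algorithm. The Feature class, the annotate_redundant function, the tolerance values, and the example feature list (loosely modeled on protonated glucose) are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass

# Well-known m/z spacings (Da) for singly charged positive-mode ions,
# expressed relative to a primary feature. This short table is an
# illustrative assumption, not an exhaustive or project-specific list.
KNOWN_DELTAS = {
    "13C isotope (M+1)": 1.003355,
    "[M+Na]+ relative to [M+H]+": 21.981944,
    "[M+NH4]+ relative to [M+H]+": 17.026549,
    "water loss (in-source fragment)": -18.010565,
}


@dataclass
class Feature:
    mz: float  # mass-to-charge ratio
    rt: float  # retention time in minutes


def annotate_redundant(features, mz_tol=0.005, rt_tol=0.05):
    """Return (i, j, label) tuples where feature j co-elutes with feature i
    and differs from it in m/z by one of the known spacings."""
    hits = []
    for i, a in enumerate(features):
        for j, b in enumerate(features):
            # Only compare distinct features that elute together.
            if i == j or abs(a.rt - b.rt) > rt_tol:
                continue
            for label, delta in KNOWN_DELTAS.items():
                if abs((b.mz - a.mz) - delta) <= mz_tol:
                    hits.append((i, j, label))
    return hits


# Hypothetical feature list: one primary ion, three redundant features
# at the same retention time, and one unrelated feature eluting later.
features = [
    Feature(181.0707, 2.10),  # putative [M+H]+
    Feature(182.0741, 2.10),  # 13C isotope
    Feature(203.0526, 2.11),  # sodium adduct
    Feature(163.0601, 2.10),  # water-loss fragment
    Feature(181.0707, 5.00),  # same m/z, different retention time
]

for i, j, label in annotate_redundant(features):
    print(f"feature {j} explained by feature {i}: {label}")
```

A fixed table of spacings only covers predicted relationships. Handling the unknown fragments and adducts mentioned above would require additional evidence, such as peak-shape correlation across samples, which this sketch does not attempt.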
Operating in coordination with the Administrative and Experimental Cores of MCIDC and with the Common Fund Metabolomics Consortium, the MCIDC Computational Core will help address major challenges in the field of untargeted metabolomics by carrying out the following Specific Aims.

First, we will develop and apply a novel software tool, Binner, to reduce the degeneracy of features in untargeted metabolomics data. Effective use of Binner will allow us to focus identification efforts on primary features, while allowing degenerate features to be indexed as such in metabolite spectral databases and removed more rapidly from future data sets.

Next, we will implement a novel probabilistic tandem mass spectral search strategy for small-molecule metabolites, including a "Hybrid Search" approach. Our approach will allow detection of common structural motifs in unknown metabolites and aid in determining their identity. Working with the Experimental Core, we will validate and refine this scoring algorithm using spectra of known metabolites from biological data.
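The general idea behind a hybrid search can be sketched as follows: a library fragment may match a query peak either directly or after being shifted by the precursor mass difference between the query and the library compound, so that a shared structural motif can still be detected when the unknown carries a modification. The Python below is a simplified illustration of that matching step only, not the probabilistic scoring proposed here. The function name hybrid_matches, the tolerance, and both spectra are hypothetical.

```python
def hybrid_matches(query_peaks, query_precursor, lib_peaks, lib_precursor, tol=0.01):
    """Count library peaks that match a query peak directly, and those that
    match only after shifting by the precursor mass difference.

    Returns a (direct, shifted) tuple of counts.
    """
    delta = query_precursor - lib_precursor
    direct = 0
    shifted = 0
    for lib_mz in lib_peaks:
        if any(abs(q - lib_mz) <= tol for q in query_peaks):
            # Fragment does not contain the modified part of the molecule.
            direct += 1
        elif any(abs(q - (lib_mz + delta)) <= tol for q in query_peaks):
            # Fragment carries the modification, so it moves with the precursor.
            shifted += 1
    return direct, shifted


# Hypothetical library compound and a query that differs from it by
# one CH2 unit (+14.01565 Da), e.g. a methylated analog.
lib_precursor = 200.1000
lib_peaks = [85.0300, 120.0500, 182.0900]

query_precursor = 214.11565
query_peaks = [85.0300, 134.06565, 196.10565]

result = hybrid_matches(query_peaks, query_precursor, lib_peaks, lib_precursor)
print(result)  # (1, 2): one direct match, two shifted matches
```

A conventional search would count only the single direct match here, whereas counting shifted peaks as well recovers all three library fragments. Intensity weighting and any probabilistic scoring are deliberately left out of this sketch.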