Project Summary

The last decade has seen two complementary trends: (i) technology to perform untargeted metabolomics with liquid chromatography/mass spectrometry (LC/MS) has become readily available to most investigators, and (ii) interest in metabolism has continued to heighten in many disparate research fields ranging from cancer and immunology to neuroscience and aging. Accordingly, the number of investigators who are acquiring untargeted metabolomic data with LC/MS is dramatically increasing. Yet informatic tools to analyze the acquired data have lagged far behind, and interpretation of the results remains a serious challenge, even for experienced users. Thus, a substantial number of investigators performing untargeted metabolomics with LC/MS either cannot interpret the data generated or, even worse, are interpreting it incorrectly.

When untargeted metabolomics is performed on a typical biological sample, it is common to detect thousands to tens of thousands of signals (also known as features). Translating these signals into metabolite names is the biggest informatic barrier limiting biomedical applications of the technology. The process is arduous, particularly for inexperienced investigators, because the majority of signals detected do not correspond to non-redundant metabolites originating from the biological sample. Rather, most signals (up to 95% in some of our experiments) are due to complicating factors such as contaminants, artifacts, and fragments. Because many of these complicating signals are not currently in metabolomic databases such as METLIN, inexperienced users find them challenging to annotate. While there are software programs available to annotate the signals within the data, these tools are beyond the reach of most clinical and biological investigators because (i) they are not automated with a graphical user interface, and (ii) they rely on a costly experimental design involving isotopes to find contaminants and artifacts.
We propose to develop an automated solution to name and quantify most of the metabolites detected in untargeted metabolomic LC/MS experiments. Our strategy is to assume the computational burden of completely annotating all detected metabolites in untargeted metabolomic data, which only has to be performed once for a given sample type, so that less-experienced investigators do not have to do so in their future experiments. We will completely annotate untargeted metabolomic data sets from different biological samples using the mz.unity software and credentialing technology developed by the Patti lab. Based on experiments that we have already performed, we expect to find ~5,000 unique bona fide metabolites per sample. We will then use these endogenous signals to develop targeted LC/MS methods that enable automated analysis of all detectable metabolites (i.e., the "reference metabolome"). This will allow investigators with minimal expertise in metabolomics to profile the unique and bona fide metabolites in their samples at an untargeted scale, but without the informatic barriers that have historically limited progress in the field.