PROJECT SUMMARY
In a typical untargeted metabolomic analysis by liquid chromatography-mass spectrometry (LC-MS), about 70% of the detected ions represent unknown analytes. While identifying unknowns that have no putative IDs remains a significant challenge, more metabolites can be identified by improving the ability to prioritize the multiple putative IDs assigned to the known-unknowns. Such prioritization will help in selecting promising metabolites for subsequent experimental verification of the IDs. In this SBIR proposal, we seek to develop a probabilistic framework that assigns a priority score to each putative metabolite ID by combining information from multiple resources, including compound databases, pathways, biochemical networks, and spectral libraries. In addition, the probabilistic model will use results from various computational tools to improve annotation accuracy. For example, a tool that clusters the detected ions based on their measured mass and retention time (RT) values will be used to recognize isotopes and adducts. This step will enable the user to determine the monoisotopic mass of each ion before searching mass-based databases for putative IDs, thereby reducing annotation errors caused by mass shifts due to isotopes and adducts. Putative IDs derived from multiple databases will be merged based on their IUPAC International Chemical Identifier (InChI) keys. The proposed probabilistic model will exploit the interdependent relationships between metabolites in biological organisms, using knowledge derived from pathways and biochemical networks, to assign a priority score to each putative metabolite ID. If MS/MS data are available, the score for a putative ID will take into account how well the measured MS/MS spectrum matches those in spectral libraries or the fragmentation patterns predicted by in-silico spectral interpretation.
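The isotope/adduct recognition and database-merging steps described above can be illustrated with a minimal Python sketch. It is not the proposed probabilistic model or any existing tool. The function names, tolerances, database labels, and the placeholder keys in the usage example are all hypothetical, and only singly charged ions with two common positive-mode adducts are considered.

```python
from collections import defaultdict

# Commonly used mass constants (Da); values rounded to six decimals.
PROTON_SHIFT = 1.007276    # [M+H]+ adduct shift
SODIUM_SHIFT = 22.989218   # [M+Na]+ adduct shift
C13_SPACING = 1.003355     # 13C minus 12C mass difference


def is_c13_isotope(ion_a, ion_b, mz_tol=0.005, rt_tol=0.1):
    """Return True if ion_b looks like the singly charged M+1 peak of ion_a.

    Each ion is an (m/z, RT) tuple. The tolerances are illustrative
    assumptions, not recommended settings.
    """
    mz_a, rt_a = ion_a
    mz_b, rt_b = ion_b
    same_rt = abs(rt_b - rt_a) <= rt_tol
    one_c13_apart = abs((mz_b - mz_a) - C13_SPACING) <= mz_tol
    return same_rt and one_c13_apart


def neutral_mass(mz, adduct_shift):
    """Neutral monoisotopic mass of a singly charged ion, given its adduct shift."""
    return mz - adduct_shift


def merge_by_inchikey(hits_by_database):
    """Merge putative IDs from several databases, keyed on InChIKey.

    hits_by_database maps a database label to a list of InChIKeys.
    Returns a dict mapping each InChIKey to the set of databases reporting it.
    """
    merged = defaultdict(set)
    for database, inchikeys in hits_by_database.items():
        for key in inchikeys:
            merged[key].add(database)
    return dict(merged)


# Usage: two ions at the same RT that differ by one 13C, and two
# hypothetical database hit lists (the keys are placeholders, not real InChIKeys).
mono = (181.070666, 5.20)
m_plus_1 = (182.074021, 5.21)
print(is_c13_isotope(mono, m_plus_1))
print(round(neutral_mass(mono[0], PROTON_SHIFT), 4))
print(merge_by_inchikey({"db_a": ["KEY-ONE"], "db_b": ["KEY-ONE", "KEY-TWO"]}))
```

Collapsing isotope and adduct peaks to one neutral monoisotopic mass before the database search means each compound is queried once, and keying the merge on a structure-derived identifier avoids counting the same compound twice under different database names.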
We will assemble the algorithms and scripts developed in this project into a browser-friendly, cloud-based tool, MetaboCraft. We will use JavaScript to implement MetaboCraft's graphical user interface (GUI), which will allow users to import m/z, RT, and MS/MS data and to export prioritized putative IDs in their desired format. Furthermore, the GUI will provide users with interactive visualization of putative IDs, extracted ion chromatograms, isotopic patterns, and MS/MS data. The annotation performance and computational efficiency of MetaboCraft will be compared against existing tools using LC-MS/MS data from metabolomic studies that include ground-truth information. Successful implementation and validation of MetaboCraft will enable users to accurately assign putative metabolite IDs and priority scores by taking advantage of publicly available databases, pathways, biochemical networks, and spectral libraries, as well as various tools designed for isotope/adduct recognition, decomposition of isotopic patterns, and in-silico spectral interpretation. By assigning priority scores to putative metabolite IDs, MetaboCraft will help address the major bottleneck in metabolomics, metabolite identification, thereby enhancing the contribution of metabolomics to studies such as biomarker discovery and systems biology research.