Project summary: Many initiatives have been launched to ensure that metabolomics data becomes publicly accessible. Despite this growing availability, the data is not being reused. One of the main limitations on the reuse and cross-comparison of metabolomics data is the lack of a unifying format and of methods that enable comparison of multiple data sets, even those collected on different instruments and with different methods, as is done with UniFrac for microbial sequencing. UniFrac is a distance metric that takes phylogenetic relationships into account. Our goal with this project is threefold: 1) convert all public data into a unifying format; 2) subject all data with MS/MS information to living data in GNPS (http://gnps.ucsd.edu), where knowledge about the chemistry associated with the data is automatically updated and relayed to subscribers to the data; and 3) create ChemiFrac, the UniFrac equivalent for metabolomics. Here we will use molecular networking as our equivalent of the phylogenetic relationship measure, thus enabling global comparisons of data sets that we expect will work even when different extractions and instruments are used.
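
To make the UniFrac idea concrete, the following is a minimal sketch of the standard unweighted UniFrac calculation: the fraction of total branch length that is covered by only one of two samples. The summary does not specify how ChemiFrac itself will be computed, so everything here is illustrative. The toy tree, the node names (`m1`, `X`, and so on), and the function names are hypothetical, and using a tree as a stand-in for molecular-network relationships is an assumption.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy tree, stored as child -> (parent, branch length).
# Leaves m1..m3 stand in for molecules; X and Y are internal nodes.
Tree = Dict[str, Tuple[str, float]]

TOY_TREE: Tree = {
    "X": ("root", 1.0),
    "Y": ("root", 1.0),
    "m1": ("X", 0.5),
    "m2": ("X", 0.5),
    "m3": ("Y", 1.0),
}


def branches_covered(tree: Tree, leaves: Set[str]) -> Set[str]:
    """Return every branch (named by its child node) on a leaf-to-root path."""
    covered: Set[str] = set()
    for leaf in leaves:
        node = leaf
        while node in tree:  # stop once we reach the root, which has no parent
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(tree: Tree, sample_a: Set[str], sample_b: Set[str]) -> float:
    """Branch length unique to one sample divided by branch length in either."""
    a = branches_covered(tree, sample_a)
    b = branches_covered(tree, sample_b)
    total = sum(tree[n][1] for n in a | b)   # branches seen in either sample
    unique = sum(tree[n][1] for n in a ^ b)  # branches seen in exactly one sample
    return unique / total if total else 0.0


# Two samples that share m2 but differ in m1 and m3.
distance = unweighted_unifrac(TOY_TREE, {"m1", "m2"}, {"m2", "m3"})
print(distance)
```

Identical samples give a distance of 0 and samples with no shared branches give 1, so related features (those sharing internal branches) pull two samples closer than a plain presence/absence comparison would.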