Project summary

Diagnosis, choice of treatment, and prognosis of common and complex diseases, such as cardiovascular disease, are still mostly based on a few biomarkers. While these biomarkers improve clinical practice at the population level, their sensitivity and specificity in particular subpopulations may be more limited. With the advent of affordable and highly reproducible omics measurements, there is a great opportunity to advance personalized medicine through the integrative analysis of much larger sets of markers. A particularly promising avenue is metabolomics, which allows hundreds of metabolites to be identified and quantified using high-performance technologies such as nuclear magnetic resonance (NMR) spectroscopy. Here, we propose to test the hypothesis that aggregating NMR features that correlate with genotypic data and can be matched to one or several metabolites yields a novel type of quantitative biomarker, which we term a pseudo-compound. Such markers could be more powerful than individual spectral features or feature combinations that cannot be linked to known metabolites. The primary goals of the proposal are 1) to test whether these pseudo-compounds achieve stronger genetic association than any of their individual features and 2) to investigate whether such pseudo-compounds associate with established cardiovascular risk factors. For this purpose, we will leverage existing genotype, metabolomics, and other phenotype data measured in a subset of 983 individuals from the CoLaus (Cohorte Lausannoise) study. We will then validate our findings in a subset (n=500) of the JUPITER (Justification for the Use of Statins in Prevention: An Intervention Trial Evaluating Rosuvastatin) study. Regression analysis will be used to model the relationship between genotype and metabolomics data, as well as the relationship between pseudo-compounds and established risk factors.
Our proposal is innovative because it is not limited to a predefined set of metabolites, as is the case in targeted metabolomics. At the same time, a key advantage of our proposed concept of pseudo-compounds is that it may often be feasible to match them to one or several metabolites with a known spectrum, thus providing a tangible entity that can be related to the chemical and biological pathways of these metabolites. We will make our method publicly available as user-friendly software aimed at non-experts, allowing clinical researchers with a specific interest in a disease or a related trait to directly extract and test pseudo-compounds as potential biomarkers. This will facilitate the translation of untargeted metabolomic data into potentially clinically relevant biomarkers.
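The first goal stated in the summary, comparing the genetic association of a pseudo-compound with that of its individual features, can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration and not the proposal's actual method: the aggregation rule (mean of per-feature z-scores), the additive genotype coding (0/1/2 allele counts), the independent noise across features, and all names and parameter values (`association_t`, `pseudo_compound`, `simulate`, `effect`, `maf`). No CoLaus or JUPITER data are used.

```python
import numpy as np


def association_t(genotype, phenotype):
    """t-statistic of the slope in the simple linear regression phenotype ~ genotype."""
    g = genotype - genotype.mean()
    y = phenotype - phenotype.mean()
    slope = (g @ y) / (g @ g)
    residuals = y - slope * g
    dof = len(g) - 2  # intercept and slope are estimated
    se = np.sqrt((residuals @ residuals) / dof / (g @ g))
    return slope / se


def pseudo_compound(features):
    """Hypothetical aggregation rule: average the z-scored features (columns)."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    return z.mean(axis=1)


def simulate(n_samples=1000, n_features=5, effect=0.3, maf=0.3, seed=0):
    """Simulated cohort: one variant influencing several NMR features of one metabolite."""
    rng = np.random.default_rng(seed)
    # Additive genotype coding: number of minor alleles per individual (0, 1, or 2).
    genotype = rng.binomial(2, maf, size=n_samples).astype(float)
    # Assumption: every feature shares the same genetic effect and has independent noise.
    noise = rng.normal(size=(n_samples, n_features))
    features = effect * genotype[:, None] + noise
    return genotype, features


genotype, features = simulate()
t_features = np.array(
    [association_t(genotype, features[:, j]) for j in range(features.shape[1])]
)
t_pseudo = association_t(genotype, pseudo_compound(features))

print("per-feature t-statistics:", np.round(t_features, 2))
print("pseudo-compound t-statistic:", round(float(t_pseudo), 2))
```

Under these favorable assumptions, averaging cancels part of the feature-specific noise, so the aggregate tends to show a larger t-statistic than any single feature. Whether this holds for real spectra, where noise is correlated across features and effects differ, is precisely what the proposal sets out to test.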