Project Summary
The research described in this proposal is intended to provide computational tools and data resources that will enhance the understanding of disease-related Single Nucleotide Polymorphisms (SNPs) and the protein-protein interaction (PPI) pathways they impact, and that will also provide new mechanistic insights into cancer-related signaling pathways. More generally, the research offers fundamentally new approaches to the molecular-level understanding of human disease through two Specific Aims: 1) the structure-enabled annotation of disease-related SNPs; and 2) the molecular-level annotation of cancer protein-protein interactomes. Integral to both aims is the development of new computational tools of broad applicability. The proposed research strategy is based in large part on the PrePPI (Predicting Protein-Protein Interactions) pipeline, which integrates structural and non-structural information using Bayesian statistics to predict the likelihood that two proteins interact, either physically or indirectly. The PrePPI database of about 1.35 million predicted human PPIs has been shown to provide accuracy comparable to that of high-throughput experimental databases but is far larger in scale and scope. PrePPI relies heavily on three-dimensional structural information and is unique in this regard. Aim 1 focuses on the creation of a database in which all human SNPs are mapped to the protein structures and models contained in PrePPI. PrePPI-predicted PPIs contain information about interfacial residues, which allows the development of a predictive algorithm to determine whether a SNP disrupts a PPI. Different structural features of SNPs will provide the variables for this algorithm, and their contributions will be determined using a Bayesian approach that exploits a positive reference set containing disease-related SNPs and a negative reference set containing benign SNPs.
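As an illustration only, the Bayesian weighting of structural features described for Aim 1 could be sketched as a naive Bayes likelihood ratio estimated from the positive and negative reference sets. This is a minimal sketch under stated assumptions, not the PrePPI implementation: the feature names (`at_interface`, `buried`), the function names, the pseudocount smoothing, and the assumption that features are discrete and conditionally independent are all hypothetical choices made for the example.

```python
from collections import Counter
from math import prod


def likelihood_ratios(positives, negatives, pseudocount=1.0):
    """Estimate P(value | disease SNP) / P(value | benign SNP) per feature value.

    positives, negatives: lists of dicts mapping a (hypothetical) feature
    name to a discrete value, one dict per SNP in the reference set.
    """
    features = sorted({name for snp in positives + negatives for name in snp})
    table = {}
    for name in features:
        pos = Counter(snp[name] for snp in positives if name in snp)
        neg = Counter(snp[name] for snp in negatives if name in snp)
        values = set(pos) | set(neg)
        # Additive smoothing so that unobserved values never give a zero count.
        pos_total = sum(pos.values()) + pseudocount * len(values)
        neg_total = sum(neg.values()) + pseudocount * len(values)
        table[name] = {
            v: ((pos[v] + pseudocount) / pos_total)
            / ((neg[v] + pseudocount) / neg_total)
            for v in values
        }
    return table


def combined_ratio(snp, table):
    """Multiply per-feature ratios (naive independence assumption).

    Features or values absent from the table contribute a neutral factor of 1.
    """
    return prod(
        table[name].get(value, 1.0)
        for name, value in snp.items()
        if name in table
    )


# Toy reference sets with made-up values, for illustration only.
disease_snps = [
    {"at_interface": True, "buried": True},
    {"at_interface": True, "buried": False},
    {"at_interface": True, "buried": True},
]
benign_snps = [
    {"at_interface": False, "buried": False},
    {"at_interface": False, "buried": False},
    {"at_interface": True, "buried": False},
]

ratios = likelihood_ratios(disease_snps, benign_snps)
score = combined_ratio({"at_interface": True, "buried": True}, ratios)
print(f"combined likelihood ratio: {score:.2f}")
```

A combined ratio above 1 means the SNP's features are more typical of the disease-related reference set than of the benign set. Multiplying it by prior odds would give posterior odds that the SNP is disruptive.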
Aim 2 focuses on the functional, structural, and molecular characterization of cancer pathways and the creation of interactomes for known oncogenes such as K-Ras. PrePPI will be combined with network-based algorithms to predict interaction partners of these oncogenes, and the results will be tested with biophysical and cellular assays. In addition, protein family-specific versions of PrePPI will be developed to facilitate a more refined prediction of interaction partners. Finally, comprehensive interactomes will be constructed for the ~550 cancer-related proteins in the Cancer Gene Census maintained by the Catalogue of Somatic Mutations in Cancer (COSMIC), and this information will be incorporated into the expanded PrePPI database. The integration of the structure-enabled annotation of disease-related SNPs with cancer interactomes is very much in keeping with the NIH Precision Medicine Initiative: assigning functions to all SNPs, rather than just the most frequently occurring ones, is crucial to tailoring therapeutic treatments on an individual basis.