A major goal in cancer biology is to understand the protein interactions involved in the molecular pathways that lead to the initiation and progression of the disease. Advances in sequencing technologies are generating an overwhelming amount of data toward that goal; however, these data lack specific functional context for the molecules involved in cancer. This missing functional context could be provided by a bioinformatics approach based on domain analysis. Protein domains are evolutionarily conserved regions of proteins and are considered to be the functional and structural units of the protein. Functional and structural genomic studies have provided extensive information about protein domains and their functional roles. However, current methods for analyzing the molecular bases of cancer do not systematically integrate protein domain studies and are mostly based on the study of individual genes or proteins. Functional studies at the gene level are unreliable due to gene multi-functionality. In contrast, studies at the domain level account for the modularity of proteins and discriminate between protein regions with different functions, making them more accurate and informative. Here, we propose the first genome-wide study of the molecular disruptions associated with cancer that relies on protein domains, rather than on proteins, as the molecular units of the study. Our hypothesis is that disruptions within a domain that is shared by similar proteins will have related functional effects in cancer. Thus, relevant functional information can be transferred among proteins with similar functional characteristics. First, we will develop domain-based methodologies for the functional classification of molecular disruptions related to cancer, as well as for the prediction of their relevance to disease. Second, we will develop a novel approach to support pharmacogenomic studies of cancer drugs.
Finally, we will integrate these domain-based analyses into a molecular database annotated with cancer phenotype information, which we will make publicly available. We expect this research to lead to new hypotheses for the biological mechanisms involved in the disease and to provide new molecular bases for the classification of cancer. This proposal will provide the foundational research for the development of new therapeutic approaches, new drug targets, and efficient gene therapies to fight cancer.
PUBLIC HEALTH RELEVANCE: A major goal in cancer biology is to understand how interactions between different proteins produced by the body can cause the disease. New technology is offering a very detailed look into the DNA sequences involved in cancer, but these advances also produce an overwhelming amount of data that does not necessarily give a clear picture of how specific molecules function in the progression of the disease. One way to clarify this picture is to analyze protein domains. A protein domain is a region of a protein that has remained largely unchanged (conserved) over the course of evolution. These domains are considered to be the units that determine a protein's structure and function. Currently, research on the molecular bases of cancer focuses on individual genes or proteins. However, study at the gene level is unreliable, because a gene can have many different functions, which confounds the data. By analyzing domains instead, we can focus on the particular regions of the protein with the functions we are interested in, providing more accurate and informative results. Here, we propose the first human genome-wide study of the molecular bases of cancer that relies on protein domains as the core of the study. Our hypothesis is that, by observing how protein domains are disrupted in cancerous cells, we can predict the effect of cancer on other proteins with similar domains.
First, we will develop domain-based methods for classifying the molecular indicators of cancer and examine how these relate to the disease. Second, we will develop a novel approach to support studies of cancer drugs. Finally, we will use our domain-based analyses as the basis for a molecular database containing information on the physical characteristics of cancer, which we will make publicly available. We expect this research to lead to new hypotheses for the biological effects of the disease and to provide new molecular bases for classifying cancer. This proposal will provide the foundational research for the development of new therapeutic approaches, drug targets, and efficient gene therapies to fight cancer.