The Cancer Genome Atlas Project (TCGA) is generating large amounts of genetic and epigenetic information that promise to illuminate the biological mechanisms underlying cancer and to point the way towards more effective treatments. The latest evidence indicates that the functions of hundreds of genes are altered in a typical tumor by under- and over-expression, point mutation, translocation, and aberrant copy number and methylation patterns. At the same time, it is becoming increasingly clear that this complexity can be reduced by organizing data about cancer-related alterations by pathways, particularly signaling pathways. Pathway-centric analysis already plays an important role in studies of differential gene expression, somatic mutations, and copy number. In our view, the pathway approach represents the most promising methodology both for understanding the complexity of cancer-related alterations and for rapidly translating multi-platform genomic investigations of cancer into tangible progress in prevention and treatment. This investigative team has extensive experience in cancer genomics in general and in gene set analyses specifically. We recently developed tools for comparative enrichment analysis (the investigation of whether several related enrichment analyses identify common or different sets of genes) and for comparative network analysis (the investigation of whether the interconnections between members of a pathway differ between cancer and normal tissues), and we contributed actively to tool development for pathway-centric analyses associated with cancer genome sequencing projects. We propose here to extend these efforts by developing multidimensional tools that leverage existing pathway information to integrate across different data types. Our work will produce a powerful analytic tool for extracting biological relevance from the multiple, high-dimensional datasets that will soon be available from the TCGA Pilot Project.
The specific aims of this two-year proposal are the following: AIM 1. ALGORITHMS. We will develop approaches and algorithms for pathway-based analysis of cancer-related alterations across all the platforms considered in the TCGA. AIM 2. VALIDATION. We will validate our algorithms by using existing TCGA data, both to perform analyses and to generate realistic synthetic datasets in which the pathways and genes that play a causal role in cancer are known. We will use these synthetic datasets to systematically explore, in a controlled environment, the performance of the alternative algorithms considered. AIM 3. caBIG-COMPLIANT IMPLEMENTATION. We will develop implementations suitable for embarrassingly parallel execution on supercomputer clusters. This approach will ensure that the algorithms are scalable to datasets involving hundreds of genomes. We will code our algorithms into caBIG silver-level compatible tools that leverage caBIG infrastructure, including the common data elements. Given the significant challenges of direct "vertical" integration of information generated by different genome-wide measurements, pathway-centric analytic methods represent the most promising avenue for fast interpretation and clinical translation of TCGA data. If successful, the project will help accelerate the development of personalized cancer treatment by providing tools to identify which pathways are important for progression and prognosis in individual patients.