We are seeking support to establish a Cancer Genome Characterization Center in Boston. The goal of the proposed effort is to analyze 3,000 tumor samples over a three-year period and identify a set of genes that can be resequenced by the members of The Cancer Genome Atlas (TCGA) project. It is well established that regions of the cancer genome that are amplified, deleted, or show loss of heterozygosity harbor genes that are important for tumor initiation and progression. We propose to identify such regions by conducting array comparative genomic hybridization (aCGH). Based upon detailed comparisons of many different platforms, we have chosen high-density Agilent oligonucleotide arrays for our studies. We already have the capacity to process 1,000 samples during the first year of the proposed grant. We plan to continually improve our processes and reduce the cost of obtaining these data during the remainder of the grant period. Regions of amplification or deletion harbor many genes. An optimal way to identify the critical genes within these intervals is gene expression profiling of RNA from tumors. Of the several platforms available for expression profiling, one of the most informative is Serial Analysis of Gene Expression (SAGE). Although SAGE is powerful, it requires constructing "tag" libraries and sequencing them, and the cost of sequencing each library with conventional methodologies is prohibitive. We have developed and validated a method called "Polony" sequencing that provides highly accurate sequence information at a small fraction of the cost of conventional electrophoresis- or pyrosequencing-based methods.
We have already used this method successfully to obtain expression profiles from small amounts of polyA RNA, and we propose to use it to generate data from 3,000 tumor samples in the proposed grant period. During the first six months we will implement this method and reach a data-generation rate of 1,000 tumors per year. We also propose to use powerful informatics tools that we have developed or implemented to integrate the aCGH and expression profiling data and extract a list of the most interesting genes for resequencing. We have established an award-winning IT infrastructure that will be deployed for LIMS, data storage, data retrieval, data analysis, and interfacing with caBIG. Our proposed approach can also generate additional useful data for tumor and patient stratification.