1. Genome-wide Association Analysis of Breast Cancer

Breast cancer remains the most common malignancy in females in the United States and is a major public health problem. Although progress has been made in prevention, early detection, and therapy, 37,500 women die of this malignancy annually. Analysis of breast cancer families resulted in the identification of two major loci, BRCA1 and BRCA2. These genes and several minor high-penetrance loci (PTEN, CHEK2, TP53) together account for 45-60% of disease in multiplex families [9-11]. Therefore, additional genes conferring risk for breast cancer remain to be identified. Using the Affymetrix 500K GeneChip, more than 435,000 SNPs were typed in 249 AJ breast cancer cases with a family history of breast cancer and in 299 AJ controls. For a replication sample, the same markers were typed in 238 sporadic AJ breast cancer cases and a separate set of 187 controls. The 200 loci with the lowest P values in both data sets were examined, and many of these loci were found to represent SNPs with inadequate genotype call quality. Therefore, a display program was developed to allow manual examination of potentially significant loci. From this analysis, several loci were identified that have multiple SNPs in the gene or gene region with a highly significant association.

2. Prostate Cancer: Validation of Exome Sequencing and Application

To identify the optimum methods for whole genome/exome sequencing, we sequenced the complete transcriptome and exome of four tumor cell lines: PC-3 (prostate); MCF-7 (breast); and SN12c and 786-0 (kidney). The transcriptome data were generated by converting mRNA into cDNA and sequencing clones on the Roche 454 instrument. The exome analysis was performed by fragmenting genomic DNA, capturing exons on the Nimblegen array, and sequencing on the Roche 454 instrument.
From each of the four cell lines, approximately 1 million cDNA reads, with an average length of 350 bp, were generated for transcriptome analysis. These data were used to carry out a number of analyses, including the identification of gene variants and mutations, a digital count of gene expression levels, the identification of hybrid transcripts, and the assembly of gene contigs to identify alternative splicing. In parallel, we generated 3 million reads of exome sequence from each cell line, with an average length of 370 bp, and used the data to detect high-confidence variants. Given the success of the Roche/454 exome approach, we applied this technology to the sequencing of normal DNA and DNA from five different metastatic lesions in a single prostate cancer patient. More than 3 million reads were obtained for the normal sample and for each metastatic tumor, resulting in greater than 25-fold coverage. Prediction of variants was carried out in each sample, and more than 500 variants that were present in three or more of the metastatic lesions were identified.

3. Kidney Tumor Exome and Transcriptome Sequencing

To begin applying high-throughput sequencing technologies to renal cancer, we investigated the use of the Illumina Solexa instrument for both transcriptome and exome sequencing. The Illumina exome approach uses an Agilent solution capture technology. DNA from two sporadic clear cell tumors was exome captured and sequenced to an average depth of 75-fold. From the sequence analysis, we identified a list of 101 genes with a newly described variant affecting the coding region, after eliminating genes from certain large, variable gene families, such as olfactory receptors.
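The candidate-gene filter described above can be sketched as follows. This is only an illustration under stated assumptions: the record fields (`gene`, `affects_coding`, `previously_described`), the placeholder gene names, and the use of HGNC-style symbols to recognize olfactory receptor genes are all hypothetical, and the source does not say how the variable gene families were actually identified.

```python
import re
from dataclasses import dataclass

# Assumption: olfactory receptor genes are recognized by HGNC-style symbols
# (e.g. OR51E2: "OR" + family number + subfamily letters + member number).
# The report does not state how variable gene families were detected.
VARIABLE_FAMILY_PATTERNS = [re.compile(r"^OR\d+[A-Z]+\d+P?$")]


@dataclass(frozen=True)
class Variant:
    gene: str                    # gene symbol (hypothetical field)
    affects_coding: bool         # variant changes the coding region
    previously_described: bool   # variant already known in a reference set


def in_variable_family(gene: str) -> bool:
    """Return True if the gene symbol matches an excluded gene family."""
    return any(p.match(gene) for p in VARIABLE_FAMILY_PATTERNS)


def candidate_genes(variants):
    """Genes with a new coding variant, excluding variable gene families."""
    return sorted({
        v.gene
        for v in variants
        if v.affects_coding
        and not v.previously_described
        and not in_variable_family(v.gene)
    })


# Made-up records for illustration only; not data from the study.
example = [
    Variant("GENE_A", True, False),
    Variant("OR51E2", True, False),   # dropped: olfactory receptor family
    Variant("GENE_B", True, True),    # dropped: variant already described
    Variant("GENE_C", True, False),
    Variant("GENE_D", False, False),  # dropped: not in the coding region
]
print(candidate_genes(example))
```

Keeping the excluded families in a list of patterns makes it easy to add other large, variable families without touching the filter itself.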
We further refined this list by analyzing the following characteristics: predicted severity of the mutation using Polyphen, SIFT, and other programs; mutation of the same gene in three or more of the five genomes analyzed; presence of a mutation in that gene in the COSMIC database of cancer-related mutations; involvement of the gene in interactions with cancer-related proteins; and presence of loss of heterozygosity in the gene region in the tumors sequenced. From this combined analysis, we selected 16 genes as top priority for follow-up analysis.

4. Retinoblastoma in Latin America: Epidemiology and Genetic Analysis

Retinoblastoma is one of the most common pediatric solid tumors in Mexico and Central America, accounting for up to 10% of all diagnosed cases, as opposed to 2-3% of all cases in the United States and Europe. Incidence calculations for retinoblastoma in Mexico have shown that the incidence varies among regions of the country and is highest in the Chiapas region bordering Guatemala. To further understand the factors involved in the higher prevalence, we performed an analysis of 246 consecutive cases treated over 8 years at the Unidad Nacional de Oncologia Pediatrica (UNOP), the sole pediatric cancer hospital in Guatemala. Data on age at diagnosis, birth region, laterality, ethnicity, and father's occupation were captured, and this cohort was compared to a cohort of all cases with acute lymphocytic leukemia and, as controls, to children examined and found to be cancer-free. From these data we calculated the incidence of retinoblastoma to be 8.1 cases/million children under the age of 14 in the Guatemala City region. This incidence is elevated twofold over the incidence in the United States and Europe, and is similar in indigenous and admixed populations in the capital region. The elevated incidence is not due to an increase in familial cases, suggesting an environmental contributor.
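An incidence figure of this kind is a rate: cases divided by the person-years at risk, scaled to one million. The sketch below shows that arithmetic. The function name and the example inputs are hypothetical, since the report does not give the regional case count or population at risk, and the numbers used here are not the study's.

```python
def incidence_per_million(cases: int, population_at_risk: int, years: float) -> float:
    """Annual incidence per million, given cases observed over a period.

    population_at_risk is the average number of children in the age group
    of interest; years is the length of the observation period.
    """
    if population_at_risk <= 0 or years <= 0:
        raise ValueError("population_at_risk and years must be positive")
    return cases * 1_000_000 / (population_at_risk * years)


# Hypothetical inputs for illustration only (not values from the study):
# 40 cases over 8 years among 1,000,000 children at risk.
rate = incidence_per_million(cases=40, population_at_risk=1_000_000, years=8)
print(f"{rate:.1f} cases per million children per year")
```

Dividing one such rate by another gives the rate ratio, which is how a statement such as "elevated twofold" would be quantified.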
Analysis of retinoblastoma incidence in indigenous and admixed populations demonstrates a lower incidence in indigenous children in rural areas farther from the capital. This disparity is even more pronounced in acute lymphocytic leukemia. Unilateral retinoblastoma accounted for 72% of cases. The average age at diagnosis and the stage at diagnosis are advanced, resulting in reduced survival.

To understand the spectrum of mutations in Mexican retinoblastoma cases, we mutation scanned and sequenced all 27 exons of the RB1 gene in 48 Mexican retinoblastomas. Overall, 21 (44%) cases were bilateral and 27 (56%) were unilateral. Thirteen different oncogenic mutations were detected in 14 of 48 patients (29%), nine of which (64%) were germline. Six of these mutations are newly described (IVS3-1G>T, g.39518insGA, g.70274delT, 150038delG, g.160796InsT, and promoter -149G>T). Loss of heterozygosity of the RB1 gene, as assessed by intragenic markers, was found in 50% of informative cases (18 of 36) and was more frequent in tumors with known mutations (77% vs. 35%). This low mutation detection rate and the earlier age at diagnosis in unilateral retinoblastoma cases suggest that other RB1-inactivating mechanisms could be present in retinoblastoma development.
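As a quick consistency check, the percentages reported for the RB1 analysis can be recomputed from the counts given in the text. This is a minimal sketch: the helper name and dictionary labels are illustrative, and only counts that appear in the text are used.

```python
def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# (count, total, percentage stated in the text)
reported = {
    "bilateral cases": (21, 48, 44),
    "unilateral cases": (27, 48, 56),
    "patients with a detected mutation": (14, 48, 29),
    "germline mutations among those detected": (9, 14, 64),
    "loss of heterozygosity, informative cases": (18, 36, 50),
}

for label, (count, total, stated) in reported.items():
    computed = percent(count, total)
    status = "ok" if computed == stated else "MISMATCH"
    print(f"{label}: {count}/{total} = {computed}% (stated {stated}%) {status}")
```

The 77% vs. 35% comparison is left out because the underlying counts are not given in the text.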