The long-term goal of this project is to identify all of the somatic, non-synonymous mutations in the unique portion of at least 10 human AML genomes by performing genome-wide, comprehensive resequencing of tumor samples from carefully selected AML patients; the frequency of these changes will be determined in an additional 187 cases of AML. Although the current cost of whole genome resequencing is high, we anticipate that costs will continue to fall through the proposed research period. Hence, we will use the data and informatics platforms created by this study to analyze as many AML genomes as possible. We have chosen to initially study one of the most common subtypes of AML, FAB M1, which has no associated mutations that have been shown to be responsible for the initiation of this disease. Samples chosen for analysis have the following characteristics: 1. Adequate amounts of bone marrow and skin DNA are available to perform 10X whole genome sequencing coverage, if necessary (i.e., 5 ug of each sample, non-amplified). 2. More than 70% myeloblasts in the bone marrow sample, to ensure enrichment of AML cells. 3. Two or fewer clonal cytogenetic abnormalities, and complete analysis of tumor and germline DNA samples on the 500K Affy SNP array and the 2.1 M long oligo CGH array platforms for copy number variants. 4. Typical expression signature of M1 AML samples after gene expression profiling analysis. Ten FAB M1 samples from our locally banked AML samples meet all criteria, and most have already had several genes resequenced in the initial funding period. Importantly, the detection of mutations in these samples by directed PCR and resequencing confirms their quality for whole genome resequencing.
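The four selection criteria above can be read as a simple eligibility check over a sample inventory. The following is a minimal sketch only: the `Sample` record, its field names, and the `meets_criteria` function are hypothetical and do not describe the project's actual sample-tracking system. The thresholds are taken from the criteria as stated.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical inventory record for one banked AML case."""
    sample_id: str
    marrow_dna_ug: float          # non-amplified bone marrow (tumor) DNA available
    skin_dna_ug: float            # non-amplified skin (germline) DNA available
    myeloblast_pct: float         # percent myeloblasts in the bone marrow sample
    clonal_abnormalities: int     # number of clonal cytogenetic abnormalities
    snp_array_done: bool          # 500K SNP array completed for tumor and germline
    cgh_array_done: bool          # 2.1 M oligo CGH array completed for tumor and germline
    typical_m1_expression: bool   # typical M1 expression signature observed


MIN_DNA_UG = 5.0        # criterion 1: 5 ug of each sample
MIN_BLAST_PCT = 70.0    # criterion 2: more than 70% myeloblasts
MAX_ABNORMALITIES = 2   # criterion 3: two or fewer clonal abnormalities


def meets_criteria(s: Sample) -> bool:
    """Return True only if the sample satisfies all four stated criteria."""
    enough_dna = s.marrow_dna_ug >= MIN_DNA_UG and s.skin_dna_ug >= MIN_DNA_UG
    enriched = s.myeloblast_pct > MIN_BLAST_PCT
    cytogenetics_ok = (
        s.clonal_abnormalities <= MAX_ABNORMALITIES
        and s.snp_array_done
        and s.cgh_array_done
    )
    return enough_dna and enriched and cytogenetics_ok and s.typical_m1_expression


# Illustrative records (invented values, not project data).
eligible = Sample("example-1", 6.0, 5.0, 85.0, 1, True, True, True)
low_blasts = Sample("example-2", 6.0, 5.0, 70.0, 0, True, True, True)
```

In this sketch, `meets_criteria(eligible)` is true, while `low_blasts` fails because exactly 70% myeloblasts does not exceed the threshold.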
Using material from these 10 cases, we propose the following Aims: Specific Aim 1: We will use the Solexa massively parallel sequencing platform to perform 30 sequencing runs (10X genomic coverage at 1 Gb per run) on at least 10 FAB M1 AML samples, and on matched skin DNA samples for the first two AML genomes. Additional AML samples will be analyzed as cost permits. Specific Aim 2: We will use the Solexa-based bioinformatics analysis pipeline and our own read mapping strategy to iteratively align the short Solexa reads onto the human reference genome sequence. Specific Aim 3: Following read mapping and identification of high quality discrepancies, we will develop approaches to identify, validate (in Core D), and annotate all non-synonymous somatic mutations in the AML genomes sequenced. The frequency of these mutations will be evaluated in 187 additional AML cases in Core D, and their relevance for pathogenesis and outcomes will be evaluated in other PPG projects.
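The sequencing budget in Specific Aim 1 follows from simple coverage arithmetic, sketched below. Two assumptions are made that the text does not state: a haploid human genome size of roughly 3 Gb, and that the 30 runs apply to each genome sequenced. The function and constant names are illustrative.

```python
import math

GENOME_SIZE_GB = 3.0     # assumption: approximate haploid human genome size
TARGET_COVERAGE = 10     # 10X coverage, as stated in Aim 1
YIELD_PER_RUN_GB = 1.0   # 1 Gb per run, as stated in Aim 1


def runs_needed(genome_size_gb: float, coverage: float, yield_per_run_gb: float) -> int:
    """Number of whole runs required to reach the target mean coverage."""
    total_bases_gb = genome_size_gb * coverage
    return math.ceil(total_bases_gb / yield_per_run_gb)


runs_per_genome = runs_needed(GENOME_SIZE_GB, TARGET_COVERAGE, YIELD_PER_RUN_GB)

# Minimum plan in Aim 1: 10 tumor genomes plus matched skin for the first two.
genomes_planned = 10 + 2
total_runs = runs_per_genome * genomes_planned

print(runs_per_genome, total_runs)
```

Under these assumptions, each genome requires 30 runs, which matches the figure in Aim 1, and the minimum plan of 12 genomes comes to 360 runs. This counts raw yield only; reads that fail to map uniquely would lower the effective coverage.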