Abstract

Acute myeloid leukemia (AML) accounts for half of all pediatric leukemia deaths and is the leading cause of leukemia-related death in adulthood. One reason for these poor outcomes is the inability to reliably assess minimal residual disease (MRD) following therapy. Unlike acute lymphoblastic leukemia (ALL), AML presents with multiple subclonal populations without a single clonal surface marker, and surface markers can change during therapy. The current gold standard for AML MRD is multi-parameter flow cytometry (MPFC), which is predictive of outcomes down to frequencies of 0.001, yet 30% of MPFC-MRD-negative patients still relapse. Alternatively, every AML case harbors leukemia-specific mutations that could serve as markers of disease, but next-generation sequencing has a high error rate of ~1%. In this proposal, we will implement a novel, validated error-corrected sequencing (ECS) strategy, developed by the Druley lab in collaboration with Illumina, to improve MRD assessment of AML subclonal heterogeneity in 990 pediatric de novo AML cases from the Children's Oncology Group (COG) AAML1031 study. We hypothesize that a highly sensitive sequencing method will improve identification of residual AML, provide important insights into subclonal heterogeneity in pediatric AML, improve understanding of the role of germline variability and gene function in relapsed or refractory disease, and facilitate personalized medicine. To test this hypothesis, we propose the following aims:

1. Define subclonal heterogeneity at diagnosis and end of Induction 1 (EOI1) in 990 pediatric de novo AML patients (n=1890). Using the largest prospective study of pediatric AML ever performed, we will perform ECS at diagnosis and EOI1 on 94 of the most frequently mutated genes in pediatric and adult AML to identify patterns of mutation associated with relapsed disease, FAB subtypes or other cytogenetic features.

2. Correlate ECS-MRD with existing EOI1 MPFC-MRD for all participants in the COG AAML1031 study. A major question is whether the "different from normal" cell population identified as residual disease by MPFC is actually the same population(s) identified by ECS. We will define residual disease by ECS and compare the results to MPFC status (positive/negative), actual MPFC percentages (<0.001) and the clinical outcomes (relapse risk, disease-free survival and overall survival) of study participants.

3. Integrate germline variation and all subclonal mutations into mechanistic groups that are frequently mutated in pediatric AML and correlate with outcomes using unbiased machine learning algorithms. Preliminary data indicate that every patient will have multiple subclones at diagnosis and EOI1, as well as germline variants in AML-associated genes, which may be important for outcome. In this aim, we will combine these mutations with MPFC, clinical features and cytogenetics for probabilistic risk assessment, using unsupervised machine learning algorithms to improve outcome prognostication.