ABSTRACT Colon cancer (CC) is a clinically and molecularly heterogeneous disease. Although TCGA data have implicated numerous molecular aberrations in cancer etiology and mechanisms, a direct link between genomic events and patient outcomes is lacking. The TNM (tumor, node, metastasis) staging system is widely used and provides prognostic information, but CCs show considerable stage-independent variability in outcome, indicating that more robust classifiers are needed for prognostic stratification. Prognostic information is critical for guiding patient management and surveillance after cancer resection and can inform treatment selection. Using only gene expression data, we identified four consensus molecular subtypes (CMS) of CC with distinct prognoses. We hypothesize that including additional genomic features will enable more granular molecular subtyping by identifying additional molecular patterns. Toward this objective (Aim 1), we will use multi-omics data sets generated from two completed phase III adjuvant chemotherapy trials in CC (NCCTG N0147, NSABP C-08). We will also develop a supervised prognostic model by integrating comprehensive molecular data with clinicopathological variables and outcome data (Aim 2). A unique resource for this supervised learning is the high-quality survival data from the clinical trial cohorts. We hypothesize that integrating genomic alterations within clinically relevant genes and gene expression levels with clinicopathological variables can improve the prediction of recurrence and survival compared with TNM staging alone. To optimize predictive performance, we will add features to our training models in a stepwise fashion: selected gene and miRNA expression, somatic mutations, minor allele frequencies, and somatic copy number alterations, as well as CMS and clinical features.
Given that immune and stromal infiltrating cells are well recognized as determinants of prognosis in CC, we propose to characterize tumor immune and stromal markers among distinct CC molecular subtypes and determine their contribution to prognosis (Aim 3). Specifically, we will characterize these transcriptomic markers computationally and determine whether they can refine molecular subtypes and improve prognostic modeling. Our proposal represents the first comprehensive prediction of CC patient survival using features from both genomic and transcriptomic alterations, integrated with immune and stromal markers using state-of-the-art supervised learning approaches. The impact of this work is substantial: it will identify determinants of recurrence at the molecular pathway level or in the tumor microenvironment, which will help prioritize targets for therapeutic intervention. Furthermore, the outcome of this grant is expected to have practice-changing implications that can further advance the field of precision oncology.