Project Summary/Abstract

Background: The advent of high-throughput methods in molecular biology and medical imaging has motivated the need to better understand multimodal cancer patient data. This motivation has led to the development of radiogenomics, the study of the connection between a tumor's underlying molecular traits and its clinical and imaging phenotypes. Radiogenomic analyses provide an opportunity to identify "imaging surrogates" for inferring gene expression. This is clinically applicable for patients with glioblastoma multiforme (GBM), who routinely undergo magnetic resonance (MR) imaging. However, current radiogenomic approaches make strong algorithmic assumptions, and results are potentially biased by the many different ways in which data are preprocessed and selected. The goals of this proposal are to (1) develop new radiogenomic models for identifying imaging surrogates, and (2) systematically assess the robustness of radiogenomic findings.

Aim 1: To evaluate the application of deep neural networks to discover radiogenomic associations.
Aim 2: To compare resultant radiogenomic associations given different model inputs.
Aim 3: To determine if the use of radiomic features increases specificity of radiogenomic associations.

Methods: In Aim 1, deep neural networks will be trained on public GBM data from The Cancer Genome Atlas and Ivy Glioblastoma Atlas Project (gene expression), and The Cancer Imaging Archive (paired MR images). The model will take high-dimensional gene expression data as input and predict semantic imaging phenotypes derived from semi-automatically segmented regions-of-interest (ROIs) on MR images. To enable training of the deep networks, the gene expression profile of each patient will be parsed into numerous sparse vectors that encode molecular pathway information. After training, "activation maximization" will be used to extract imaging surrogates from the deep neural network.
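The pathway-based parsing of an expression profile into sparse vectors, described for Aim 1, could be read in several ways. The sketch below shows one assumed reading, in which each pathway yields a vector that keeps the expression values of its member genes and is zero elsewhere. The function name `encode_by_pathway`, the expression values, and the pathway memberships are hypothetical toy inputs for illustration, not the proposal's actual encoding or real annotations.

```python
def encode_by_pathway(expression, pathways):
    """Build one sparse vector per pathway (assumed masking scheme).

    expression: dict mapping gene symbol -> expression value
    pathways:   dict mapping pathway name -> set of member gene symbols
    Returns the fixed gene order and a dict of pathway name -> vector.
    """
    genes = sorted(expression)  # fixed gene order shared by all vectors
    vectors = {}
    for name, members in pathways.items():
        # Keep values for member genes, zero out everything else
        vectors[name] = [expression[g] if g in members else 0.0 for g in genes]
    return genes, vectors


# Hypothetical toy profile and pathway memberships (illustration only)
expression = {"EGFR": 2.5, "TP53": -1.0, "PTEN": 0.3, "MGMT": 1.2}
pathways = {
    "toy_pathway_a": {"EGFR", "PTEN"},
    "toy_pathway_b": {"TP53", "MGMT"},
}
genes, vectors = encode_by_pathway(expression, pathways)
```

Under this scheme every vector has the same length and gene order, so the vectors for one patient can be stacked or fed to separate input branches of a network.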
In Aim 2, a set of common feature selection methods will be applied to the gene expression data. The resulting feature-selected gene data will then be used as input to the deep networks to test whether they derive consistent imaging surrogates. In Aim 3, radiomic features such as histogram and textural features will be extracted from the same ROIs segmented in Aim 1. Instead of using semantic imaging phenotypes as the target output, the deep networks will take gene expression data and predict radiomic features. The inferred genes in each imaging surrogate from Aims 1-3 will be annotated with gene set enrichment analysis (GSEA), using several major biological knowledge bases as reference gene sets. Enrichment scores will be normalized, assessed for significance using permutation testing, and corrected for multiple hypothesis testing using the Benjamini-Hochberg method.

Long-term Objective: The development and systematic evaluation of radiogenomic methods will help characterize the extent of overlapping information across biological scales in multimodal cancer patient data.
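The Benjamini-Hochberg correction named in the Methods can be sketched as follows. This is a minimal illustration of the standard step-up adjustment, assuming a list of permutation-derived p-values, one per gene set. The function name, the example p-values, and the 0.1 threshold are hypothetical and are not taken from the proposal.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical permutation p-values for four gene sets
p_vals = [0.01, 0.04, 0.03, 0.20]
q_vals = benjamini_hochberg(p_vals)

# Flag gene sets whose adjusted p-value falls under an assumed 0.1 threshold
significant = [q <= 0.1 for q in q_vals]
```

In practice an established implementation would likely be preferred over a hand-written one; the sketch only makes the adjustment step explicit.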