Esophageal adenocarcinoma (EA) is an often lethal cancer and a significant public health concern in the U.S. Several risk factors have been identified for EA and its precursor, Barrett's esophagus (BE), including symptomatic gastroesophageal reflux disease (GERD), central adiposity, smoking, male sex, and European ancestry, but our understanding of the molecular/genetic determinants that govern why most people with reflux never develop BE, and most with BE never progress to EA, remains limited. Inherited genetics represents another likely contributor to disease susceptibility, and recent genome-wide association studies (GWAS) within the Barrett's and Esophageal Adenocarcinoma Consortium (BEACON) have identified multiple novel germline genetic variants associated with altered risk of BE/EA. As observed for other cancers and complex diseases, however, the majority of estimated heritability for BE and EA remains unexplained. One plausible source of this "missing heritability" lies in the joint effects of multiple loci acting in combination. We seek to identify novel gene-gene interactions (GG) in relation to risk of BE/EA, using a novel fusion of advanced analytical approaches aimed at overcoming technical limitations often encountered in large-scale GG analyses (e.g., computational complexity and low statistical power). In a discovery scan (Aim 1), we will first use two complementary knowledge-guided filtering methods to reduce the dimensionality of the GG search space: first, we select ~500 genes with known or hypothesized relationships to BE/EA etiology and pathogenesis based on findings from GWAS, TCGA, and related studies; second, we select ~50,000 gene-gene pairs with the highest-likelihood annotated biological relationships based on an integrated knowledge resource of pathways, processes, and interactions (Biofilter).
Using existing germline genetic data on 2515 EA patients, 3295 BE patients, and 3207 controls from BEACON, we will next adopt a recently described "hierarchical group-lasso" machine learning framework to select, in a joint model, the ~1000 candidate pairwise SNP-SNP interactions most predictive of case status. These interactions will then be evaluated with standard logistic regression models in an independent validation dataset of 1609 EA cases, 1037 BE cases, and 3537 controls (Aim 2). In exploratory work, we will perform stratified analyses by sex or smoking history. The proposed studies are anticipated to expand our basic understanding of key biological pathways and mechanisms underlying the emergence of BE/EA; generate new leads and possible candidate targets for interventional strategies; and expand the foundation for future collaborative initiatives to integrate genetic, epidemiologic, and clinical data into robust disease models and novel predictive tools for BE/EA risk assessment and precision prevention.