Hierarchical Modeling of Interactions in Genome-Wide and Pathway-Based Association Studies: The overarching goal of this grant is to investigate the use of hierarchical modeling in the study of gene-environment (GxE) and gene-gene interactions in both genome-wide association studies and pathway-based candidate gene studies. We will apply our methods to data available from two large NIH-supported projects: the Colon CFR and the Children's Health Study. Data from these association studies often follow a natural hierarchical structure, with polymorphisms within gene regions, genes within sub-pathways, and sub-pathways within etiologic networks. By building a statistical model that reflects this natural hierarchy, we aim to better account for the dependencies between factors and to better incorporate our knowledge of the underlying etiology. In addition to evaluating the statistical form and structure of such models, we aim to gauge the impact of various types of prior information and intermediate measurements on inference. For genome-wide association studies, we will develop analytic approaches for incorporating GxE interactions that deal with the multiple testing problem and extend to potentially more efficient two-phase study designs. These methods will also be extended to test for the interaction of an environmental factor with multiple SNPs. For pathway-based studies specifically, we aim to explore the feasibility and performance of mechanistic models (e.g., kinetic models) and hierarchical regression models with model selection for genes and environmental factors within sub-pathways and across networks of pathways. We will use prior knowledge in the form of ontologies or expert-based relational databases to help formulate priors for the data analysis. 
Furthermore, we will investigate various multistage sampling schemes and their interplay with potential genomics data, including whole-exon expression, whole-genome somatic mutations, and potential biomarker measures. Finally, we will compare our methods to various data mining techniques to allow genes to act within multiple pathways. Overall, we aim to develop statistical techniques that make it feasible to detect which genes are involved in disease and, importantly, in which environmental context they act. By identifying both genetic and environmental factors, we will make progress in understanding the underlying mechanism that leads to disease and potentially identify ways to both prevent and treat complex diseases. PUBLIC HEALTH RELEVANCE: Overall, we aim to develop statistical techniques that formally incorporate our biologic knowledge and make it feasible to detect which genes are involved in disease and, importantly, in which environmental context they act. By identifying both genetic and environmental factors, we will make progress in understanding the underlying mechanism that leads to disease and potentially identify ways to both prevent and treat complex diseases.