An important challenge in molecular epidemiology is to develop methods for studying multiple genes and multiple environmental factors as interacting contributors to disease. The overall goal of this proposal is to develop novel approaches to this problem, using folate metabolism as an important candidate pathway in colorectal carcinogenesis and as a prototype for other candidate biochemical and metabolic pathways, and to apply them to several datasets on colorectal cancer and colorectal adenomas. We have the following specific aims:

1. Develop a modeling framework incorporating biological understanding of the structure of a complex pathway, illustrated by folate metabolism. This will involve extending our differential equations model for folate metabolism, using the predictions of this in silico model to derive prior covariates for our hierarchical modeling framework, and developing an integrated approach that allows for model uncertainty. These methods will be illustrated and contrasted with purely exploratory methods using data from the Colon Cancer Family Registry (C-CFR) folate study, an adenoma case-control study, and a randomized adenoma prevention trial.

2. Extend this framework to exploit biomarker measurements of intermediate metabolite concentrations and enzyme activity rates on a subsample of subjects. We will develop analytical methods and explore optimal sampling schemes that stratify on various combinations of disease, exposure, and genotypes available for the entire study population. These approaches will be illustrated with already available data on homocysteine levels in the adenoma case-control study, additional biomarkers to be measured under a pending C-CFR grant application, and two longitudinal studies.

3. Organize biological knowledge for the folate pathway into a formal ontology. We will develop systematic methods for extracting prior covariates from an ontology for use in our hierarchical modeling framework, and will explore ways of incorporating information on the evolution of pathways across species.

4. Extend this candidate pathway approach to the genome-wide scale. We will explore methods for inferring pathways from genome-wide data and for using genome-wide association data to inform pathway-based analyses. These methods will be applied to data from the ongoing C-CFR GWAS.

The four example datasets we propose were chosen in part to illustrate a range of designs, including family-based, population-based case-control, longitudinal, and randomized trial.

PUBLIC HEALTH RELEVANCE: The overall goal of this project is to develop statistical methods for modeling the effects of metabolic pathways involving multiple interacting genes and environmental factors on complex human traits, incorporating biomarker measurements and leveraging ontologies generated from external biological and genomic information. Our methods will initially be developed for candidate gene studies and later extended to the genome-wide scale. They will be applied to observational and experimental studies of folate metabolism in relation to colorectal adenomas and colorectal cancer, and are expected to yield improved models for predicting individual disease risks and insight into the underlying biological mechanisms.