It is now widely accepted that both environmental exposures and genetic factors contribute to most diseases. Cancer epidemiology studies often ascertain exposures and behavioral factors (referred to collectively as exposure factors) through survey instruments such as questionnaires. They also typically examine variants in candidate genes or pathways, or blood levels of exposures or markers of exposures (referred to collectively as biomarkers), to identify risk factors for disease or for disease precursors. The responses to questionnaire items may be correlated. Similarly, the biomarkers may be correlated because of common underlying biological function or linkage disequilibrium, and may occur at a low frequency. These features make data analysis challenging. Although analytic methods have been developed to address this challenge, there remains a need to investigate novel approaches for efficient analysis of exposure factors and biomarkers, and to develop optimal study designs for follow-up research. The long-term goals of this proposal are to address these needs through the following four specific aims. In Aim 1 we will examine how to conduct efficient analysis of correlated exposure factors and biomarkers. In Aim 2 we will examine how to evaluate the effects of routes of exposure and underlying biological characteristics in an efficient manner. The etiologic effects of the risk factors are likely to be complex for a number of reasons, including gene-exposure interactions; in Aim 3 we will examine parsimonious analytic methods for evaluating such interactions. In Aim 4 we will develop a novel study design for sampling an informative sub-population from an established cohort to evaluate additional risk factors when evaluation of the full cohort is not feasible because of logistical or cost considerations. Our proposal is motivated by several collaborative studies.
We will conduct simulations to investigate the properties of the methods proposed under each aim, and derive guidelines for their use in practical settings. We will then evaluate the methods and guidelines empirically using data from the collaborative studies. Our proposal has two major objectives. First, we intend to develop and improve research designs and methodologies for evaluating correlated risk factors in cancer epidemiology studies. Second, we intend to assess the practical relevance of these methods by conducting empirical investigations using data from various studies. We anticipate that the insights gained from these investigations will have a significant impact on our understanding of analytic methods and study designs for identifying risk factors of public health relevance in an efficient and parsimonious manner.

PUBLIC HEALTH RELEVANCE: It is widely accepted that both environmental exposures and genetic factors contribute to most diseases. Cancer epidemiology studies often ascertain exposures and behavioral factors through survey instruments such as questionnaires, and examine susceptibility variants in candidate genes, or blood levels of exposures or markers of exposures, to identify risk factors for disease traits. The responses to questionnaire items may be correlated. Similarly, the genetic variants may be correlated because of their common underlying biological function, and may occur at a low frequency. These features make data analysis challenging. The long-term goals of this proposal are to investigate novel methods for efficient analysis of correlated risk factors, and to develop optimal study designs for follow-up research.