Recent advances in genomic sciences have significantly changed the landscape of environmental health science research. Collection of high-throughput genomic data has become increasingly important for investigating the interplay of genes and environment in causing human diseases in environmental case-control and cohort studies. Analysis of such high-dimensional gene-environment data presents substantial statistical and computational challenges, especially in investigating gene-environment interactions. Few statistical methods have been developed in this area so far, and this methodological shortage has become a bottleneck for effectively studying the roles of genes and their interactions with the environment in causing human diseases. This proposal responds to this need by developing advanced semi-parametric statistical methods to analyze high-throughput data from gene-environment studies. We plan (1) to develop semi-parametric locally efficient methods for double-robust estimation, in a case-control study, of a model for the joint effect of a genetic factor, an environmental exposure, and multiple extraneous confounding factors; (2) to develop semi-parametric methods for multiple-robust estimation, in cohort and case-control studies, of a model of the interaction between a genetic factor and an environmental exposure in their effect on a binary disease outcome; (3) to develop semi-parametric methods for double-robust inference on genetic effects, incorporating gene-environment interaction and confounding adjustment, in a Cox proportional hazards model for censored survival data; and (4) to develop efficient, open-access, user-friendly algorithms and statistical software that implement these methods, with the goal of disseminating them freely to the gene-environment research community. In addition, we will evaluate the performance of our methods in simulation studies and in three ongoing genome-wide association studies (GWAS) in which we are involved.
PUBLIC HEALTH RELEVANCE: The proposed project will develop cutting-edge methods for discovering novel genes and gene-environment interactions while efficiently incorporating prior knowledge. These methods promise a significant impact on public health by providing improved methodology for robust investigation of the interplay of genes and environment in causing human diseases in environmental case-control and cohort studies.