The goal of this research is to identify novel susceptibility genes for lung cancer using a systematic genome-wide association-based approach. To achieve this goal, we are applying a two-stage design. This proposal builds upon an extensive resource of lung cancer cases and controls, matched for smoking behavior, sex, and ethnicity, that has been assembled over many years at the U.T. M.D. Anderson Cancer Center. It also builds on a recently approved proposal to the Center for Inherited Disease Research, under which the Center is genotyping 317,000 markers in 1200 Caucasian ever-smoking lung cancer cases and 1200 controls matched for ethnicity, age, smoking behavior, and sex. Our first aim is to analyze data from the completed first stage of genotyping. For our second aim, we will complete the two-stage design, which is powered to detect associations with genotypic risk ratios of about 1.4 at a very stringent significance criterion of 1.7 x 10^-7. We will genotype and analyze 3380 SNPs in 1200 lung cancer cases and 1200 controls. We will also analyze SNPs from 90 candidate genes identified through a systematic literature search, as well as a limited number of SNPs in significant pathways. In our third aim, we will genotype 400 never-smoking lung cancer cases and 400 controls for 946 SNP markers. To evaluate possible interethnic differences in the genetic control of lung cancer risk, in our fourth aim we will analyze 400 African-American lung cancer cases and 400 African-American controls for 946 SNP markers. In these four aims we are also including a panel of SNPs to estimate ancestry. In our fifth aim, we will use machine learning approaches to develop models for characterizing gene-gene and gene-environment effects on lung cancer risk. Cases and controls have extensive epidemiological and family history data, together with candidate gene and functional DNA repair measures.
The end result of our analysis will be a substantially better understanding of the genetic etiology of lung cancer and of the gene-environment interactions that contribute to its causation. Our experienced investigative team is using state-of-the-art technology to perform a multistage genome-wide analysis of a large, uniquely well-characterized population of lung cancer cases and matched controls with rich functional data. We are collaborating with the International Lung Cancer Consortium so that it can further validate our findings in a variety of populations in future initiatives.