Survival outcomes for lung cancer, the leading cause of cancer-related mortality in the United States, remain poor. Improving lung cancer survival requires a multi-pronged approach: smoking cessation, better elucidation of gene-environment interactions in risk, identification of promising new drug targets, and identification of prognostic and predictive markers that optimize treatment for patients. Our molecular epidemiology group has investigated the association between candidate germline polymorphisms and survival in lung cancer since the 2002 initiation of this study, and we have made significant contributions to understanding genetic and other markers of non-small cell lung cancer (NSCLC) survival. In this competing renewal, we will employ high-density genome-wide genotyping and epidemiologic approaches to identify better prognostic and predictive genetic markers of NSCLC survival. The ultimate goal of identifying such markers is to find ways to select the best treatment course for each individual patient. While the candidate approach has the strength of being grounded in biologic hypotheses, it is limited to variants in genes already hypothesized to be relevant. A newer approach is to use genome-wide scans, which capitalize on advances in high-throughput technology and on the completion of projects such as the HapMap project to allow assessment of variation across the human genome. In this renewal, we will take advantage of the large collection of clinical and epidemiologic data and over 3,000 biospecimens from the parent project begun in 1992. For gene discovery, we will use the new Illumina 610 Quad platform for genome-wide SNP genotyping. Once we have identified high-priority SNPs, we will seek to replicate these findings in our larger case-cohort, as well as in separate external validation sets from 4 collaborating centers. Although our primary endpoint will be overall survival (OS), we will also assess disease-free survival (DFS) and progression-free survival (PFS), where appropriate.
In Phase 1, we will use the new Illumina 610 Quad (610,000 SNPs and 60,000 copy number variants) among 1,000 lung cancer cases in the parent study to identify the most promising SNPs that show evidence of association with lung cancer survival. In Phase 2, the top 3,000 SNPs from this discovery phase will be selected for further validation/replication in the remainder of patients. Then, in Phase 3, we will assess the top 100 SNPs (50 each for early and late stages) in 3 external case cohorts with a minimum of 5 years of follow-up information: MD Anderson, U. of Toronto, and Mayo Clinic. To extend our findings to another case-cohort of different ethnicity and maximize the capture of relevant SNPs, we will perform Phase 2 (2 GoldenGate 1536-SNP arrays) and Phase 3 (top 100 SNPs; 50 for early and 50 for late stage) in a Chinese case cohort in Nanjing, China. Finally, we will conduct exploratory functional assays to assess the variants' effects on gene expression in the final set of replicated candidates. This will be the largest and most complete genetic analysis of NSCLC to date, and will move the field significantly towards the goal of more effective, personalized therapy for NSCLC patients.
PUBLIC HEALTH RELEVANCE: Survival outcomes for lung cancer, the leading cause of cancer-related mortality in the United States, remain poor, and genetic determinants of survival are not well-defined. The proposed molecular epidemiologic study will employ genome-wide scanning approaches, with multiple replicates, to identify better prognostic and predictive genetic markers of NSCLC survival. The ultimate goal of identifying such markers is to find ways to select the best treatment course for each individual