PROJECT SUMMARY/ABSTRACT
Lung cancer is the leading cause of cancer-related death in both men and women in the United States. Currently, approximately 70% of lung cancer patients are diagnosed at advanced stages, and the 5-year survival rate for advanced-stage lung cancer is very low, at only 16%. Investigators have been searching for effective screening modalities for the early detection of lung cancer so that patients can receive curative treatment at an early stage. After the National Lung Screening Trial (NLST) demonstrated the effectiveness of low-dose computed tomography (LDCT) for lung cancer screening (LCS), researchers and physicians hoped to save lives by screening the high-risk population: individuals aged 55 to 77 years with a 30 pack-year smoking history, including former smokers who have quit within the past 15 years. Since the release of the landmark NLST results, many medical associations have published guidelines recommending LDCT-based screening for individuals at high risk for lung cancer, and the Centers for Medicare and Medicaid Services (CMS) has decided to cover LCS for Medicare beneficiaries who are at high risk for lung cancer. Although many efforts have been made to accelerate the dissemination of LCS, concerns over high false positive rates (96.4% of the positive results), invasive diagnostic procedures, postprocedural complications, and health care costs may hinder its utilization. These concerns were magnified as researchers and policymakers began to question whether complication and false positive rates in real-world settings would be even higher than those reported in the NLST, which was conducted in a setting with well-established facilities and proficiency in cancer care. Therefore, we propose to examine the contemporary use of lung cancer screening and its associated health care outcomes and costs using data from a real-world setting.
Our study has three goals: 1) to develop an innovative computable phenotype algorithm to identify high-risk and low-risk individuals for LCS from both structured and unstructured (i.e., clinical notes) electronic health record (EHR) data, and to develop advanced natural language processing (NLP) methods to extract LCS-related clinical information from clinical notes such as radiology reports; 2) to determine the appropriate and inappropriate use of LDCT among high-risk and low-risk individuals in Florida and to examine LDCT test results and the rates of invasive diagnostic procedures, postprocedural complications, and incidental findings in real-world settings; and 3) to develop and validate a microsimulation model of the clinical course of LCS that incorporates real-world LCS data to estimate the long-term benefits and cost-effectiveness of LCS. Our proposed study has the potential to reduce lung cancer incidence and mortality by informing policymakers and practitioners about the appropriateness of contemporary LCS use. This knowledge will help both patients and physicians better understand the harm-benefit tradeoff of lung cancer screening and translate that understanding into practice to prevent avoidable postprocedural complications.