Colorectal cancer (CRC) is the 2nd leading cause of cancer death nationally, and the 3rd most commonly diagnosed cancer among Veterans. To reduce cancer risk, small growths in the colon called polyps found at colonoscopy are routinely removed. Current guidelines recommend repeat colonoscopy in 3, 5, or 10 years based on select features of the polyps removed. However, the current approach does not predict cancer risk accurately. Late colonoscopy (in 5 or 10 years) is often recommended for individuals who go on to develop cancer or high-risk polyps. Conversely, early colonoscopy (in 3 years) is often recommended for individuals who go on to develop only low-risk findings. The result is suboptimal cancer prevention. The overall goal of this project is to develop a new, more personalized and comprehensive strategy for assessing risk for new polyps and CRC after initial polyp removal, including patient factors (such as age), baseline polyp factors (such as number, size, location), and quality factors (such as average polyp detection rate of the doctor performing colonoscopy). To develop the strategy, national VA colonoscopy and medical record data will be accessed to identify Veterans who have had polyp removal and at least one follow-up colonoscopy between 1999 and 2012. At least 30,000 Veterans are expected to meet these criteria. Next, computerized Natural Language Processing (NLP) techniques will be developed to extract risk and outcome data of interest from colonoscopy and pathology records. These innovative techniques are required because the most valuable information for risk prediction is available only in free-text format within these clinical reports. The alternative approach to data extraction (manual review of each Veteran's medical chart) is impractical, and it is for this reason that research in this area has not previously been possible on a large scale.
Application of these NLP techniques will allow creation of a large, representative dataset of all Veterans who have had colonoscopy with polyp removal. In the third part of this research, a statistical risk stratification strategy to predict risk for polyps and CRC after initial polyp removal will be developed using this dataset. Performance of the new strategy will be compared to current guidelines for predicting risk for CRC and high-risk polyps after initial polyp removal. The project is significant because Veterans are at high risk for CRC, but strategies for managing cancer risk are suboptimal. The project is innovative because we will apply cutting-edge NLP methods to make use of data that is representative of all Veterans who have had polyp removal within the VA, and develop risk prediction models that go beyond current guidelines by using more personalized risk measures. The research team's expertise and significant prior work specific to CRC and polyps, and the rigorous approach proposed, ensure that the project is feasible and will be successful. Ultimately, investment in this Merit Review has great potential to improve CRC prevention for Veterans, and beyond.
CRITIQUE 1
1. Significance. NCCN guidelines take into account age (>50) and family history, and recommend a 10-year interval only for patients without a family history (FH) in whom no polyps or only hyperplastic polyps are identified. For patients with an adenomatous polyp removed, the recommendation varies between 3 years and 5 years based on polyp number, size, and histology (villous or presence of high-grade dysplasia). The assertion that colonoscopy is suboptimal for cancer prevention is true in part, but this is primarily due to access issues and compliance with screening, and less so to surveillance intervals.
The idea of a personalized interval is already in place and, while a little more difficult in the veteran population, providers still make their recommendations based on individual factors in concert with national guidelines. Efforts to develop a computerized entry form to quantify and record needed information seem more important.
2. Approach. It is unclear whether Aim 1 will use pathology records to evaluate only adenomatous polyps; if so, the recommendations differ by only 2 years. Accepting a low detection rate from a provider or site, or using it as a risk factor to increase screening frequency, seems like a workaround; more direct quality interventions to monitor and raise detection rates seem more valuable. The primary endpoint consists of polyp size, polyp number, and histology. The discrepancy between endoscopic visual size and pathologic size needs to be worked out.
3. Impact and Innovation. Given that the utilization of colon cancer screening is only around 50% nationally (2005 National Health Interview Survey), and given documented capacity issues at some VAs that have limited screening colonoscopy (fee basis) and prioritized therapeutic colonoscopy, any study findings calling for increased screening may go unfulfilled. This study will evaluate veterans who have had at least two colonoscopies.
4. Investigator Qualifications, and Facilities and Resources. This is an experienced team.
5. Multiple PI Leadership Plan. N/A
6. Adequacy of Response to Previous Feedback Provided by HSR&D Regarding the Proposed Study.
7. Protection of Human Subjects from Research Risk. Adequate
8. Inclusion of Women and Minorities in Research. Adequate
9. Budget. Adequate
10. Overall Impression.
11. Key Strengths.
   1. Colon cancer prevention is an important area.
   2. Good collaborative team with extensive experience.
12. Key Weaknesses.
   1. This seems like a validation of NLP in a large cohort.
   2. Information on preventative medications such as aspirin, and on other risk factors such as red meat consumption and BMI, is not addressed.