Project Summary/Abstract

The proposed project responds to PAR-18-021, NCI Small Grants Program for Cancer Research (NCI Omnibus R03). It is motivated by the resources of the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO). Our primary interest is in developing and applying innovative statistical methods for missing cancer subtype data. Many diseases, such as colorectal cancer (CRC), are heterogeneous. Molecular characterization of tumors has provided evidence of multiple tumor subtypes that develop through activation of diverse neoplastic pathways. Important CRC tumor subtypes are defined by microsatellite instability (MSI) status, somatic mutations in BRAF and KRAS, and CpG island methylator phenotype. For instance, MSI status is associated with survival outcomes and treatment response. However, MSI status and other tumor biomarkers may be unknown for some individuals. Such missing data in part of the study cohort pose a challenge for regression analysis, and missing-data methodology is often required to address bias in effect estimation and loss of efficiency. The specific aims of this proposal are: (i) To develop and apply methods that account for missing cancer subtype data in multinomial logistic regression. (ii) To develop and apply methods for survival analysis among cancer cases in which multiple tumor biomarkers may be missing. The methods developed in this proposal are applicable to GECCO and to other studies in which cancer subtype data are unknown for some study individuals. They can also be applied to other common study designs and analyses, such as nested case-control designs and Cox regression with competing risks.