A study assessed the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders (e.g., the propensity score or disease risk score), instead of the confounders themselves, are used to analyze observational data. The study evaluated regression models for cohort, case-control, and matched case-control studies that are adjusted for (and matched on) summary scores, and derived the asymptotic bias. Another study evaluated an omnibus goodness-of-fit test for linear mixed models (LMMs) that computes a quadratic form of the differences between the observed values and the expected values computed from the model within cells of a partition of the covariate space. It showed that, under some mild conditions, the test statistic has an asymptotic chi-square distribution, and it derived analytic expressions for the power of the test under a local alternative. A new method was developed for determining subtypes of disease that share common risk factors. This methodology was applied in the International Lymphoma Epidemiology Consortium to show strong differences in the risk profiles for B-cell and T-cell lymphomas. A new method was also developed for estimating the proportion of disease heritability that can be attributed to genetic causes of mediating risk factors; applying this methodology showed that approximately 24% of lung cancer heritability and 7% of bladder cancer heritability can be attributed to the genetic determinants of smoking. A study developed a mixture model for estimating risk from screening data that separates the risk of disease present at baseline from the risk of onset of incident disease; standard Kaplan-Meier estimates are biased in this situation. Another study developed a new framework for risk stratification called mean risk stratification (MRS), which is the average amount of extra disease that a diagnostic test reveals for a patient.
Using MRS, it was shown that a big risk difference does not imply good risk stratification for tests that are rarely positive, that a large Youden's index (or AUC) does not imply good risk stratification if disease is too rare, and that the expected benefit of a diagnostic test is a function of the test solely through MRS. A report showed that measures of association for molecular studies that use material from tumors to detect infection in cases can over- or underestimate the relationship between infections and subsequent cancer risk. A statistical procedure was proposed that improves the efficiency of the logistic regression model for a case-control study by utilizing auxiliary information on covariate-specific disease prevalence via a series of unbiased estimating equations. A method was developed for constrained maximum likelihood analysis for model calibration using external summary-level information from big-data sources.

A number of studies in statistical genetics and genomics were conducted. A robust statistical procedure was developed to identify genetic risk factors that have either a uniform effect for all disease subtypes or heterogeneous effects across different subtypes. A test for genetic association was proposed that can account for heterogeneity in genetic effects due to gene-environment interactions under alternative models. A study developed extensions of various methods for testing gene-environment interactions to account for imputed genotype data. A study developed a method for estimating the effect-size distribution using summary-level results from GWAS. Application of the method to results from large GWAS of several diseases indicates a highly polygenic architecture of complex traits, involving thousands to tens of thousands of susceptibility SNPs.
Several studies are ongoing to investigate how to improve the performance of polygenic risk prediction models by incorporating summary-level results from large GWAS and various types of prior information on the effect-size distribution. A likelihood-based test and a valid method for type I error evaluation were developed for mutual exclusivity analysis in the detection of cancer driver genes. The methods were developed and applied to the analysis of data from The Cancer Genome Atlas (TCGA) project, leading to the identification of a number of novel driver genes.
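
The mean risk stratification (MRS) results summarized above can be illustrated numerically. The following is a minimal sketch, not the study's own implementation. It assumes MRS is the average absolute difference between a patient's post-test risk and the pre-test risk (the disease prevalence), which is one reading of "the average amount of extra disease that a diagnostic test reveals." The function name `mean_risk_stratification` and its arguments are hypothetical. Under that assumed definition, MRS works out algebraically to 2 x (test positivity) x (1 - test positivity) x (risk difference), and equivalently to 2 x prevalence x (1 - prevalence) x (Youden's index), which would explain why a rarely positive test or a very rare disease yields a small MRS.

```python
def mean_risk_stratification(sensitivity, specificity, prevalence):
    """Illustrative MRS for a binary test (assumed definition, see lead-in).

    MRS is taken here to be the mean absolute change from pre-test risk
    (the prevalence) to post-test risk, averaged over test results.
    """
    # Probability that the test is positive in the whole population.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    if not 0 < p_pos < 1:
        raise ValueError("test positivity must be strictly between 0 and 1")

    # Post-test risks of disease given a positive or a negative result.
    risk_pos = sensitivity * prevalence / p_pos
    risk_neg = (1 - sensitivity) * prevalence / (1 - p_pos)

    # Average absolute change in risk revealed by the test.
    mrs = (p_pos * abs(risk_pos - prevalence)
           + (1 - p_pos) * abs(prevalence - risk_neg))

    return {
        "positivity": p_pos,
        "risk_difference": risk_pos - risk_neg,
        "youden": sensitivity + specificity - 1,
        "mrs": mrs,
    }


if __name__ == "__main__":
    # Same test accuracy (same Youden's index), common versus rare disease.
    for prevalence in (0.2, 0.001):
        print(prevalence, mean_risk_stratification(0.9, 0.9, prevalence))
```

Running the sketch with the same sensitivity and specificity at two prevalences keeps Youden's index fixed while MRS shrinks sharply for the rare disease, consistent with the statement that a large Youden's index does not imply good risk stratification when disease is too rare.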