The primary long-term goal of this project is the development of more efficient statistical designs and of more efficient and flexible statistical methods of analysis for both analytic and descriptive studies in cancer epidemiology. A secondary goal is the rapid dissemination of research results in a form suitable for assimilation and use by epidemiologists who may lack advanced technical training. Epidemiologic studies play a major role in the identification of carcinogenic agents and in the quantification of the dose-time-response relationships upon which regulation and prevention strategies are based. Epidemiology as a science depends critically upon statistical concepts of design and methods of statistical analysis, which this project aims to improve. The specific aims are to develop efficient and computationally feasible statistical methods for three problems: (i) estimation of random effects and variance components in generalized linear mixed models; (ii) analysis of data from two-stage case-control studies and other stratified epidemiologic designs; and (iii) analysis of epidemiologic data with missing covariable or exposure information. Random effects or mixed models for epidemiologic data, especially data in the form of counts, proportions, or ordinal responses, are of increasing importance for: (i) incorporation of historical control information in case-control studies; (ii) accounting for inter-institutional variation in multi-center studies; (iii) recovery of "interstratum information" in finely stratified analyses; (iv) smoothing of cancer incidence rates for construction of disease maps; (v) exploratory regression analyses with smoothing based on auto-regressive models; (vi) combination of relative risk estimates from independent studies (meta-analyses); and (vii) clinical epidemiological prediction of individual responses to therapeutic or preventive interventions.
Two-stage case-control studies and other similar stratified designs are of great value in limiting the collection of costly covariable data to the subsets of cancer patients and controls who are most informative regarding the association between cancer and specific exposures. The goal of such designs, and of appropriate efficient methods of analysis, is the collection of precise scientific data at the lowest possible cost. Finally, missing covariable and exposure information is a pervasive problem in cancer epidemiology. Standard approaches based on "complete case" analyses may be biased and inefficient. Recent work on multiple imputation techniques, first proposed and used for sample surveys, promises to improve the analysis of epidemiologic data, provided that these techniques can be adapted to account for generally smaller sample sizes. The methods used to achieve these goals will include mathematical and statistical analysis, computer simulation, and application of newly developed methods to important datasets collected by cancer epidemiologists.