Methods for Genetic Epidemiology

We studied the robustness of the case-only design for detecting gene-environment (G-E) interactions and found that estimates can be seriously distorted if the environmental exposure and the gene are associated in the general population. We analyzed the conditions required for hospital controls to yield unbiased estimates of G-E interactions in hospital-based case-control studies. Ideally, the control diseases should not be influenced by either E or G, and there should be more than one control group.

We studied the strengths and weaknesses of the kin-cohort design for estimating the penetrance of an autosomal dominant gene, and we developed marginal methods of analysis for kin-cohort data that are robust to residual familial correlations. Both these marginal methods and full maximum likelihood procedures were developed to produce monotone estimates of cumulative risk in subjects with and without a dominant mutation. We also developed bivariate cure models to study survival data from pairs of members of randomly selected families.

We developed methods to evaluate risks from environmental factors in families selected for genetic studies because they have two or more diseased members. These methods, which are based on random effects models, take ascertainment and genetic correlations into account and avoid the biases of conventional analyses that ignore these features.

Three projects were completed to increase the power of concordant or discordant sib pair designs for detecting genetic linkage. One test gives greater weight to more severely discordant pairs. A second project studied optimal designs. A third procedure maximizes the minimum power over a set of tests designed to detect different plausible alternatives. We studied sample size requirements for family-based association studies comparing affected with unaffected sibs.
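As an illustration of the case-only finding reported at the start of this section, the sketch below uses the standard identity that the case-control interaction odds ratio equals the G-E odds ratio among cases divided by the G-E odds ratio among controls. The case-only design reports only the numerator, so it is off by exactly the G-E odds ratio among controls. All counts and function names are hypothetical and chosen for easy arithmetic; this is not the analysis performed in the project.

```python
# Minimal sketch: how a G-E association in the source population distorts
# the case-only estimate of a multiplicative G-E interaction.
# Tables are dicts keyed by (g, e), with g and e each 0 (absent) or 1 (present).
# All counts below are hypothetical.


def odds_ratio(table):
    """Cross-product G-E odds ratio from a 2x2 table of counts."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


def case_only_estimate(cases):
    """Case-only interaction estimate: the G-E odds ratio among cases."""
    return odds_ratio(cases)


def case_control_interaction(cases, controls):
    """Interaction odds ratio using both cases and controls."""
    return odds_ratio(cases) / odds_ratio(controls)


cases = {(0, 0): 200, (0, 1): 100, (1, 0): 50, (1, 1): 75}

# Controls in which G and E are independent (G-E odds ratio of 1).
controls_independent = {(0, 0): 400, (0, 1): 200, (1, 0): 100, (1, 1): 50}

# Controls in which G and E are associated (G-E odds ratio of 2).
controls_associated = {(0, 0): 400, (0, 1): 200, (1, 0): 100, (1, 1): 100}

print(case_only_estimate(cases))                               # 3.0
print(case_control_interaction(cases, controls_independent))   # 3.0
print(case_control_interaction(cases, controls_associated))    # 1.5
```

With independent controls the two estimates agree. With associated controls the case-only estimate (3.0) is twice the case-control interaction estimate (1.5), which is the kind of distortion described above.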
We also developed robust procedures to assure good power over a range of inheritance models for association studies based on the transmission disequilibrium test.

We completed work on statistical methods for analyzing pooled DNA samples. Previous work has shown this approach to be efficient, compared to unpooled designs, for estimating prevalence and identifying individuals with a particular rare allele. The present work extends these methods to the estimation of the joint prevalence of two or more alleles. Joint prevalences have application to estimating risks from joint exposures and to estimating the population disequilibrium coefficient. We developed statistical techniques for discriminating segments of DNA with mutations from normal segments using data from denaturing high-pressure liquid chromatography.

Methods for Design and Analysis of Case-Control and Cohort Studies

We described procedures for estimating variances for relative risk estimates from the case-cohort design and proposed adaptations to handle missing covariates. We began related work on methods of inference for absolute risk and for attributable risk estimated from such studies.

Exposure Assessment, Errors in Exposure Measurements, and Missing Exposure Data

We developed a method based on splines to estimate the contribution to current cancer risk of various portions of the previous exposure history. This technique was used to extend bilinear weighting methods to analyze lung cancer data from the Colorado Uranium Plateau Miners Study. The excess relative risk from exposure reached a maximum 14 years before the subject's current age.

A detailed investigation of the impact of measurement error of the dose of head and neck irradiation on estimated risks of thyroid cancer took into account Berkson-type errors from the use of phantom "external prediction data", classical error from the use of regressions to predict dose, and missing data.
This analysis changed estimates of the role of age at irradiation somewhat, but there was little change in the estimate of excess relative risk per unit dose of radiation.

Assay Sensitivity and Specificity and Marker Prevalence

We showed that a procedure of Hui and Walter for estimating prevalences, sensitivity, and specificity in the absence of a gold standard, from repeated measurements in two populations with differing prevalences, is robust to violations of the assumption of common error rates. We used mixture models to estimate the prevalence of conditions for which there is no gold standard. These ideas have been applied, for example, to estimate the prevalences of various strains of SEN virus (SENV).

Other Work

We reviewed meta-analytic methods for analyzing data on surrogate markers to estimate the effect of treatment on a true clinical endpoint, and we proposed a research plan to validate surrogate endpoints. One investigator developed a suite of MATLAB programs that facilitate the use of this language for sophisticated statistical and epidemiological analyses. We calculated the information content of various ranges of order statistics for estimating location and scale parameters. We developed methods for adjusting standardized mortality ratios for confounders and made these available as computer code in the program Epicure.
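For readers unfamiliar with the standardized mortality ratio (SMR), the sketch below shows the textbook calculation by indirect standardization: observed deaths divided by the deaths expected from reference rates. Here a confounder is handled simply by adding it as a stratification variable. This is a generic illustration only. It is not the adjustment method developed in this project and not the Epicure implementation, and the strata, person-years, rates, and function names are all hypothetical.

```python
# Minimal sketch of an SMR by indirect standardization.
# Strata are keyed by (age band, smoking status); smoking plays the role of
# the confounder being adjusted for. All numbers are hypothetical.


def expected_deaths(person_years, reference_rates):
    """Sum over strata of cohort person-years times the reference death rate."""
    return sum(py * reference_rates[stratum] for stratum, py in person_years.items())


def smr(observed, person_years, reference_rates):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    expected = expected_deaths(person_years, reference_rates)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected


# Cohort person-years at risk in each stratum (hypothetical).
person_years = {
    ("40-59", "nonsmoker"): 2000,
    ("40-59", "smoker"): 1000,
    ("60-79", "nonsmoker"): 1500,
    ("60-79", "smoker"): 500,
}

# Reference population death rates per person-year in each stratum (hypothetical).
reference_rates = {
    ("40-59", "nonsmoker"): 0.001,
    ("40-59", "smoker"): 0.003,
    ("60-79", "nonsmoker"): 0.004,
    ("60-79", "smoker"): 0.010,
}

observed = 24  # deaths observed in the cohort (hypothetical)

print(expected_deaths(person_years, reference_rates))  # approximately 16
print(smr(observed, person_years, reference_rates))    # approximately 1.5
```

Stratifying on the confounder requires reference rates specific to each confounder level, which are often unavailable in practice.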