We published data showing that case-control designs to detect interactions between environmental exposures and genetic factors require very large sample sizes, particularly when genes or exposures are rare or when exposures are subject to measurement error. We developed formulas to calculate the sample sizes needed to detect gene-environment interactions and developed a computer program for distribution to epidemiologists to facilitate the design of such studies. We presented sample size calculations for the kin-cohort design for estimating the penetrance of an autosomal dominant gene, and we studied the strengths and weaknesses of this design. We developed a technique for kin-cohort studies to detect residual familial correlation among phenotypes after accounting for correlations due to an autosomal dominant gene. Work began on marginal methods of analysis for kin-cohort data that are robust to such residual familial correlations. We used the kin-cohort design to assess the effect of BRCA1/2 mutations on survival after breast cancer incidence. Another project developed methods to evaluate risks from environmental factors using samples of families selected for genetic studies. These methods take ascertainment and genetic correlations into account. An empirical study indicated that case-control assessments of the association of a candidate gene with disease status are unlikely to be confounded by population stratification, namely by association of the gene with a subpopulation that is also highly susceptible to the disease. Two projects were completed to increase the power of concordant or discordant sib pair designs for detecting genetic linkage. One test gives greater weight to more severely discordant pairs. Another procedure maximizes the minimum power over a set of tests designed to detect different plausible alternatives.
We developed two methods for estimating cancer prevalence from cancer registry data and compared the efficiency and assumptions underlying the two approaches. We developed a statistic, the relative risk standard deviation (RRSD), to detect cancers whose mortality rates exhibit large geographic variability. The RRSD was found to be large for skin cancers in black men and women, and subsequent work indicated that exposure to ultraviolet radiation increases risk in black men. Tabulations of the RRSD are included in a recent NCI Atlas of Cancer Mortality. We developed a sliding time window procedure to determine what portion of a time-varying exposure most influences current risk of disease. We also developed a method based on splines to estimate the contribution to current cancer risk of various portions of the previous exposure history. In an application to case-control data on lung cancer in Germany, cigarette smoking within the previous two to ten years was most predictive of current lung cancer risk. We developed an efficient design to validate disease outcome status in post-marketing surveillance studies that use large databases to detect adverse effects of drugs. We developed pseudo-likelihood methods to analyze a population-based case-control study with controls obtained by cluster sampling. These methods will be used to estimate relative risk and absolute risk of non-melanoma skin cancer. We developed meta-analytic methods to analyze data on surrogate markers to estimate the effect of treatment on a true clinical endpoint. The procedure relies on previous similar studies with measurements of both the surrogate endpoints and true clinical endpoints to develop information on the relationships among treatment, surrogate outcomes, and true clinical outcomes. We used mixture models to analyze the sensitivity and specificity of antibody tests for Helicobacter pylori. Graphical methods were developed to assess the importance of predictors in binary regression.
One investigator developed a suite of MATLAB programs that facilitate the use of this language for sophisticated statistical analyses.