General epidemiologic methodologic work has progressed in four areas: (1) extension of the resampling methodology to within-cluster paired resampling (WCPR), which permits within-cluster inference for clustered binary outcomes; (2) theoretical development of design and analysis based on outcome-dependent sampling when the outcome is continuous, such as IQ or blood pressure; (3) development of semi-parametric modeling methods for combining information from previously diagnosed cases with that from occult cases detected by screening; and (4) application of missing data methods to the setting where molecular markers are used to subclassify cases into subsyndromes but some cases cannot be fully classified.

(1) We previously developed a method based on sampling one outcome per cluster, carrying out a classical analysis on these now-independent outcomes, repeating the resampling many times, and ultimately pooling the parameter estimates. This "Within Cluster Resampling" (WCR) method performs well in simulations, is analytically tractable, and obviates untestable assumptions about the underlying covariance structure. We have now modified the method to permit subject-specific inference. The idea behind WCPR is to sample an affected and an unaffected individual from each cluster and compare their covariates by fitting a paired-data logistic regression model. The paired resampling is repeated numerous times and the separate estimates are pooled, as in WCR. In simulations, the method compares very favorably with conditional logistic regression (CLR) in scenarios where the assumptions required by CLR are met. When the response to an exposure varies across clusters, however, the dependency structure becomes complex and the assumptions required for CLR are violated, whereas WCPR remains valid. Thus, WCPR is more broadly applicable than the standard method.
WCPR may prove most useful in genetic studies where affected and unaffected siblings are to be compared with respect to an allele that may be in linkage disequilibrium with a disease gene.

(2) When studying a continuous biomarker of health, such as blood pressure, we have shown that one can markedly improve the efficiency of a study, relative to random sampling, by using our proposed design, which oversamples observations at the extremes of the outcome distribution (i.e., people with unusually high or low values of the outcome), provided that the proposed semi-parametric empirical likelihood methods of analysis are then used. The strategy has been applied to studies of IQ and infant neurologic scores in relation to pesticide exposure.

(3) A common condition, such as uterine fibroids, can be studied by recruiting women at ages when they are at risk and ascertaining prior diagnoses of the condition. If the condition is often subclinical, one can supplement case accrual by offering screening as well. We have developed a statistical method to model onset and progression of such conditions, allowing for an initial subclinical phase.

(4) Molecular and genetic markers can be used to classify cases into etiologically distinct subtypes, which may help to clarify inference in a case-control study. Unfortunately, tissue for carrying out the classification may be incompletely available. We showed that an analysis that discards the cases who could not be classified often leads to bias. In contrast, when missingness can depend on covariates but not on the underlying subtype conditional on covariates, missing data methods can both prevent bias and improve precision of estimation by exploiting the information from cases who were enrolled but could not be subtyped.
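
The resampling procedures described under (1) can be illustrated with a minimal sketch, which is not the implementation used in this work. It makes several simplifying assumptions: there is a single binary exposure; each cluster is stored as a list of (outcome, exposure) pairs coded 0/1; 0.5 is added to each count to avoid division by zero; and clusters lacking either an affected or an unaffected member are skipped in the paired version. All function names are hypothetical. With one binary covariate, the paired-data logistic estimate reduces to the log ratio of the two types of exposure-discordant pairs, which keeps the example free of model-fitting libraries. Variance estimation is omitted.

```python
import math
import random


def log_odds_ratio(sample):
    """Classical log odds ratio from independent (outcome, exposure) pairs,
    with 0.5 added to each cell so empty cells do not break the estimate."""
    counts = {(y, x): 0.5 for y in (0, 1) for x in (0, 1)}
    for y, x in sample:
        counts[(y, x)] += 1
    return math.log(counts[(1, 1)] * counts[(0, 0)]
                    / (counts[(1, 0)] * counts[(0, 1)]))


def wcr(clusters, n_resamples=1000, seed=0):
    """Within-cluster resampling: draw one member per cluster, analyze the
    now-independent draws, repeat, and average the estimates."""
    rng = random.Random(seed)
    estimates = [
        log_odds_ratio([rng.choice(cluster) for cluster in clusters])
        for _ in range(n_resamples)
    ]
    return sum(estimates) / len(estimates)


def paired_log_odds_ratio(pairs):
    """Paired-data logistic estimate for one binary covariate: the log ratio
    of the two kinds of exposure-discordant pairs (0.5 added to each)."""
    affected_only = sum(1 for xa, xu in pairs if xa == 1 and xu == 0)
    unaffected_only = sum(1 for xa, xu in pairs if xa == 0 and xu == 1)
    return math.log((affected_only + 0.5) / (unaffected_only + 0.5))


def wcpr(clusters, n_resamples=1000, seed=0):
    """Within-cluster paired resampling: draw one affected and one unaffected
    member per usable cluster, fit the paired model, repeat, and average."""
    rng = random.Random(seed)
    # Assumption: only clusters containing both outcomes can supply a pair.
    split = []
    for cluster in clusters:
        affected = [x for y, x in cluster if y == 1]
        unaffected = [x for y, x in cluster if y == 0]
        if affected and unaffected:
            split.append((affected, unaffected))
    estimates = [
        paired_log_odds_ratio([(rng.choice(a), rng.choice(u)) for a, u in split])
        for _ in range(n_resamples)
    ]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Toy sibships: (outcome, exposure) for each sibling.
    sibships = [
        [(1, 1), (0, 0), (0, 1)],
        [(1, 1), (1, 0), (0, 0)],
        [(1, 0), (0, 0)],
        [(1, 1), (0, 1), (0, 0)],
    ]
    print("WCR pooled log odds ratio: ", wcr(sibships))
    print("WCPR pooled log odds ratio:", wcpr(sibships))
```

With covariates that are continuous or multivariate, the closed-form paired estimate would be replaced by a fitted paired-data (conditional) logistic regression within each resample; the pooling step is unchanged.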