DESCRIPTION: Many genetic studies are based on analyzing multiple DNA regions of cases and controls. Usually each region is tested separately for association with disease. However, some diseases may require interacting polymorphisms at several regions. Methods will be developed for determining combinations of polymorphisms that increase the risk of disease when DNA from cases and controls has been analyzed for polymorphisms at multiple regions. These methods will be used to determine combinations of polymorphisms of genetic fragments in the coding regions of linked HLA genes that increase the risk of Insulin Dependent Diabetes Mellitus (IDDM). Methods will also be developed for designing such studies and choosing sample sizes. Suppose that the DNA of cases and controls can be classified in terms of polymorphisms at multiple DNA regions. The problem is to find a smaller number of regions and corresponding polymorphisms such that the risk of disease is high when an individual has this combination of polymorphisms. This problem has three facets: 1. Finding a small number of DNA regions and polymorphisms that optimally predict disease status. 2. Expressing the uncertainty of this determination. 3. Incorporating samples where not every region is analyzed. A modern, computer-intensive statistical technique, the Data Augmentation Algorithm, will be used to incorporate data from individuals who have not had every region analyzed and, simultaneously, to quantify our uncertainty about the true probability of each combination of polymorphisms among normal and diseased individuals in the population that we sampled. The algorithm produces multiple samples of the probabilities that a diseased and a non-diseased individual have each possible combination of polymorphisms. Each is a sample from the posterior distribution, i.e., each sample is an equally weighted draw of the set of population probabilities given the observed data.
The variation from sample to sample expresses the uncertainty due to the limited sample size and the fact that some data were missing. For each sample we select all the combinations of two or three polymorphisms that are good at predicting disease. The frequency with which a combination is selected across samples estimates the posterior probability that the combination is a good predictor.
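The procedure described above can be illustrated with a minimal sketch. It is not the project's implementation. It assumes only two regions, each with two variants, a uniform Dirichlet prior, and individuals whose second region was sometimes not analyzed. The counts, the function names (augment_draws, selection_frequency), and the selection rule (the case probability is at least twice the control probability) are all hypothetical choices made for illustration.

```python
import numpy as np

def augment_draws(full, partial_a, n_draws=500, burn_in=100, seed=0):
    """Data augmentation for a 2x2 table of polymorphism combinations.

    full[a, b]   : counts of individuals with both regions analyzed
    partial_a[a] : counts of individuals with only region A analyzed
    Returns an array of posterior draws of the 2x2 cell probabilities.
    """
    rng = np.random.default_rng(seed)
    full = np.asarray(full, dtype=float)
    p = np.full((2, 2), 0.25)  # starting value for the cell probabilities
    draws = []
    for it in range(burn_in + n_draws):
        # Imputation step: allocate each partially analyzed individual to a
        # variant of region B, given region A and the current probabilities.
        imputed = np.zeros((2, 2))
        for a in range(2):
            cond = p[a] / p[a].sum()
            imputed[a] = rng.multinomial(partial_a[a], cond)
        # Posterior step: draw probabilities from the Dirichlet posterior
        # for the completed table (assumed uniform prior, alpha = 1 per cell).
        alpha = 1.0 + full + imputed
        p = rng.dirichlet(alpha.ravel()).reshape(2, 2)
        if it >= burn_in:
            draws.append(p)
    return np.array(draws)

def selection_frequency(case_draws, control_draws, min_ratio=2.0):
    """Fraction of draws in which each combination meets the assumed rule
    P(combination | case) >= min_ratio * P(combination | control)."""
    ratio = case_draws / control_draws
    return (ratio >= min_ratio).mean(axis=0)

# Hypothetical counts, invented purely for illustration.
case_draws = augment_draws([[5, 10], [8, 40]], [4, 12], seed=1)
control_draws = augment_draws([[30, 15], [12, 6]], [10, 5], seed=2)

freq = selection_frequency(case_draws, control_draws)
print(np.round(freq, 3))  # estimated posterior probability per combination
```

Each entry of freq is the share of posterior draws in which that combination satisfied the selection rule, which is how the selection frequency approximates the posterior probability that the combination is a good predictor. A real analysis would involve more regions, more variants per region, arbitrary missingness patterns, and a selection rule chosen for the study.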