This R03 application proposes the development of a novel method of causal analysis for cancer research studies. A particular structural description is posed for data outcomes that fall into a 2^n table, of which a gene microarray is an example. Although this model captures causal pathways, it is not obvious how to compute model probabilities from the structural parameters. The first aim of this project is to discover an efficient, scalable algorithm for this computation, which will make it possible to perform maximum likelihood inference for the latent causal factor parameters. The second aim is to produce programs that perform this analysis. The third aim is to apply the method to two existing data sets and to other data sets that will be acquired. The first existing data set is a unique cytogenetic series of chromosome abnormalities in ovarian, breast, and melanoma tumors, with follow-up data on mortality. The second comes from a successful 5-a-Day project to increase intake of fruits and vegetables, with extensive data on personal and environmental characteristics. Both were NCI-funded studies. Part of the third aim will be to identify and acquire additional cancer epidemiology data sets and to re-analyze them from the perspective of the causal latent factor model, using the software developed in the second aim. This project could advance cancer epidemiology by developing and popularizing a model for statistical analysis that respects the structure of the causal pathways thought to underlie many processes in carcinogenesis. If successful, it would also provide a new method for mining very large data sets in search of latent causal factors.