The fundamental premise of this proposal is that cancer types defined by anatomic site may contain sub-types that are etiologically distinct. Indeed, substantial evidence for this has emerged in recent years. The goal of the proposal is to develop a strategy for optimally identifying such etiologically distinct tumor sub-types, and to develop the statistical techniques needed to accomplish this. In addition to clarifying cancer etiology, such an approach offers the promise of a more powerful strategy for detecting new risk factors, by focusing discovery studies on the sub-types that possess distinct etiology. Our research plan is motivated by a crucial new result regarding the occurrence of double primary malignancies. We show that the odds ratio linking tumor sub-types of pairs of independently occurring cancers is directly related to the underlying population risk heterogeneity of the sub-types. Consequently, data from studies of double primaries can be used to determine optimal tumor sub-classification from an etiologic perspective. In this proposal we build upon this result to develop multivariate clustering techniques that optimize the etiologic heterogeneity of the resulting clusters (Aim 1). We will develop analogous techniques for creating sub-types that maximize the degree of etiologic heterogeneity on the basis of known risk factors, for use in settings where data on multiple primary cancers are unavailable or unobtainable (Aim 2). We will determine the implications, from the perspective of statistical power, of using sub-typing as a strategy for detecting new risk factors (Aim 3). Finally, we will develop freely available software to allow other investigators easy access to the methods that we develop (Aim 4). The research will ultimately lead to a conceptual framework for investigating etiologic heterogeneity, and a suite of statistical tools for conducting the data analyses.
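As a purely illustrative sketch of the quantity referred to above, the odds ratio linking the sub-types of two primaries can be computed from a 2x2 cross-classification of first-tumor sub-type by second-tumor sub-type. The function name and the counts below are hypothetical and are not taken from the proposal; the sketch shows only the standard cross-product odds ratio, not the proposed clustering methods or the derivation relating this odds ratio to population risk heterogeneity.

```python
def concordance_odds_ratio(table):
    """Cross-product odds ratio for a 2x2 table of double primaries.

    table[i][j] is the (hypothetical) number of patients whose first
    primary is sub-type i and whose second primary is sub-type j.
    """
    (a, b), (c, d) = table
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when a discordant cell is zero")
    return (a * d) / (b * c)


# Hypothetical counts: rows = sub-type of first tumor, columns = sub-type of second tumor.
counts = [[40, 10],
          [10, 40]]
print(concordance_odds_ratio(counts))
```

An odds ratio near 1 would indicate that the sub-type of the second tumor is unrelated to the sub-type of the first, while larger values indicate stronger sub-type concordance within patients.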