Whole genome sequencing (WGS) has the potential to improve medical care, but much work remains to translate sequence data into meaningful clinical interpretations. WGS interpretation must address both newly observed genetic variants that are likely to be harmful and the more than 150,000 variants already reported in the medical and scientific literature to be associated with disease. Many of these associations were discovered in small cohort and case studies, which makes it difficult to translate them into disease risks for asymptomatic individuals who carry the variants. Without accurate risk estimates for these associations, we may expose healthy patients to false positive findings, leading to needless diagnostic workups and screenings that will substantially increase medical costs and patient morbidity. Central to WGS interpretation is the development of a standardized methodology to filter out likely benign results and to prioritize variants that may be clinically significant and scientifically valid. Many of these previously identified variants are associated with Mendelian disorders that are individually rare (e.g., hypertrophic cardiomyopathy and neurofibromatosis), but these disorders are collectively common, forming a long tail that confers disease risk for many individuals. Because each of these diseases is so rare, a specialized interpretive approach to calculating risk for each one is impractical; we therefore propose a systematic approach to assessing variant disease risk that is broadly applicable across many rare diseases. To meet this urgent need, we will develop a novel approach that estimates the penetrance of disease-associated variants using the prior probability of each disease and the population frequencies of all known genetic variants for that disease in affected and unaffected individuals.
This prior probability of disease is measured as the prevalence, the proportion of individuals in a population affected by a disorder. Because the prevalence of a Mendelian disease reflects the combined penetrance and frequency of all of its associated genetic variation (as well as behavioral and environmental factors), we propose to estimate these penetrance values for each disease from the disease prevalence and the distribution of associated variation. If only one variant is associated with a disease, the penetrance and population frequency of that variant should be closely correlated with disease prevalence, but if there are many disease-associated variants, each contributes less to the overall disease burden, adjusted for its frequency in the population. We will then use these penetrance estimates to establish genome-wide filtering cutoffs for likely benign variation and to prioritize observed WGS variation for review by clinical geneticists. Finally, we propose to apply these values to filter and rank the observed variation in individual WGS datasets from an existing clinical trial, and to compare the results with existing clinical genetics interpretations.
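As a rough illustration of how such an estimate could be computed, the sketch below applies Bayes' rule to a single variant: penetrance is taken as the probability of disease given the variant, derived from the disease prevalence and the variant's frequency among affected and unaffected individuals. This is one plausible reading of the approach and not necessarily the exact method proposed. The names `Variant`, `penetrance`, and `rank_variants`, the cutoff value, and all numbers in the usage lines are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    freq_affected: float    # P(V | D): carrier frequency among affected individuals
    freq_unaffected: float  # P(V | not D): carrier frequency among unaffected individuals


def penetrance(prevalence: float, freq_affected: float, freq_unaffected: float) -> float:
    """Estimate P(D | V) by Bayes' rule, using prevalence as the prior P(D).

    P(D | V) = P(V | D) P(D) / [P(V | D) P(D) + P(V | not D) (1 - P(D))]
    """
    for value in (prevalence, freq_affected, freq_unaffected):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    carriers_affected = freq_affected * prevalence
    carriers_unaffected = freq_unaffected * (1.0 - prevalence)
    total = carriers_affected + carriers_unaffected
    if total == 0.0:
        raise ValueError("variant not observed in either group")
    return carriers_affected / total


def rank_variants(variants, prevalence, cutoff):
    """Drop variants whose estimated penetrance is below `cutoff`
    (treated here as likely benign) and sort the rest, highest first."""
    scored = [
        (v.name, penetrance(prevalence, v.freq_affected, v.freq_unaffected))
        for v in variants
    ]
    kept = [item for item in scored if item[1] >= cutoff]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Made-up inputs, for illustration only (not real variant data).
example = [
    Variant("variant_A", freq_affected=0.10, freq_unaffected=0.0001),
    Variant("variant_B", freq_affected=0.01, freq_unaffected=0.01),
]
ranked = rank_variants(example, prevalence=0.002, cutoff=0.01)
```

In this toy input, a variant that is equally common in affected and unaffected individuals receives a penetrance equal to the prevalence itself, so it carries no additional information about risk and falls below the cutoff. A real analysis would also need to handle uncertainty in small-cohort frequency estimates, which this sketch ignores.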