High-throughput sequencing (HTS) platforms are revolutionizing genomics and health research. The incredible throughput of new sequencing instruments has enabled sequencing of genomes, exomes, methylomes, and transcriptomes in both research and clinical settings. As the cost of DNA sequencing has plummeted, two important trends have become apparent. First, the cost of analysis, in terms of computing resources and personnel, will soon surpass the cost of data generation. This will increase the already pressing demand for analytical algorithms that run faster and use fewer CPU and memory resources while processing ever-growing data sets. Second, the advent of HTS technologies has put low-cost, high-throughput sequencing into the hands of small research labs and clinical investigators, groups that are not accustomed to dealing with this type and scale of data. These developments will undoubtedly yield an unprecedented number of new discoveries, clinical insights, and medical breakthroughs in the coming years, provided the outstanding issues of HTS data analysis (short read lengths, inherent errors, and the sheer number of sequence reads) can be conclusively resolved. Until now, most HTS has taken place in large genome centers with teams of bioinformaticians and substantial computing infrastructure. There is an urgent need to make their analysis tools and next-generation pipelines available to the wider research community as easy-to-install, easy-to-use packages. We have spent several years developing a computational framework and innovative tools for HTS data analysis, with a particular focus on the discovery and interpretation of genetic variants. Our goal in this proposal is to make these tools available to the wider community, both individually and as part of a complete informatics solution from alignment to detection to interpretation.
The solution we describe is flexible and powerful enough to be adopted by experienced laboratories, while at the same time providing high-quality, push-button analysis of sequence data for those with little bioinformatics expertise. The framework will run in the cloud or on a single CPU, enabling researchers, educators, and clinicians to speed the transition from sequencing technology adoption to biological knowledge and clinical application. PUBLIC HEALTH RELEVANCE: The promise of personalized medicine will only be realized when each individual's genetic code can be read and analyzed in the clinical setting. Unfortunately, the associated technologies will generate massive amounts of data that are difficult to analyze and interpret. The software described in this proposal will enable widespread and easy analysis and interpretation of genetic data, accelerating the overall understanding of genetic information and its application to human health.