Abstract

Current estimates place the number of personal variants at approximately 4 million per genome. Given the rapid advances in genome sequencing technologies and the coming democratization of human genome sequencing, small groups and even individual scientists will soon be performing their own human genome projects. We believe that three tasks will become a critical analysis bottleneck: automatically annotating the millions of variants that these projects will produce, combining data from multiple projects, and recovering subsets of annotated variants for diverse downstream analyses. Despite the need, there are no publicly available tools that automate these procedures. In response to the NHGRI's RFA "Development and Application of Statistical and Computational Data Analysis Methods for DNA Sequence, Variation, GWAS, Genomic Function, Chemical Biology and Related Genomic Data Sets", we propose in this GO grant to develop a standalone software tool called VAAST (Variant Annotation, Analysis and Selection Tool). This system will fulfill NHGRI's need for a technology to assess data quality and call variants. It will allow analysis of data from all sequencing centers and will be usable with data from all sequencing platforms. We believe VAAST will fill a major gap in the software landscape by helping individual scientists to extract meaningful results from whole-genome variant files.

PUBLIC HEALTH RELEVANCE: It is now known that, on average, any two individual human genomes differ at approximately 4 million positions. These differences, called sequence variants, underlie the inherited physical differences between individuals, including their predisposition to develop certain diseases. This project proposes to develop a tool called VAAST (Variant Annotation, Analysis and Selection Tool). VAAST will help researchers sort through these millions of variants in their quest to identify which of them underlie different phenotypic traits of individuals and susceptibility to diseases.