Our laboratory is studying two forms of natural genetic variation in humans: 1) small insertions and deletions (INDELs) in the size range of 1 bp to 10,000 bp, and 2) transposon insertions that also produce small INDELs in this size range. Although these forms of natural genetic variation are abundant in human genomes, they have received less attention than SNPs and larger forms of structural variation. However, it is becoming clear that small INDELs and transposon insertions frequently modify genes and are thus likely to have a major impact on human health. Therefore, there is a need to develop additional resources surrounding these abundant forms of natural genetic variation in humans. In Aim 1 of this competitive renewal, we will conduct a broad validation study of the small INDELs that have been discovered by the research community. We will leverage our recently developed INDEL genotyping technologies to examine a strategic sampling of 200,000 of these small INDELs. Our study will include small INDELs from all of the largest depositors of small INDELs in dbSNP, personal genome projects, and the 1000 Genomes Project. These studies will allow us to assess the quality of these important community resources and to compare the relative accuracies of the major discovery methods that have been used. In Aim 2, we will use the INDELs that have been discovered by the research community to generate new, gene-centered INDEL resources that will facilitate genetics studies in humans. We will focus on INDELs that have been discovered in RefSeq genes by the research community but have not yet been integrated into the imputation maps that are used in GWAS. Thus, by integrating novel gene-centered INDELs into reference imputation maps, we will diversify and expand these maps with INDELs that might otherwise go unstudied.
We expect that our expanded maps will enhance GWAS efforts to identify gene loci and variants that influence human health. In Aim 3, we will examine a major source of new INDEL variation in humans: human transposons. We will use novel "transposon-seq" technologies that were developed by our laboratory to determine how often new transposon insertions are produced in both normal and cancer genomes. In addition to generating useful tools and resources for the research community, these studies will facilitate efforts to identify genetic variants that influence human health. PUBLIC HEALTH RELEVANCE: We are studying two abundant classes of natural genetic variation in human genomes known as INDELs and transposon insertions. These classes of variation are likely to influence many aspects of human health and are thus relevant to the broad mission of the NIH. Our project will facilitate efforts to understand how these forms of natural genetic variation affect human traits and diseases, including cancers.