Microsatellite sequences are abundant in the human genome and have mutation rates orders of magnitude higher than those of other genomic sequences. As a result, microsatellites are frequently used as markers in forensics and population genetics. Importantly, microsatellites influence genome function by forming part of protein-coding regions or by regulating gene expression, and allele-length polymorphisms at microsatellites are implicated as genetic risk factors in several diseases. Because the full impact of microsatellite changes on genome function has yet to be elucidated, it is of utmost importance to understand how microsatellites arise, mutate, and eventually cease to exist at individual loci in the human genome. The evolution of each microsatellite has been presented theoretically as a life cycle, with the stages of birth, dynamic mutation activity, and death. However, the concept of the microsatellite life cycle has not previously been investigated in detail. The goal of this interdisciplinary proposal is to elucidate the mechanisms defining the microsatellite life cycle in the human genome. This will be accomplished by a combination of computational and biochemical approaches, and follows the NIH roadmap themes of Interdisciplinary research and Bioinformatics and computational biology. Specific Aim 1 is to determine the mechanisms of microsatellite birth. We will use biochemical experiments to determine the microsatellite threshold, defined as the minimal number of repeats (or length) required for dynamic mutations to occur. These thresholds will be determined for various motifs and will be used in computational analyses to examine the mechanisms and densities of new microsatellite births. The results of this aim will allow us, for the first time, to derive a regression model explaining variation in microsatellite birth densities across the genome. Specific Aim 2 will examine microsatellite interruption and death. Our preliminary studies demonstrate that microsatellite interruptions are frequently observed in the human genome, and that DNA polymerases can directly produce such interruptions in vitro. This aim will use computational and biochemical techniques to measure the mutational consequences of interruptions and the extent to which they contribute to microsatellite death. Specific Aim 3 is to computationally determine the mechanisms contributing to variation in mature microsatellite mutation rates among and within individual human genomes, and to biochemically determine the specific mechanisms attributable to intrinsic features. Overall, the results of this project will be of considerable significance for our understanding of the dynamics of genome evolution. Additionally, our research proposal has direct relevance to public health and clinical genetics. The new information gained by our research can be used to predict the probability that each microsatellite will mutate or cease to exist, and the probability that any genomic region will acquire a new microsatellite. This will be of major importance for assessing an individual's disease risks, especially in an era when individual human genomes are being rapidly sequenced.
PUBLIC HEALTH RELEVANCE: Repetitive DNA sequences, called microsatellites, are characteristic of primate genomes and are known to regulate gene expression, and mutations within microsatellite sequences are causally linked to the development of several human diseases. Our interdisciplinary project will elucidate the mechanisms whereby microsatellites arise, mutate, and disappear at distinct loci in individual human genomes. This research could have major consequences for predicting the risk of diseases caused by microsatellites.