Neurodevelopmental disorders (NDDs) affect a considerable fraction of the population, resulting in substantial economic and healthcare system burdens to society. Genetic aberrations, including copy number variations (CNVs) and sequence variants, are the most common etiology of NDDs. Understanding the mechanism of formation, developmental origin, and genomic architectural elements that predispose to such aberrations could improve clinical interpretation of diagnostics as well as genetic counseling of affected patients and families. Repetitive elements in the genome, specifically those of the Alu family, have been found to mediate human CNVs, and we hypothesize that they could be under-recognized substrates for CNV formation. To explore this, we will computationally identify genes with significantly increased Alu content in introns and flanking genomic regions and compare this gene list to databases of human CNVs. De novo mutations that underlie NDDs, including those mediated by repetitive elements, are classically thought to occur in the germ line; however, mitotic cell divisions also represent a large target for mutagenic processes. Based on data obtained from families identified in our laboratory, we hypothesize that mutations during mitotic cell divisions in parents are a more frequent source of pathogenic alleles in their offspring than current detection suggests. We will prospectively screen a large cohort of family trios for unrecognized, low-level somatic mosaicism for genomic deletions to estimate its frequency. In recent years, genome-wide technologies have accelerated the identification of genomic variations underlying NDDs. However, clinical interpretation is difficult because of the large number of variants detected in each individual. The classic method for determining detrimental alleles is based on incidence differences between patients and controls. 
Yet, because of recent human population expansion, most variation in an individual is rare and restricted to family lineages or clans, making it challenging to distinguish rare benign variants from pathogenic ones. We hypothesize that integration of multiple knowledge sources, including gene-specific, genome architecture, and population incidence data, will result in more accurate and efficient interpretation of genome-wide diagnostics. To this end, we will identify potentially pathogenic alleles both from a large cohort of patients tested by chromosomal microarray and from exome sequencing of families with NDDs. We will then use bioinformatics and statistics to combine these information sources into a phenotype-specific pathogenicity probability for each variant. Such scores will also be used to investigate the extent to which the genetic load of individual variants does or does not contribute to human disease. Overall, this proposal aims to elucidate mechanisms of CNV formation, delineate the timing of mutagenic processes, and assess the deleteriousness of identified variants to improve interpretation of genomic diagnostics for patients with NDDs.
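
As an illustration only, the integration of knowledge sources into a per-variant pathogenicity probability described above could be sketched as a simple Bayesian update in log-odds space. This is a minimal sketch, not the proposal's actual method: it assumes each evidence source can be summarized as a likelihood ratio (pathogenic versus benign) and that the sources are conditionally independent. The function name, the evidence labels, and all numeric values are hypothetical.

```python
import math


def posterior_pathogenicity(prior, likelihood_ratios):
    """Combine a prior probability with per-source likelihood ratios.

    Assumes the evidence sources are conditionally independent, so their
    likelihood ratios multiply (equivalently, their logs add).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    # Work in log-odds space so that evidence accumulates additively.
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(lr)
    # Convert the final log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical likelihood ratios for one variant and one phenotype.
evidence = {
    "gene_specific": 4.0,         # e.g. gene previously implicated in NDDs
    "genome_architecture": 1.5,   # e.g. flanking repeat content
    "population_incidence": 3.0,  # e.g. absent from control cohorts
}
probability = posterior_pathogenicity(0.1, evidence.values())
```

A phenotype-specific score would, under the same assumptions, use a prior and likelihood ratios estimated separately for each phenotype; correlated evidence sources would violate the independence assumption and call for a joint model instead.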