PROJECT SUMMARY
Genomic structural variants (SVs), which include deletions, duplications, insertions, inversions, and translocations of sequences, are an abundant source of genetic variation. SVs have been linked to Mendelian diseases, to complex heritable diseases such as schizophrenia, and to cancer. However, recent comparisons of extremely contiguous genome assemblies of humans and of the model organism Drosophila melanogaster have revealed that common genotyping strategies relying on high-throughput short reads miss 40-80% of SVs, including those affecting phenotypes. Thus, the contribution of SVs to disease and phenotypic variation remains grossly underestimated. To accurately measure the contribution of SVs to deleterious genetic variation and trait variation, we propose to create a comprehensive map of genome-wide SVs by comparing extremely contiguous genome assemblies. However, contiguous de novo assembly of human genomes from high-coverage (>50X) noisy long reads remains prohibitively expensive. I therefore propose to analyze SVs in the 25-fold smaller genome of the model organism D. melanogaster, which has contributed substantially to our understanding of the genetics of complex human diseases. The proposed research aims to study the fitness effects of polymorphic SVs using de novo genome assemblies of 50 genetically diverse D. melanogaster strains that are as complete and contiguous as the current D. melanogaster reference genome, arguably the best metazoan genome assembly (Aim 1). I propose to use this comprehensive set of variants to infer the distribution of fitness effects of SVs and to estimate the proportion of adaptive SVs, both of which are reliable proxies for the evolutionary and functional significance of SVs (Aim 1). Aim 1 will involve training in the theory and cutting-edge methods of molecular population genetics.
Next, the proposed project will develop an experimental approach to determine the fitness effects of variants for which an organismal phenotype is unknown. As part of this, the proposed project will develop genome editing resources that will facilitate rapid transformation of one of our sequenced strains with SVs, so that the fitness effects of candidate SVs from trait mapping studies can be examined (Aim 2). Training in Aim 2 includes the development of a CRISPR-Cas9 toolkit in a common genetic background to investigate the functional effects of SVs. Finally, using the toolkit developed in Aim 2, we propose to conduct high-throughput fitness assays to evaluate the selective effects of SVs under specific selection conditions (Aim 3). The training portion of the proposed research will complement the applicant's previous experience and position him for a successful research career. The University of California, Irvine, and the Emerson and Long labs together have the resources and expertise to ensure the successful completion of the training phase of the grant.