PROJECT SUMMARY/ABSTRACT
This proposal for the NIH Pathway to Independence Award (K99/R00) focuses on the training of Dr. PingHsun Hsieh to become an independent investigator of large-scale genomics and human population genetics. Dr. Hsieh is a population geneticist by training, and the proposed studies will extend his training to long-read-based sequencing technologies and novel machine-learning approaches for studying the fitness consequences of new mutations, with a focus on structural variants (SVs), in humans and nonhuman primates. Another essential component will be the development of resources identifying which types of new SVs are most likely to be pathogenic and hence most worth further effort by medical researchers. The methods developed in this work will enable other researchers to conduct more hypothesis-free analyses of SVs in disease etiology. Specifically, the training program will center on the study of the distribution of fitness effects of new SVs in humans and nonhuman primates using high-quality SV calls and genotypes from several large-scale long- and short-read sequencing projects. The mentored work will take place under the supervision of the primary mentor, Dr. Evan Eichler, and the co-mentor, Dr. Sharon Browning, both at the University of Washington (UW). The mentor and co-mentor are well-established experts in the characterization of genomic variation using high-throughput technologies and in the development of stochastic modeling methods for large-scale genetic data, respectively. Dr. Hsieh will also receive guidance from a formal advisory committee as well as through activities arranged by the Department of Genome Sciences (GS). The department is an ideal environment for the mentored training, providing the candidate with access to outstanding scientists in areas including the genetics of model organisms, disease, population genetics, and the development of high-throughput genomic technologies.
Although SVs are found in nature and are generally deemed deleterious given their size, they can also be beneficial; the distribution of fitness effects (DFE) of new SVs (i.e., the relative frequencies of beneficial, neutral, and deleterious SVs) therefore remains elusive. In the proposed studies, we will infer the DFE of new SVs and other variants to assess their relative importance in nature, which in turn will help prioritize variants (e.g., SVs vs. single-nucleotide variants [SNVs]) in medical genetics. Specifically, in the K99/R00 phases we will (1) infer the DFE of new SVs and SNVs using a diverse panel of ~100 long-read and ~4,000 short-read high-coverage human and nonhuman primate genomes; (2) compare the DFE of new mutations among primates using contemporary and ancient DNA genomes; and (3) study the fitness effects of, and selective constraints on, different mutation categories in disease using large cohorts of >20,000 genomes. The skills learned in this proposal are at the cutting edge and are tailored for the candidate to gain substantial expertise in new areas of genomics, which will be applicable to many organisms and diseases and critical to the candidate's future independent laboratory.