Project Summary
Genes under natural selection may be related to heritable diseases, and to variation in fitness more generally. For example, genetic variants associated with differential mortality during pathogenic infections will be under natural selection when the infectious agents are present in the population. Inferences about selection at the genomic level in humans therefore provide a rich source of new testable hypotheses about functional relationships. However, while there are many methods for detecting natural selection at the genetic level, it is often very hard to determine exactly which genetic variants were targeted by selection. The aim of our study is to provide new computational methods for identifying causal mutations, and to apply these methods in order to better understand the map between genotype and phenotype at loci that are, or have been, targeted by natural selection. We will apply the methods to the FADS genes, which harbor genetic variation associated with fatty acid metabolism and which have been under selection in European populations after the introduction of agriculture. We will test computational predictions experimentally in human cell lines modified using CRISPR/Cas9 technology. This will lead to a deeper understanding of the genetic differences among humans in these physiologically important genes. In Aim 1, we will develop new computational methods that can infer, from DNA sequence data, which mutations have been targeted by natural selection. The methods will accommodate the possibility that more than one mutation has been under selection and will also leverage various forms of phenotypic and functional data. In Aim 2, we will test computational predictions regarding selection in the FADS genes using CRISPR/Cas9 in human cell lines.
In addition to identifying the functional mutations, we will test hypotheses about interactions among mutations, and between mutations and the environment, as represented by the distribution of fatty acids available to the cells in the substrate on which they are grown. In Aim 3, we will extend the methods to model selection under complex demographic models. We will also extend the methods to incorporate environmental covariates and ancient DNA. This will allow us to test hypotheses, informed by the results of Aim 2, regarding the factors causing selection in the FADS genes.