DESCRIPTION: This is a revised proposal to determine what evolutionary forces affect protein evolution. Four experiments are proposed; all are based on the collection of DNA sequence data from the closely related species Drosophila melanogaster and D. simulans, as well as the species pairs D. yakuba - D. teissieri and D. erecta - D. orena. Strictly neutral evolution is independent of population size, but the probability of fixation of slightly deleterious alleles is inversely proportional to population size and the probability of fixation of advantageous alleles is directly proportional to population size. Therefore, if the effective population sizes are known for different lineages, the patterns of substitutions in these lineages can be analyzed to determine whether substitutions are on average strictly neutral, slightly deleterious or advantageous. Dr. Kreitman argues that the effective population size of D. melanogaster is smaller than that of D. simulans, because there is less codon bias in the former than in the latter. Furthermore, genetic variation within species is reduced in genomic regions of low recombination compared to regions of high recombination. Two alternative explanations for this observation - loss of neutral variation linked to selected deleterious mutations (background selection), and loss of neutral genetic variation linked to positively selected advantageous mutations (selective sweeps) - both imply smaller effective population sizes in regions of reduced recombination. Given these a priori inferred population size contrasts, the following data will be collected to test whether evolutionary substitutions are primarily neutral (the null hypothesis), slightly deleterious or slightly advantageous. 1. Numbers of amino acid replacement substitutions will be determined for approximately 25 homologous genes in D. melanogaster, D. simulans and the outgroup, D. yakuba. Loci will be chosen that have been sequenced in D.
melanogaster, that encode large proteins, that are in regions of high recombination and that are not in In(3R)a, a fixed inversion between D. simulans and D. melanogaster. The loci will be amplified from D. simulans and D. yakuba by PCR, using primers designed from the melanogaster sequence and from knowledge of conserved regions in other homologs, and then sequenced. The number of replacement substitutions in D. melanogaster and D. simulans will be determined by comparing each to the outgroup species. To test for systematic lineage effects (e.g., a higher mutation rate in one of the species), substitutions in intron sequences (presumed to be neutral) will be determined. Rates of amino acid replacement in the coding regions of D. melanogaster and D. simulans, adjusted for lineage effects, will then be compared to determine if substitutions are primarily neutral (rates equal), slightly deleterious (higher rates in D. melanogaster), or slightly advantageous (higher rates in D. simulans). 2. Experiment 1 will be repeated for a sample of 25 genes in regions of low recombination in both species. Here the assumption is that the effective population sizes of the two species in regions of low recombination are nearly equal, and the prediction is that there will be no lineage differences in substitution rates of these genes. 3. In(3R)a is a fixed inversion (84F-93F) between D. melanogaster and D. simulans. D. simulans has the ancestral gene order. Recombination on chromosome 3 is suppressed near the centromere (from 81-84), so genes near the breakpoints of this inversion will have changed recombinational environments. Twelve proximal genes and 12 distal genes within the inversion will be sequenced, and rates of amino acid substitution of the same genes that have experienced different effective population sizes in the two species will be compared. Under the slightly deleterious model, rates of substitution of genes from the proximal breakpoint in D.
melanogaster will be greater than those of the same genes in D. simulans, and rates of substitution of genes from the distal breakpoint in D. melanogaster will be lower than those of the same genes in D. simulans. The opposite predictions would hold for advantageous substitutions. 4. The generality of the inferences from D. melanogaster and D. simulans comparisons of rates of synonymous substitutions, relative codon bias and rates of amino acid replacement will be tested using the species pairs D. yakuba - D. teissieri and D. erecta - D. orena. The same 25 high recombination region genes sequenced in the first experiment will be sequenced in the three additional species. Rates of synonymous and amino acid substitutions will be determined for these species pairs, and examined for consistency with the patterns observed for D. melanogaster and D. simulans.
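
The lineage comparison in Experiment 1 can be illustrated with a minimal Python sketch. It is not the proposal's stated analysis: the abstract names no statistical test, so the exact binomial test against a 50:50 split is an assumption, as are the function names and the toy sequences. The sketch also works on aligned protein sequences, skips sites where both ingroup species differ from the outgroup, and omits the intron-based adjustment for lineage effects.

```python
from math import comb


def lineage_counts(mel, sim, out):
    """Count sites where exactly one ingroup sequence differs from the outgroup.

    A difference unique to one ingroup species is assigned to that lineage.
    Sites with gaps, and sites where both ingroups differ from the outgroup,
    are skipped because they cannot be assigned unambiguously.
    """
    if not (len(mel) == len(sim) == len(out)):
        raise ValueError("sequences must be aligned to the same length")
    mel_only = sim_only = 0
    for m, s, o in zip(mel, sim, out):
        if "-" in (m, s, o):
            continue
        if m != o and s == o:
            mel_only += 1
        elif s != o and m == o:
            sim_only += 1
    return mel_only, sim_only


def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k of n events under a 50:50 expectation."""
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Hypothetical aligned protein fragments, invented for illustration only.
outgroup = "MKTAYIAKQR"  # stands in for D. yakuba
mel_seq = "MRTAYLAKQK"   # stands in for D. melanogaster
sim_seq = "MKSAYIAKQR"   # stands in for D. simulans

mel_n, sim_n = lineage_counts(mel_seq, sim_seq, outgroup)
p_value = two_sided_binomial_p(mel_n, mel_n + sim_n)
print(mel_n, sim_n, p_value)  # prints: 3 1 0.625
```

Under the reasoning in the text, an excess of replacements on the D. melanogaster lineage would point toward slightly deleterious substitutions, an excess on the D. simulans lineage toward advantageous ones, and equal counts toward strict neutrality, once lineage effects have been accounted for.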