DESCRIPTION: Dr. Masatoshi Nei requests five years of support to continue the study of the evolutionary process at the molecular level. This research will entail the development of statistical theory, analytical theory, numerical simulation, and methods of data analysis. Two major areas of research are proposed: the development of statistical methods for phylogenetic reconstruction and the analysis of empirical data. In the first statistical study, Dr. Nei will refine the minimum evolution (ME) method of phylogenetic reconstruction. Justification for using minimum evolution as a criterion of reliability comes from earlier analytical results obtained by the Nei group which showed that the total tree length for the true topology is expected to be shorter than for incorrect topologies. Dr. Nei proposes to incorporate the search for shorter trees into his neighbor-joining (NJ) method of phylogeny reconstruction. Alternatives to uncertain neighbor joinings will be preserved and this set of partial reconstructions used as the basis for subsequent clustering cycles. This process will generate multiple candidate trees, which derive from alternatives to ambiguous joinings. The most likely tree or a confidence set of trees will be determined on the basis of the ME criterion. Another index of confidence derives from estimates, obtained by bootstrap resampling, of the probability of positive length for interior branches. The Nei group has proposed a correction for the bias that this approach is known to introduce into the confidence estimates, and the accuracy of that correction will be examined. Maximum likelihood (ML) methods will be developed to improve phylogeny reconstruction from protein sequence data and to infer ancestral amino acid or nucleotide sequences. The process of amino acid substitution will be explored by constructing an empirically determined substitution matrix for particular proteins and by estimating the parameters of the Goldman/Yang codon-based model.
For each amino acid substitution matrix determined, the ML topology will be estimated using standard methods. The accuracy of trees estimated by the ML, parsimony, and ME methods will be tested using mitochondrial sequences from vertebrate taxa for which phylogenetic relationships are known. Dr. Nei also proposes to refine a numerically based ML approach to inferring ancestral sequences. Ancestral protein sequences will be inferred on the basis of the empirically determined amino acid substitution matrix to be developed. The parameters of a general model of nucleotide substitution will be estimated according to methods introduced by Dr. Yang, who has recently joined the Nei group, and used to estimate ancestral nucleotide sequences. Methods for phylogeny reconstruction from short tandem repeat (STR) data will be developed. Numerical simulations will be conducted in which the evolution of STR repeat number is represented by a stepwise mutation model with single or multiple steps, with new reproductively isolated taxa arising at a specified rate. Nine measures of genetic distance will be computed from the simulated data and phylogenies reconstructed on the basis of those measures using the NJ and UPGMA methods. The distance measures will be compared with respect to time-linearity and the accuracy of the phylogenetic reconstructions based on their values. Software for the methods to be developed will be incorporated into the MEGA platform, a package of phylogenetic analysis programs developed in the preceding funding period by the Nei group. The second major phase of research will address the evolution of immune system genes. Nucleotide sequences derived from the immunoglobulin (Ig) light-chain genes and the T-cell receptor (Tcr) genes will be studied. The objectives include resolving phylogenetic relationships within these very large multigene families and inferring the predominant modes of evolution.
The nature of adaptive evolution of immune system genes will be explored in theoretical studies using numerical simulation. Alternative hypotheses for the maintenance of four major groups of largely monomorphic Ig genes will be tested.
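
ILLUSTRATIVE SKETCH: The STR simulations proposed above can be pictured with a minimal Python example. It is only a sketch under stated assumptions, not the Nei group's procedure: two taxa descend from a common ancestor, each locus mutates by a single symmetric step per mutation event, and the distance shown (mean squared difference in repeat number across loci) is a hypothetical stand-in that is not claimed to be one of the nine measures in the proposal. The function names, the mutation rate, the number of loci, and the starting repeat number are all assumptions chosen for illustration. Under these assumptions the expected distance after t generations is 2 * mu * t, which is the kind of time-linearity the proposal intends to compare across measures.

```python
import random


def evolve(repeats, generations, mu, rng):
    """Evolve one STR locus along one lineage under a single-step,
    symmetric stepwise mutation model (assumed for illustration).
    No lower bound is enforced; the start value is chosen far from zero."""
    for _ in range(generations):
        if rng.random() < mu:
            # Gain or lose exactly one repeat unit with equal probability.
            repeats += 1 if rng.random() < 0.5 else -1
    return repeats


def mean_squared_difference(taxon_a, taxon_b):
    """Hypothetical distance: average over loci of the squared
    difference in repeat number between two taxa."""
    return sum((a - b) ** 2 for a, b in zip(taxon_a, taxon_b)) / len(taxon_a)


def simulate_distance(generations, mu=0.01, loci=500, start=50, seed=1):
    """Simulate two taxa that split from one ancestor `generations` ago
    and return their distance. All defaults are illustrative assumptions."""
    rng = random.Random(seed)
    taxon_a = [evolve(start, generations, mu, rng) for _ in range(loci)]
    taxon_b = [evolve(start, generations, mu, rng) for _ in range(loci)]
    return mean_squared_difference(taxon_a, taxon_b)


if __name__ == "__main__":
    mu = 0.01
    for t in (250, 500, 1000):
        observed = simulate_distance(t, mu=mu)
        # Each lineage accumulates variance mu * t, so the difference
        # between two lineages has expected square 2 * mu * t.
        print(t, round(observed, 2), 2 * mu * t)
```

Running the script prints, for each divergence time, the simulated distance next to its expectation; a distance that departs from a straight line in t would be less suitable for dating divergences. A multi-step variant could draw the step size from a distribution instead of fixing it at one.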