To understand the evolutionary history of genes and organisms and the mechanisms of evolution, two mutually related research projects are proposed. (1) Statistical methods for phylogenetic inference. Phylogenetic analysis of DNA sequences has become an important tool for studying population genetics and evolution. In this analysis the neighbor-joining, maximum parsimony (MP), and maximum likelihood (ML) methods are commonly used. However, the MP and ML methods are very time-consuming when the number of sequences is large. In the proposed research, we plan to use computer simulation to explore fast MP and ML algorithms that give a reliable bootstrap consensus tree (one close to the true tree rather than to the optimal tree). Efficient algorithms for constructing linearized trees with a reliable timescale will also be developed. In addition, a computer program package for studying molecular evolution will be developed. (2) Statistical analysis of genome diversity. Using the abundant DNA sequence data recently generated by the genome projects, we plan to study the evolution and maintenance of genome diversity in different groups of organisms. We are particularly interested in understanding the evolution and maintenance of genetic variability of the major histocompatibility complex molecules, immunoglobulins, and T-cell receptors, all of which are concerned with the generation of antibody diversity. We plan to conduct phylogenetic analyses of these multigene families using sequence data from diverse groups of vertebrate species. Special attention will be given to the coevolution of the variable region genes of immunoglobulins and T-cell receptors. We are also interested in extending our phylogenetic analysis to several multigene families that produce highly conserved proteins (e.g., ubiquitin, histone, and globin) to understand whether their high conservation is due to concerted evolution, to purifying selection, or to a combination of several factors.