This project has focussed on the analysis of amino acid and nucleic acid sequence data as it pertains to molecular biology and molecular evolution. Continuing areas of interest include: i) The development of computational tools for molecular biologists. We have modified an earlier protein search algorithm, with significant improvements in speed and sensitivity. We are also developing a new general sequence similarity algorithm which is mathematically rigorous but more efficient than previous rigorous algorithms. ii) We have analyzed amino acid pair constraints in proteins, where the residues of a pair may be separated by up to 39 intervening residues. We find evidence for significant constraints on pairs which are shared by evolutionarily unrelated proteins. Some of these constraints are associated with secondary structure; the constraints found in regions of random coil require some other explanation. iii) We are examining accessible surface area in proteins and its relation to the linear sequence of amino acids as well as to the forces determining three-dimensional structure. We find a constant relationship between molecular weight and accessible surface area for segments along the linear sequence of these proteins, as well as a nonarbitrary division between the interior and exterior of a globular protein. iv) We are studying the "units of conservation" in proteins (i.e., the extent to which the preservation of one residue influences the preservation of an adjacent residue among evolutionarily related proteins) to better understand their functional units. We are contrasting the neighboring residues along the linear sequence with the neighboring residues in three-dimensional space. Efforts have also continued on informing biologists of the computational tools available for sequence analysis. A computer users group has been formed and a computer bulletin board has been initiated.
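
To make the pair analysis in ii) concrete, the following is a minimal sketch of one way such constraints could be tabulated. It is not the project's actual method, which the summary does not specify. It assumes sequences are given as plain strings of one-letter residue codes and measures each ordered pair by the ratio of its observed count to the count expected if the two positions were independent. The function names `pair_counts` and `observed_over_expected` are hypothetical.

```python
from collections import Counter


def pair_counts(seq, gap):
    """Count ordered residue pairs (a, b) with exactly `gap` intervening residues."""
    step = gap + 1  # distance between the two positions of a pair
    return Counter((seq[i], seq[i + step]) for i in range(len(seq) - step))


def observed_over_expected(seqs, gap):
    """Ratio of observed to expected pair counts, pooled over all sequences.

    The expected count assumes the residues at the two positions are
    independent (an assumption of this sketch, not taken from the source).
    """
    pairs, first, second = Counter(), Counter(), Counter()
    for s in seqs:
        for (a, b), n in pair_counts(s, gap).items():
            pairs[(a, b)] += n
            first[a] += n   # marginal count of `a` in the first position
            second[b] += n  # marginal count of `b` in the second position
    total = sum(pairs.values())
    return {
        (a, b): n * total / (first[a] * second[b])
        for (a, b), n in pairs.items()
    }


def scan_gaps(seqs, max_gap=39):
    """Tabulate the ratios for every separation from 0 to `max_gap` intervening residues."""
    return {gap: observed_over_expected(seqs, gap) for gap in range(max_gap + 1)}
```

A ratio well above or below 1 for a given pair and separation would flag a candidate constraint. Judging significance would need a statistical test on top of this, which the sketch leaves out.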