Work was completed this year on the sequence logo project, detailed in reports of previous years, and a paper describing this project was published. A new direction for this project was launched this year in collaboration with Dr. Andrew Neuwald of the Institute for Genome Sciences and Department of Biochemistry & Molecular Biology at the University of Maryland School of Medicine. The first aim of the work was the development of an improved program for the multiple alignment of large numbers of sequences. Our approach has three central features: (i) It uses a top-down alignment strategy that first identifies regions shared by all the input sequences, and then realigns closely related subgroups. This is key to escaping suboptimal traps, in which a set S of closely related but misaligned sequences resists change: when a sequence X from S is realigned individually, the remaining misaligned sequences of S pull X back into misalignment; (ii) It uses a Bayesian statistical measure of alignment quality, based on the minimum description length principle and on Dirichlet mixture priors. This measure favors more biologically realistic alignments than does, for example, the ad hoc but widely used sum-of-the-pairs scoring system; (iii) It infers position-specific gap penalties that favor insertions or deletions (indels) within each sequence at alignment positions where indels are invoked in other sequences. This favors the placement of insertions between conserved blocks, which can be understood as making up the proteins' structural core. When applied to large datasets, the program we have developed runs significantly faster and produces, on average, more biologically accurate alignments than widely used programs that have been considered the state of the art. A paper describing this work has been submitted for publication.
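As an illustration of the kind of column measure that feature (ii) refers to, the following is a minimal sketch of scoring one alignment column by its log marginal likelihood under a Dirichlet mixture prior. It is not the program described above: the two-component mixture, its weights and parameters, and the names `MIXTURE`, `log_marginal` and `column_log_score` are all hypothetical, chosen only to show the calculation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical two-component mixture (an assumption for illustration only):
# each entry is (mixture weight, Dirichlet parameters over the 20 residues).
MIXTURE = [
    (0.5, [0.1] * 20),  # small parameters favor columns dominated by few residues
    (0.5, [1.0] * 20),  # uniform-prior background component
]


def log_marginal(counts, alpha):
    """Log probability of the observed residues under one Dirichlet component.

    Uses the Dirichlet-multinomial formula for an ordered set of observations:
    Gamma(A) / Gamma(A + N) * prod_i Gamma(alpha_i + n_i) / Gamma(alpha_i).
    """
    total_alpha = sum(alpha)
    total_count = sum(counts)
    result = math.lgamma(total_alpha) - math.lgamma(total_alpha + total_count)
    for n, a in zip(counts, alpha):
        result += math.lgamma(a + n) - math.lgamma(a)
    return result


def column_log_score(column, mixture=MIXTURE):
    """Log marginal likelihood of a column under the mixture prior.

    Characters outside AMINO_ACIDS (for example gap symbols) are ignored.
    """
    tally = Counter(column)
    counts = [tally.get(aa, 0) for aa in AMINO_ACIDS]
    terms = [math.log(w) + log_marginal(counts, alpha) for w, alpha in mixture]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


if __name__ == "__main__":
    print("conserved column:", column_log_score("AAAA"))
    print("diverse column:  ", column_log_score("ACDE"))
```

Under these assumed parameters a fully conserved column receives a higher score than a column of four different residues. A sum-of-the-pairs score would instead add up pairwise substitution scores within the column, with no underlying probabilistic model of the column as a whole.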
A second aim of this work is to extend the method described above to a multiple alignment model that describes phenotypically diverged sequences separately at those alignment positions statistically implicated in their divergence. Preliminary research in this direction has begun.