During the past year, work on this project focused on five loosely related topics in biological sequence analysis: 1) Eric Nawrocki contributed new software tools for structural RNAs to our Prokaryotic Genome Annotation Pipeline (PGAP). A paper about this work was published in Nucleic Acids Research. 2) Eric Nawrocki completed and released version 1.2 of Infernal, his software tool for RNA alignments. 3) We developed algorithms and implemented software tools to annotate all genes and other features in virus sequences, and tested them in collaboration with the group of J. Rodney Brister in our Center. Further work on these tools is ongoing. 4) We developed algorithms and implemented software tools to improve the identification of nucleotide sequences that are contaminated by cloning vectors. These tools are currently being applied to correct thousands of contaminated sequences stored in the non-redundant (nr) sequence database, which is used by researchers worldwide. In the future, we may adapt these tools to screen newly submitted sequences prospectively. 5) We developed algorithms and implemented a prototype software tool to recognize 16S rRNA sequences, which are often used as molecular barcodes to identify species from their DNA. So far, we have used this tool on an ad hoc basis to evaluate and correct some large new submissions of 16S rRNA sequences. In the future, we plan to integrate the new method with other software used for sequence analysis.