The VAST computer program at NCBI currently performs its structure neighboring by setting up a graph representing matching secondary structure elements, searching for cliques within that graph, and then summing the scores of the edges of the matching clique graph. The statistical significance of the final sum is then calculated to decide whether two structures are similar. A general theory for systematically approximating the most powerful statistics for database searches was applied to this problem as a specific testing ground. Initial investigations seemed to show that the general theory improved VAST retrieval of matching structures. Deeper investigation showed, however, that the statistic derived from the general theory was about equal in statistical power to the statistics used in previous versions of the VAST program. Significantly, however, the previous statistics needed several tortuous ad hoc "fixes" where the new statistic needed none. The theory also suggested several interesting directions for improving the sensitivity of available statistics. One direction was to incorporate gapping statistics, which are known to improve retrieval in sequence databases. We have developed the required theory in the context of the present VAST statistics. The resulting gapped statistics are presently being added to the VAST program to test their effect on retrieval. Another direction was to incorporate hidden Markov models (HMMs) into VAST. Theoretical investigations into HMMs in the context of VAST are underway.

J. F. Gibrat, T. Madej, J. L. Spouge, and S. H. Bryant, "The VAST protein structure comparison method" (1997) Biophysical J. 72: MP298-MP298, Part 2
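The clique-based neighboring step described in the abstract above can be illustrated with a minimal sketch. This is not the VAST implementation: the node names, edge scores, and the function `best_clique_score` are hypothetical, the clique search is a plain Bron-Kerbosch enumeration swapped in for illustration, and edge scores are assumed to be non-negative so that the best-scoring clique is always a maximal one. The statistical significance calculation applied to the final sum is not shown.

```python
from itertools import combinations


def best_clique_score(nodes, edge_scores):
    """Return (score, clique) for the clique with the largest summed edge score.

    nodes: hashable ids; each node stands for one candidate pairing of a
        secondary structure element in structure A with one in structure B.
    edge_scores: dict mapping frozenset({u, v}) -> non-negative float; an edge
        means the two pairings are mutually compatible (hypothetical scores).
    """
    # Build an adjacency map from the scored edges.
    adj = {n: set() for n in nodes}
    for pair in edge_scores:
        u, v = tuple(pair)
        adj[u].add(v)
        adj[v].add(u)

    def clique_score(clique):
        # Sum the scores of every edge inside the clique.
        return sum(edge_scores[frozenset(p)] for p in combinations(clique, 2))

    best = (0.0, frozenset())

    def expand(r, p, x):
        # Bron-Kerbosch without pivoting: r is the growing clique, p the
        # candidates that can still extend it, x the already-explored nodes.
        nonlocal best
        if not p and not x:
            s = clique_score(r)
            if s > best[0]:
                best = (s, frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return best


# Hypothetical toy graph: "a1b1" means element 1 of A paired with element 1 of B.
nodes = ["a1b1", "a2b2", "a3b3", "a1b2"]
edge_scores = {
    frozenset({"a1b1", "a2b2"}): 2.0,
    frozenset({"a1b1", "a3b3"}): 1.5,
    frozenset({"a2b2", "a3b3"}): 1.0,
    frozenset({"a1b2", "a3b3"}): 4.0,
}
score, clique = best_clique_score(nodes, edge_scores)
print(score, sorted(clique))
```

In this toy graph the three mutually compatible pairings outscore the single high-scoring edge, which shows why the sum over the whole clique, rather than any individual edge, is the quantity whose significance is then assessed.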