With the growth of biological information, the efficiency of database retrieval has become central to the biological enterprise. In particular, whenever a retrieval method is changed, one must be able to evaluate whether the change is an improvement. We are developing methods based on the statistical bootstrap to assign statistical significance to improvements in database retrieval. The methods rest on central limit theorems describing the behavior of the receiver operating characteristic (ROC) curve under bootstrapping. We now have theorems relating the curve to U-statistics, providing a ready mathematical framework for developing the central limit theorems the theory requires. Our methodology has already been applied to determine which changes to the PSI-BLAST program actually constitute improvements. In addition, we are investigating "isotonicity" of relevance in retrieval: the assumption that, after rankwise averaging of relevance, records are retrieved on average in decreasing order of relevance. The isotonic assumption affects the evaluation of retrieval efficiency, and preliminary results indicate that, despite its widespread adoption, the assumption can be wrong. We are also exploring the possibility of placing metrics on retrieval methods, to determine how closely related two retrieval methods are. Such metrics could distinguish, e.g., a "tweak" on an accepted retrieval algorithm (which produces retrieval "close" to the algorithm's) from a truly novel algorithm (which produces a "distant" retrieval).
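As a rough illustration of the bootstrap idea described above, the following minimal sketch compares two retrieval methods by the area under the ROC curve (AUC), which equals the Mann-Whitney U-statistic rescaled to [0, 1]. It is a generic paired percentile bootstrap and is not claimed to be the procedure applied to PSI-BLAST. The function names `auc` and `bootstrap_auc_difference`, the record layout `(is_relevant, score_a, score_b)`, and the choice to resample relevant and irrelevant records separately are all assumptions made for this example.

```python
import random


def auc(relevant_scores, irrelevant_scores):
    """Area under the ROC curve, computed as a Mann-Whitney U-statistic.

    Counts the fraction of (relevant, irrelevant) pairs in which the
    relevant record scores higher; ties count as one half.
    """
    wins = 0.0
    for r in relevant_scores:
        for s in irrelevant_scores:
            if r > s:
                wins += 1.0
            elif r == s:
                wins += 0.5
    return wins / (len(relevant_scores) * len(irrelevant_scores))


def bootstrap_auc_difference(records, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC(method B) - AUC(method A).

    records: list of (is_relevant, score_a, score_b) tuples (hypothetical
    layout). Each record keeps both scores together, so the two methods
    are compared on the same resampled records (a paired bootstrap).
    """
    rng = random.Random(seed)
    relevant = [rec for rec in records if rec[0]]
    irrelevant = [rec for rec in records if not rec[0]]

    def difference(rel, irr):
        auc_a = auc([r[1] for r in rel], [r[1] for r in irr])
        auc_b = auc([r[2] for r in rel], [r[2] for r in irr])
        return auc_b - auc_a

    observed = difference(relevant, irrelevant)

    # Resample each class with replacement (an assumption of this sketch:
    # stratifying keeps both classes non-empty in every replicate).
    diffs = []
    for _ in range(n_boot):
        rel_sample = rng.choices(relevant, k=len(relevant))
        irr_sample = rng.choices(irrelevant, k=len(irrelevant))
        diffs.append(difference(rel_sample, irr_sample))
    diffs.sort()

    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return observed, lower, upper
```

If the resulting interval lies entirely above zero, the sketch would suggest that method B improves on method A at roughly the chosen level. The central limit theorems mentioned above are what would justify trusting such an interval.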