We have previously investigated several aspects of retrieval testing: approaches to testing that do not involve human judgments, the development of a new measure for testing, a study of the special problems that arise in very large databases where only relative testing is possible, and an examination of nonparametric methods. The study of nonparametric testing has shown bootstrap methods to be convenient and robust for document retrieval. One problem to which the field has not given sufficient attention is the lack of consistency between judges. Salton and co-workers published an argument suggesting that this is not a significant problem. However, we have found that their argument rests on faulty assumptions. We have also shown that it can be given a relative interpretation in which it seems less objectionable. This question can be investigated using bootstrap methods, and we propose to do so in the near future.
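
As an illustration of the kind of bootstrap analysis referred to above, the following is a minimal sketch of a paired percentile bootstrap over queries. It is not the procedure used in the studies summarized here. The function name `bootstrap_mean_diff`, its parameters, and the per-query scores are all hypothetical. The sketch assumes that each query has one effectiveness score per system (for example, precision under one judge's relevance judgments), and it estimates an interval for the mean difference between two systems.

```python
import random


def bootstrap_mean_diff(scores_a, scores_b, n_resamples=2000, seed=0):
    """Paired percentile bootstrap for the mean per-query score difference.

    scores_a, scores_b: per-query effectiveness scores for two systems,
    aligned by query (hypothetical inputs for illustration).
    Returns (observed_mean_diff, lower_bound, upper_bound) for an
    approximate 95% percentile interval.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired by query")
    if not scores_a:
        raise ValueError("at least one query is required")

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)

    # Resample queries with replacement and record the mean difference.
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)

    means.sort()
    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    return sum(diffs) / n, lower, upper


# Hypothetical per-query scores for two systems under one judge.
system_a = [0.6, 0.5, 0.7, 0.4]
system_b = [0.5, 0.4, 0.5, 0.3]
observed, lower, upper = bootstrap_mean_diff(system_a, system_b)
print(observed, lower, upper)
```

Under these assumptions, one way to probe the effect of inconsistency between judges would be to run the same resampling once per judge's set of relevance judgments and compare the resulting intervals. Whether the system ranking survives a change of judge is the relative question raised above.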