We have developed an automatic approach to applying Bayesian methods in text retrieval. It is a form of relevance weighting of search terms, but it departs from the usual approach in two complementary ways. First, the usual approach assigns relevance weights to the search terms of a single query, based on the documents that are and are not relevant to that query. This generally involves a small number of relevant documents, and hence a statistical sample that is difficult to use for making globally significant inferences about the value of the terms involved. We modify the usual approach by averaging the importance of a term over all the queries in which it occurs. We study the case in which the set of queries is the set of documents, so that the global term relevance weight is a well-defined concept. Second, the usual approach requires human judgments of the relevance of documents to queries, which has limited the method to test sets where the relevance relation is known or to relevance feedback situations. Our approach replaces the relevance relation with the relation of high-scoring query-document pairs under the vector cosine method of retrieval. Because the latter is automatic, we can generate the required statistics automatically. Although this approach will undoubtedly introduce more error than human relevance judgments, the larger sample size involved in global weighting helps to offset the problem. Local weighting is introduced in an ad hoc manner, and the resulting retrieval is found to be somewhat superior to vector cosine retrieval. There are two problems with the model just described: first, it does not incorporate local term weighting in a natural Bayesian manner, and second, it does not provide a correction for document length.
We have developed a new model based on cluster concepts that remedies these two problems while remaining completely Bayesian. It performs at the same basic level as the model already described, in which local weights are treated ad hoc. It does, however, allow one to see the actual log odds predictions of relevance. These exceed the observed log odds of relevance by a ratio of 13:1, which gives an interesting perspective on term dependency.
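
The global weighting idea can be illustrated with a minimal sketch. It is not the model evaluated here: it assumes raw term-frequency vectors, a hypothetical cosine cutoff of 0.3 for a "high-scoring" pair, and a smoothed Robertson-Sparck Jones-style log odds weight as a stand-in for the relevance weight. The names `cosine`, `rsj_weight`, and `global_term_weights` are illustrative only. Each document serves as a query, its high-scoring neighbours play the role of relevant documents, and a term's per-query weights are averaged over all queries that contain it.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rsj_weight(r, n, R, N):
    """Smoothed log odds weight (assumed stand-in for the relevance weight).

    r: pseudo-relevant documents containing the term
    n: documents containing the term
    R: pseudo-relevant documents
    N: candidate documents
    """
    return math.log(
        ((r + 0.5) / (R - r + 0.5)) / ((n - r + 0.5) / (N - n - R + r + 0.5))
    )


def global_term_weights(docs, threshold=0.3):
    """Average each term's weight over all document-queries containing it.

    threshold is a hypothetical cutoff defining a high-scoring pair.
    """
    vecs = [Counter(d.lower().split()) for d in docs]
    per_term = defaultdict(list)
    for q, qvec in enumerate(vecs):
        # Every other document is a retrieval candidate for query q.
        others = [i for i in range(len(vecs)) if i != q]
        # High-scoring pairs replace human relevance judgments.
        rel = {i for i in others if cosine(qvec, vecs[i]) >= threshold}
        N, R = len(others), len(rel)
        for t in qvec:
            n = sum(1 for i in others if t in vecs[i])
            r = sum(1 for i in rel if t in vecs[i])
            per_term[t].append(rsj_weight(r, n, R, N))
    return {t: sum(ws) / len(ws) for t, ws in per_term.items()}


docs = [
    "bayesian retrieval of text documents",
    "bayesian text retrieval with term weights",
    "cooking pasta with tomato sauce",
    "tomato sauce for pasta and pizza",
]
weights = global_term_weights(docs)
for term in ("bayesian", "with"):
    print(term, round(weights[term], 3))
```

In this toy collection, "bayesian" appears only in documents that score highly against each other and so receives a positive global weight, whereas "with" appears in two documents that do not form a high-scoring pair and receives a negative one.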