A system of C programs has been developed to find closely related documents in Medline. The system has a number of unique features: 1) It is highly modular, so alterations to the system are relatively simple to perform. 2) The system currently operates on Medline data in ASN.1 format, but a change to the interface portion of the system would allow it to be applied to any large database consisting of discrete textual records. 3) The system is designed with a degree of protection against loss of data due to operating system crashes or power outages. 4) All data processed by the system are stored in permanent form in structures such as inverted files. These structures are updatable, so new data can be added to the system continually as they become available. 5) Documents are compared with each other using a Bayesian form of analysis, and the statistics on which the relevance weighting of terms is based are derived from previous document comparisons. These statistics are updated with each new cycle of processing. 6) The probability that two documents are related is computed by scaling the raw comparison scores against a set of document pairs that have been judged for relatedness by human judges. This scale is recalculated each time the term weights are updated, and it is calculated separately for documents with abstracts and documents without abstracts. The batch system described is now being used as the source for an online retrieval system that allows free-text queries. This is coupled with the neighboring produced by the batch system and the Boolean capabilities of the Entrez retrieval system. The result is a versatile general access facility for the part of Medline relevant to molecular biology. Work on this system is ongoing.
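The score scaling in feature 6 can be illustrated with a minimal sketch. The abstract does not say how the scale is actually computed, so the approach below is an assumption: raw scores are grouped into equal-width bins, and the probability of relatedness for a bin is estimated as the smoothed fraction of human-judged pairs in that bin that were judged related, with a separate table for documents with and without abstracts. All names (`JudgedPair`, `ScoreScale`, `scale_build`, `scale_probability`, `NUM_BINS`) are hypothetical and are not taken from the system itself.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BINS 4  /* hypothetical number of equal-width score bins */

/* One human-judged document pair (hypothetical layout). */
typedef struct {
    double raw_score;    /* raw score from the document comparison */
    int    related;      /* 1 if the judges called the pair related */
    int    has_abstract; /* 1 if the documents have abstracts */
} JudgedPair;

/* Estimated P(related | raw score), one table per document class. */
typedef struct {
    double max_score;          /* scores are clamped to [0, max_score] */
    double prob[2][NUM_BINS];  /* indexed as [has_abstract][bin] */
} ScoreScale;

/* Map a raw score to its bin, clamping out-of-range scores. */
static size_t bin_of(const ScoreScale *s, double raw_score)
{
    size_t b;
    if (raw_score <= 0.0)
        return 0;
    if (raw_score >= s->max_score)
        return NUM_BINS - 1;
    b = (size_t)(raw_score / s->max_score * NUM_BINS);
    return b < NUM_BINS ? b : NUM_BINS - 1;
}

/* Rebuild the scale from the judged pairs. In this sketch it would be
 * called again whenever the term weights change. */
void scale_build(ScoreScale *s, double max_score,
                 const JudgedPair *pairs, size_t n)
{
    size_t related[2][NUM_BINS] = {{0}};
    size_t total[2][NUM_BINS] = {{0}};
    size_t i, b;
    int c;

    assert(max_score > 0.0);
    s->max_score = max_score;

    for (i = 0; i < n; i++) {
        c = pairs[i].has_abstract ? 1 : 0;
        b = bin_of(s, pairs[i].raw_score);
        total[c][b]++;
        if (pairs[i].related)
            related[c][b]++;
    }

    /* Add-one smoothing (an assumption of this sketch), so that an
     * empty bin yields 0.5 instead of a division by zero. */
    for (c = 0; c < 2; c++)
        for (b = 0; b < NUM_BINS; b++)
            s->prob[c][b] = (related[c][b] + 1.0) / (total[c][b] + 2.0);
}

/* Look up the estimated probability that a pair is related. */
double scale_probability(const ScoreScale *s, double raw_score,
                         int has_abstract)
{
    return s->prob[has_abstract ? 1 : 0][bin_of(s, raw_score)];
}
```

A caller would fill an array of `JudgedPair` records, call `scale_build` once per processing cycle, and then call `scale_probability` for each new raw score. Keeping the two document classes in separate tables mirrors the statement that the scale is calculated differently for documents with and without abstracts; the binning and the smoothing are illustrative choices only.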