The overall goal of this project is to explore and identify regular linguistic patterns within subject terminology in a medical bibliographic database. It addresses a fundamental problem in information retrieval: reconciling and controlling the diversity of linguistic expressions for the same and related concepts in documents and queries. It relates specifically to the Unified Medical Language System (UMLS) effort to identify, extract and relate variant medical expressions in machine-readable biomedical information resources. The research applies methods from empirical linguistics to the descriptive analysis of patterns in a sample of descriptors and documents in Medline. The guiding principles of the method are that it produce sets of textual elements which are linguistically related, and that the identified patterns be amenable to computational identification, extraction and manipulation. Evaluation criteria include measures of both the number and the robustness of the identified patterns. The results can be applied to the design of automatic or semi-automatic vocabularies and mapping mechanisms that help users identify and select correct terminology for their search requests.