The aim of this research is to bring the information in narrative medical records and specialized areas of the medical literature into a form on which computer programs can operate to extract, code, compare and correlate different features of the textual information. Previous work showed that the specialized use of language in a medical subfield makes it possible to define a table-like structure for housing the information in texts of the subfield. Text sentences can be mapped into these structures by procedures that have been automated in the case of patient records and are (thus far) manual in the case of medical literature. The procedures include automatic sentence parsing, sublanguage analysis and information formatting. The present proposal investigates two applications of these procedures in conjunction with ongoing medical research at other institutions: (1) automatic coding of the narrative portions of data collected for the American Rheumatism Association Uniform Database for Rheumatic Disease (ARAMIS); (2) construction of a linguistically structured data base of selected literature on lipid metabolism, to be used by the Laboratory of Theoretical Biology of the National Cancer Institute of NIH in connection with the development of a mathematical model of lipoprotein kinetics. Success in (1) will permit an automatic consistency check between check-list data and data in the physician's narrative, and should suggest new data items that can be obtained automatically from the processed narrative. Success in (2) will provide the user with a data base of reported experimental results against which to compare the predictions generated by the mathematical model, and will provide a model of a new kind of medical data base consisting of excerpts from the literature in which the informational relations among the content-words are formally represented.
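
The consistency check described under application (1) can be illustrated with a minimal sketch. It assumes the narrative has already been parsed and formatted into table-like rows; the row fields (`finding`, `body_part`, `negated`), the function names, and the sample data are all hypothetical and are not taken from the project's actual formats or from ARAMIS.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormatRow:
    """One row of a hypothetical information format built from a narrative sentence."""
    finding: str
    body_part: Optional[str] = None
    negated: bool = False  # True when the narrative denies the finding


def check_consistency(checklist: dict[str, bool], rows: list[FormatRow]) -> list[str]:
    """Return check-list items whose value disagrees with the formatted narrative."""
    # Collapse the rows into finding -> present/absent (a later row overrides an earlier one).
    narrative = {row.finding: not row.negated for row in rows}
    return sorted(
        item
        for item, present in checklist.items()
        if item in narrative and narrative[item] != present
    )


def unlisted_findings(checklist: dict[str, bool], rows: list[FormatRow]) -> list[str]:
    """Return narrative findings with no check-list entry (candidate new data items)."""
    return sorted({row.finding for row in rows} - set(checklist))


# Illustrative data only: rows as they might look after information formatting.
rows = [
    FormatRow("swelling", body_part="knee"),
    FormatRow("fever", negated=True),
    FormatRow("morning stiffness"),
]
checklist = {"swelling": True, "fever": True}

conflicts = check_consistency(checklist, rows)
candidates = unlisted_findings(checklist, rows)
print(conflicts)
print(candidates)
```

In this sketch, a conflict is reported only when both sources mention the same finding; findings that appear solely in the narrative are returned separately as possible new data items.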