The overall objective of this research is to develop computer techniques for extracting information, rather than titles or citations, from the medical literature and from computerized medical data stores written in natural language (e.g., hospital records). The methods, which were established in previous work of the principal investigator, do not depend on prior knowledge of the subject matter, but use procedures and computer programs based on formal linguistic analysis. The major tools are (1) a syntactic analysis program equipped with a comprehensive grammar of English; (2) a clustering program which operates on the output of the syntactic analyzer and groups words into semantic classes based on the similarity of their distributions vis-à-vis other words in the syntactically analyzed sentences; (3) the method of sublanguage grammars, which uses (1) and (2) together with a manual linguistic analysis to state information structures applicable to texts in a particular subject matter area; and (4) a formatting program which maps texts in the given subject matter area into the information structures developed in (3). While these tools had been developed previously, the present investigation is the first attempt to apply the technique as a whole (called "information formatting") to a corpus of medical narrative, and to have the technique evaluated for its health relevance by medical users. The major goal of year 2 of this investigation was to carry out the computer processing of the set of medical reports selected by our medical collaborator, Dr. Lyman. The preparatory manual work, principally the coding of the words in the documents for use in the English parsing program and the definition of the information format for the medical documents, had been done in the first year of the project.
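
As an illustration of tool (2), the following is a minimal sketch of grouping words by the similarity of their distributions in syntactically analyzed sentences. It is not the project's clustering program: the abstract does not state the similarity measure or grouping procedure, so cosine similarity and a greedy single-pass grouping are assumed here, and the input triples, relation names, threshold, and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical stand-in for syntactic-analyzer output:
# (word, syntactic relation, co-occurring word) triples.
TRIPLES = [
    ("fever", "subject_of", "persist"),
    ("fever", "object_of", "develop"),
    ("cough", "subject_of", "persist"),
    ("cough", "object_of", "develop"),
    ("penicillin", "object_of", "receive"),
    ("penicillin", "object_of", "give"),
    ("aspirin", "object_of", "receive"),
    ("aspirin", "object_of", "give"),
]


def distributions(triples):
    """Count, for each word, the (relation, other word) contexts it occurs in."""
    dist = defaultdict(Counter)
    for word, relation, other in triples:
        dist[word][(relation, other)] += 1
    return dist


def cosine(a, b):
    """Cosine similarity of two context-count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(triples, threshold=0.5):
    """Greedy single-pass grouping: a word joins the first class whose
    first member it resembles at or above the threshold (assumed procedure)."""
    dist = distributions(triples)
    classes = []
    for word in sorted(dist):
        for members in classes:
            if cosine(dist[word], dist[members[0]]) >= threshold:
                members.append(word)
                break
        else:
            classes.append([word])
    return classes


if __name__ == "__main__":
    for members in cluster(TRIPLES):
        print(members)
```

On this toy input the symptom words and the medication words fall into separate classes because they share no syntactic contexts; a real corpus would call for a more robust measure and grouping procedure.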