Heuristics have been developed that allow the secondary structure of proteins to be predicted from a set of aligned homologous protein sequences. These heuristics extract conformational information from patterns of conservation and variation within the protein family. As presently implemented, the tools combine automated steps with steps applied by hand. They have been tested by making bona fide predictions of protein secondary structure, that is, predictions announced before an experimental structure becomes available, for approximately a dozen proteins. These predictions have proven to be remarkably accurate. They have also identified a few types of secondary structural elements that are difficult to predict, thereby focusing future work. The heuristics are therefore a significant step towards meeting one goal of the National Library of Medicine, to develop "algorithms capable of predicting structure [of proteins] based on primary sequences of amino acids" (Item 103.D). The work to be funded under this proposal will transfer these prediction technologies from Switzerland to the United States when the Principal Investigator moves to the University of Florida in 1995. The proposed work will test the feasibility of preparing workstation software that automates those aspects of the structure prediction tools that are presently applied by hand. PROPOSED COMMERCIAL APPLICATION: Tools that enable the prediction of the folded structure of proteins from sequence data are commercially marketable as computer software and as core units in drug discovery and drug development programs.