We propose to develop a protein identification resource. It will contain an expert computer system for protein identification, which will incorporate an identification paradigm, suitable computer programs, organizational structures (including correlations and patterns in the information), protein sequences, amino acid compositions, and ancillary biochemical and biological knowledge. We also propose to develop a system of programs to make predictions of medical significance based mainly on the Resource knowledge, including secondary structure, antigenic sites, recognition domains and cross-reactivity of antibodies, best nucleic acid sequence probes, and possible restriction enzyme cut sites of coding regions. Finally, we plan to develop a computer system using the knowledge base that will facilitate associative browsing, the development of scientific insight, and the rejection of false hypotheses. Collaborative research will involve two theoretical projects to quantitate the use of additional data, from amino acid composition and from predicted secondary structures, to improve the power of the identification system, and two further projects that examine new kinds of experimental data for making identifications. A workshop on computer methods will be held in the first year to suggest new collaborative projects. We will continue on-line public access to our protein sequence knowledge base, and we will publish a Newsletter to familiarize users with the system. Our goal is to develop a system so easy to use that biochemists all over the world will perform their own routine identifications using telephone networks. The great explosion in the accumulation of structural data bears witness that more than 4,000 investigators consider this information important in their many different fields, including virology, immunology, pharmacology, oncology, genetics, genetic engineering, biochemistry, physiology, and pathology.
Protein structures contain important information required for understanding the causes of disease and developing a rational approach to treatment. These data are essential in the design of cures based on informational macromolecules, which can be specific to the individual or to the particular type of virus, cancer, autoimmune disease, or inborn error of metabolism.