The research activities of the Molecular Modeling and Bioinformatics project fall into four categories.
(1) Study of the forces that govern the stability of, and interactions between, protein molecules: We study the hydrophobic effect and, more generally, solvent effects using statistical mechanical techniques. The knowledge and insights gained from these studies are used to develop potential functions for ab initio protein folding, protein stability, protein engineering, and protein-protein and protein-ligand interaction studies.
(2) Protein structure classification and fold recognition: We devised a new protein structure comparison procedure, which we use to classify protein structures. The classified protein structure database is used to gain an overview of the universe of protein structures and to study their architecture. We also devised a new "threading" potential based on pair-to-pair amino acid comparisons. We will use both the classified protein structure database and the new threading potential to parse proteins into domains, to classify these domains into structural groups, and to develop a more powerful protein fold recognition tool.
(3) Immunotoxin engineering: Immunotoxins are hybrid molecules made by connecting an antibody Fv domain to a bacterial toxin. Such molecules can be used to kill specific cells, such as cancer cells or HIV-infected cells. The Molecular Biology Section of this Laboratory has developed several of these molecules as anti-cancer agents, some of which are now in or at the end of phase I clinical trials. In collaboration with the Molecular Biology Section, we examine these molecules and design mutations intended to enhance antigen binding, improve stability and yield, reduce the immune response, and reduce non-specific toxicity. The designs are tested experimentally by the Molecular Biology Section.
(4) Gene discovery: In collaboration with the Molecular Biology Section, we have found about a dozen new genes that are expressed almost exclusively in prostate or breast tissue. The gene discovery process begins with a list of EST clusters that we generate, along with their tissue specificity data. For some of these genes, the structure of the protein product could be modeled, which gives information on the possible biological function of the protein. We are now developing a new EST clustering method based on the alignment of ESTs to the human genome sequence, and we are mining information from a new EST library that the Molecular Biology Section made from breast and prostate tumor cell lines.