The goal of this project is to improve the accuracy of comparative modeling both in the 30-90% sequence identity range and in the 10-30% range. This will be accomplished by a multidisciplinary team of six investigators in biophysics, mathematics, statistics, and computer science. Building on new statistical analyses of homologous protein structure pairs using graphical models (Jordan) and non-parametric Bayesian methods (Jordan, Dunbrack), Tompa will devise a coarse sampling procedure, based on backtracking and branch-and-bound algorithms, to search the space of homologous structures from a starting model produced by the Baker or Dunbrack groups. Tseng and Baker will develop extensions of quasi-Newton optimization methods specifically tailored to Monte Carlo Minimization trajectories. These methods will exploit information gained from local optimizations of neighboring regions of the landscape carried out earlier in the trajectory. With a large sample of locally minimized structures, Jordan will use response surface methodology and Gaussian processes to fit a surface to these local minima. A search on this fitted surface will then identify promising low-energy regions of the space that can be searched further with fine sampling methods, including tabu search (Baker). Further optimizations with block-coordinate descent methods (Tseng) will also be implemented. Ponder will test his recently developed polarizable multipole force field, while developing it further with a generalized Born/surface area (GB/SA) solvation model. Dunbrack will benchmark the accuracy of predicted structures at all stages of the project. Predicted side-chain conformations will be compared to deposited coordinates as well as to electron density calculated from the experimental structure factors (Dunbrack).
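As a rough illustration of the Monte Carlo Minimization idea mentioned above, the following minimal sketch perturbs the current structure, locally minimizes the perturbed point, and applies a Metropolis test to the minimized energies. Everything in it is an assumption made for illustration: the one-dimensional `toy_energy` landscape, the parameter values, and the function names are hypothetical, and plain gradient descent stands in for the quasi-Newton minimizers the project proposes to develop. It is not the investigators' implementation.

```python
import math
import random


def toy_energy(x):
    """Hypothetical rugged 1-D landscape: a shallow quadratic bowl plus ripples."""
    return 0.1 * x * x + math.sin(3.0 * x)


def toy_gradient(x):
    """Analytic derivative of toy_energy."""
    return 0.2 * x + 3.0 * math.cos(3.0 * x)


def local_minimize(x, step=0.05, iters=500):
    """Plain gradient descent, standing in for a quasi-Newton local minimizer."""
    for _ in range(iters):
        x -= step * toy_gradient(x)
    return x


def monte_carlo_minimization(x0, n_steps=200, kick=1.5, temperature=0.5, seed=0):
    """Perturb, minimize, then accept or reject using the minimized energies."""
    rng = random.Random(seed)
    current = local_minimize(x0)
    e_current = toy_energy(current)
    best, e_best = current, e_current
    minima = [(current, e_current)]  # every local minimum visited, in order

    for _ in range(n_steps):
        # Random displacement followed by local minimization.
        trial = local_minimize(current + rng.uniform(-kick, kick))
        e_trial = toy_energy(trial)
        minima.append((trial, e_trial))

        # Metropolis criterion applied to minimized energies.
        delta = e_trial - e_current
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current

    return best, e_best, minima


if __name__ == "__main__":
    best_x, best_e, visited = monte_carlo_minimization(x0=4.0)
    print(f"lowest minimum found: x = {best_x:.3f}, E = {best_e:.3f}")
    print(f"local minima sampled: {len(visited)}")
```

The returned `minima` list is the kind of sample of locally minimized points to which a surface could subsequently be fitted; in this sketch it is simply collected and not analyzed further.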
Finally, the methods developed in this proposal will be applied to proteins implicated in cancer development, including those involved in DNA repair, apoptosis, and cell-growth signaling, with priority given to targets for cancer therapeutics. New structures from three Protein Structure Initiative centers will be used both as prediction targets (before they are solved) and as templates for predicting structures of biological or clinical importance.