Structural biology has provided tens of thousands of three-dimensional protein structures, which offer molecular-level information about many important biological processes. The coordinates of all these structures are deposited in the Protein Data Bank (PDB), where they are available for download and further analysis. Most PDB users are interested in specific proteins relevant to a particular biological problem, but the same data can also be analyzed, not for the features of individual proteins, but to identify empirical rules describing protein structures, which in turn can be used in protein structure prediction and simulation. The latter approach has been used successfully to derive the empirical potentials and rules used in programs such as Rosetta. However, a significant amount of the information contained in the PDB is still untapped, mostly because of a lack of adequate software tools. Here we propose to mine the PDB for information contained in structures of closely related or even identical proteins. We argue that such cases, removed as redundant in most existing analyses of protein structures, provide unique, albeit indirect, information about protein flexibility and, specifically, about the ways and directions in which various protein folds change their structure in response to mutations (evolutionary flexibility) or to activation or ligand binding (functional flexibility). From this analysis, we anticipate deriving empirical rules about protein structural changes, rules that can be applied to speed up and/or direct simulations and to predict conformations of proteins in functional states that are not available from direct experiment. The main deliverable of this proposal would be a flexible comparative modeling toolkit, a series of algorithms and protocols that would allow users to apply empirical rules of protein structure changes to their protein of interest.
This toolkit would be available from the project website as a server and would also be distributed as an open source software package. PUBLIC HEALTH RELEVANCE: In this project we propose to develop a flexible comparative modeling toolkit, program and server that would provide a simple and fast way to predict structures of proteins in other functional states, thus providing a fast and inexpensive alternative to energy-based extrapolations and refinements and contributing to faster drug development and easier functional analysis of important protein targets.