The functions of proteins depend exquisitely on their structure, with details at the 0.1 Å scale influencing enzyme catalysis, disease-causing mutations, and drug recognition. For this reason, detailed and accurate protein structures are a cornerstone of modern biomedical research, and the NIH funded the Protein Structure Initiative with the goal of obtaining models for every protein structure with an accuracy approaching that of a high-resolution crystal structure. Current technology for template-based modeling is powerful, but cannot yet deliver near-crystal-structure quality. Tests show that even for close homologs, the best minimization routines do not consistently produce protein models that come within ~1 Å rmsd of the 'native' structure as ultimately revealed by crystal structures. To help break through this 1 Å barrier, during the previous period of support we used ultrahigh-resolution structures to create a library of conformation-dependent ideal geometry functions for the protein backbone, and showed that its use improves the quality of protein crystal structures and holds promise for improving template-based model refinement. We also discovered that ultrahigh-resolution crystal structures are a rich source of details about protein structure that cannot be accurately obtained from structures in the ~1.5-2 Å resolution range and thus have not yet been fully accounted for in current energy functions. Here, our central hypothesis is that a major step forward in template-based modeling accuracy will come from identifying and explicitly taking into account detailed features of protein covalent geometry, conformation, and non-covalent packing interactions that have not yet been characterized and that can now be gleaned from the study of highly accurate ultrahigh-resolution protein structures. The overall goal of our proposal is to mine such information so it can be used to improve the accuracy of predictive modeling.
With many ultrahigh-resolution structures now available, the time is ripe to achieve this goal by pursuing three specific aims: (1) extending the impact of the 'ideal geometry function' paradigm by creating, optimizing, and implementing conformation-dependent libraries accounting for peptide planarity, side chains, and cis-peptides; (2) mining ultrahigh-resolution crystal structures to glean information for next-generation empirical energy functions; and (3) analyzing ultrahigh-resolution protein structures solved in varying environments to produce a set of benchmark test cases, and developing residue-level assessment tools to use with these test cases to evaluate and hone template-based modeling refinement applications. The proposed work is low cost and low risk, and has a high likelihood of substantial impact because it provides basic information that can be widely incorporated into predictive and experimental modeling applications to improve their accuracy. It is also distinct from the major efforts already being invested in template-based modeling. Introducing this greater level of realism is a prerequisite to improving the refinement step of template-based modeling and achieving the goals of the Protein Structure Initiative.

PUBLIC HEALTH RELEVANCE: Proteins carry out the work that gets done inside cells, so figuring out what they look like helps us understand things like how drugs work and how to design new drugs that will work even better. It is not practical to determine every protein structure experimentally, so having reliable ways to predict structures by computer is important. Current methods are not quite accurate enough, and the goal of this work is to look carefully at the best known protein structures and learn from their exact features how to improve prediction technology to get the details right.