The proposal "The Crystallography of Macromolecules" addresses the limitations of diffraction data analysis methods in the field of X-ray crystallography. The significance of this work is determined by the importance of the technique, which generates uniquely-detailed information about cellular processes at the atomic level. The structural results obtained with crystallography are used to explain and validate results obtain by other biophysical, biochemical and cell biology techniques, to generate hypotheses for detailed studies of cellular process and to guide drug design studies - all of which are highly relevant to NIH mission. The proposal focuses on method development to address a frequent situation, where the crystal size and order is insufficient to obtain a structure from a single crystal. This is particularly frequent in cases of large eukaryotic complexes and membrane proteins, where the structural information is the most valuable to the NIH mission. The diffraction power of a single crystal is directly related to the microscopic order and size of that specimen. It is also one of the main correlates of structure solution success. The method used to solve the problem of data insufficiency in the case of a single crystal is to use multiple crystals and to average data between them, which allows to retrieve even very low signals. However, different crystals of the same protein, even if they are very similar i.e. have the same crystal lattice symmetry and very similar unit cell dimensions, still are characterized by a somewhat different order. This non-isomorphism is often high enough to make their solution with averaged data impossible. Moreover, the use of multiple data sets complicates decision making as each of the datasets contains different information and it is not clear when and how to combine them. The proposed solution relies on hierarchical analysis. First, the shape of the diffraction spot profiles will be modeled using a novel approach (Aim 1). This will form the ground for the next step, in which deconvolution of overlapping Bragg spot profiles from multiple lattices will be achieved (Aim 2). An additional benefit of algorithms developed in Aim 1 is that they will automatically derive the integration parameters and identify artifacts, making the whole process more robust. This is particularly significant for high-throughput and multiple crystal analysis. In Aim 3, comparison of data from multiple crystals will be performed to identify subsets of data that should be merged to produce optimal results. The critical aspect of this analysis will be the identification and assessment of non- isomorphism between datasets. The experimental decision-making strategy is the subject of Aim 4. The Support Vector Machine (SVM) method will be used to evaluate the suitability of available datasets for possible methods of structure solution. In cases of insufficient data it will identify the most significant factor that needs to be improved. Aim 5 is to simplify navigation of data reduction and to integrate the results of previous aims with other improvements in hardware and computing. PUBLIC HEALTH RELEVANCE: The goal of the proposal is to develop methods for analysis of X-ray diffraction data with a particular focus on the novel analysis of diffraction spot shape and the streamlining of data analysis in multi-crystal modes. The development of such methods is essential to advance structural studies in thousands of projects, which individually are important for NIH mission.