Sequence-specific interactions between proteins and DNA are critical for the expression and maintenance of genomic information. Although genome-wide sequencing projects yield a parts list for protein-DNA interactions, directly in the form of potential binding sites and indirectly in the form of inferred protein sequences, these data do not directly speak to the interactions between these parts. Our long-term goal is to develop a quantitative model for assessing the sequence-specific binding preferences and affinities of DNA-binding proteins. The specific hypothesis behind this research is that accurate modeling and engineering of protein-DNA interactions requires treatment of both DNA and protein flexibility, together with physically realistic scoring functions. This hypothesis is based on several observations. First, purely sequence-based models are only partially successful in predicting binding preferences. Second, structural data indicate that a single amino acid at a well-defined position in a protein-DNA interface can participate in context-dependent interactions due to conformational freedom. Thus, a side chain's ability to move contradicts the common assumption made by sequence-based methods that an amino acid at a given position has a single mode of action. Finally, examination of crystal structures of homologous protein-DNA complexes reveals that changes in protein and DNA sequences are accompanied by modest but significant structural rearrangements. The specific aims in this proposal are designed to generate a physical model for protein-DNA interactions and to test experimentally the predictions made by this model:
1. To construct a physical model for describing the basis for specificity in protein-DNA interfaces.
2. To generate a model, using physical-chemical and statistical-mechanical considerations, for the prediction of binding sites for the winged helix family of transcription factors found in two-component signaling pathways.
3. To redesign the DNA binding and cleavage preference of the I-AniI homing endonuclease to target genes in Anopheles gambiae, the primary vector for malaria.