Zinc finger proteins of the Cys2His2 type consist of tandem arrays of 28-30 amino acid domains, each of which binds a zinc ion. Almost all of these proteins appear to bind double-stranded DNA in a sequence-specific fashion. This is a very large and important protein family: it is estimated that the human genome may encode up to 1000 proteins of this family, many of which appear to play a role in development. Moreover, changes in genes encoding proteins of this family have been associated with several different types of cancer. Crystallographic and other studies indicate that each domain contacts three base pairs of DNA and that these triplet subsites directly abut one another. This modular construction suggests that these proteins could be used for the design of sequence-specific DNA binding domains once the relationships between the amino acid sequence of a protein domain and the nucleotide sequence of its subsite have been established. Some of these relationships have been developed through previous studies, particularly for G/C-rich binding sites. Through mutational and binding studies, these relationships will be extended to include as many binding sites as possible. Furthermore, the degree of discrimination available from particular protein-nucleotide contacts will be determined. This discrimination varies considerably from site to site and can be correlated with the nature of the structural interaction. These recognition rules will be used in two ways. First, they allow the prediction of binding sites from protein sequence. These predictions will be tested as binding sites are determined for additional proteins. Second, the rules can be used to design novel sequence-specific DNA binding proteins. These designed proteins, which will be based on a fixed structural framework with variation of the four key residues associated with DNA interactions, will be characterized in terms of DNA binding specificity, overall affinity, and structure.
A variety of applications can be conceived for these designed sequence-specific DNA binding proteins.