Protein folding is a crucial step in the expression of genetic information as functional molecules. Understanding this process, however, is hindered by the conformational complexity of proteins, our incomplete understanding of the determinants of folding, and the huge number of possible protein sequences. A statistical approach to sequence-structure compatibility is being developed that can encompass all possible sequences. The method can assist researchers in (a) probing the sequence variability of protein structures, (b) focusing protein sequence libraries, and (c) designing sequences that fold to a predetermined structure. The theory will also be used to address the variability observed among naturally occurring protein sequences that share a common structure. Lastly, because information-based energy functions play a central role in protein design and structure prediction, a new class of such functions will be developed. The ability to understand the sequence variability of particular structures and to determine sequences that fold to desired conformations will advance our knowledge of genetic disorders and protein folding diseases, and could lead to new types of protein-based therapeutics.