The relationship between amino-acid sequence and conformation in proteins is being investigated. Our studies rest on the assumption that knowledge of the conformation of di- and tripeptide sequences as they occur in proteins is intrinsically more valuable for predicting conformation than comparable analyses based on single amino-acid residue conformation alone. We have a lexicon that stores extensive information about dipeptide amino-acid sequences in proteins of known structure; it can be extended as new protein structures are solved. From the frequency of occurrence of any one dipeptide in a particular, precisely defined conformation, weighting functions have been derived that can be used to predict conformation from sequence by considering overlapping dipeptide pairs. Studies in progress will refine these functions and provide bridging weights for regions containing dipeptides of infrequent general occurrence. Tri- and tetrapeptide multiple identities will similarly be coded, entered into the lexicon, and employed.
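
The dipeptide approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the weighting functions, so the sketch assumes that a weight is simply the relative frequency of a conformation pair for a given dipeptide, and that each residue is assigned the state with the largest summed weight from the dipeptides overlapping it. The one-letter state labels (`H`, `E`, `C`), the function names, and the toy training data are all hypothetical.

```python
from collections import Counter, defaultdict


def build_lexicon(examples):
    """Count, for each dipeptide, how often it occurs in each conformation pair.

    `examples` is an iterable of (sequence, conformation) strings of equal
    length, one conformation label per residue (labels are an assumption).
    """
    lexicon = defaultdict(Counter)
    for seq, conf in examples:
        if len(seq) != len(conf):
            raise ValueError("sequence and conformation must have equal length")
        for i in range(len(seq) - 1):
            lexicon[seq[i:i + 2]][conf[i:i + 2]] += 1
    return lexicon


def weights(lexicon, dipeptide):
    """Assumed weighting function: relative frequency of each conformation pair."""
    counts = lexicon.get(dipeptide)
    if not counts:
        return {}  # dipeptide not present in the lexicon
    total = sum(counts.values())
    return {states: n / total for states, n in counts.items()}


def predict(lexicon, seq, default="C"):
    """Predict one state per residue by summing weights of overlapping dipeptides."""
    scores = [Counter() for _ in seq]
    for i in range(len(seq) - 1):
        for states, w in weights(lexicon, seq[i:i + 2]).items():
            scores[i][states[0]] += w      # vote for the first residue
            scores[i + 1][states[1]] += w  # vote for the second residue
    # Residues with no evidence fall back to `default` (a placeholder only).
    return "".join(max(s, key=s.get) if s else default for s in scores)


# Toy data, invented for illustration; not taken from any solved structure.
examples = [("AAKL", "HHHH"), ("GVGV", "EEEE"), ("AAGV", "HHEE")]
lexicon = build_lexicon(examples)
print(predict(lexicon, "AAGV"))
```

Because every interior residue belongs to two overlapping dipeptides, it receives two votes, which is one simple way of reading "consideration of overlapping dipeptide pairs". The `default` fallback is only a stand-in for the bridging weights mentioned above, whose actual form the abstract leaves unspecified.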