The relationship between amino acid sequences, protein structures, and folding kinetics will be probed using a combination of combinatorial library selection methods, biophysical characterization, and computational analysis. Folded proteins will be retrieved from high-complexity libraries using phage display and in vitro selection methods in conjunction with a novel "loop entropy reduction" based selection for folding. The selection will be applied both to ensembles of random amino acid sequences constructed from synthetic cassettes and to amino acid sequences constructed by splicing together approximately 20-residue segments of naturally occurring proteins. The structures of folded proteins retrieved from these libraries will be characterized to investigate how the set of naturally occurring proteins compares to the universe of protein structures accessible to polypeptide chains. In complementary experiments, rational computer-based design methods will be used to design large ensembles of sequences likely to fold into specific novel topologies not found in nature, and properly folded proteins will be selected from libraries encoding these sequences using the loop entropy reduction and topology-specific binding selections. The formulation of the potential function and the sequence search strategy used in the library design are natural extensions of our relatively successful approach to ab initio protein tertiary structure prediction. The kinetics of folding of the novel proteins recovered in all of the above selections will be measured and compared to those of naturally occurring proteins with similar lengths and contact orders to determine the extent to which natural selection has operated on folding kinetics. These more general studies will be complemented by a continuation of focused studies on the folding mechanism of the 62-residue IgG binding domain of Protein L using NMR, kinetics, and single-molecule detection methods.
These studies aim to characterize the distribution of structure in the denatured state under as close to physiological conditions as possible, to characterize the rate-limiting step in folding in detail, and to identify the origins of the symmetry breaking during folding evident in earlier studies of Protein L folding. Taken together, these studies should illuminate some of the least well understood aspects of sequence/structure relationships in proteins, and contribute to the understanding of the evolutionary history of protein domains as well as to improvements in protein design and protein structure prediction.