Protein folding is the process by which polypeptides adopt their complex, three-dimensional structure. In most monomeric proteins, this structure is required for function and is encoded in the amino acid sequence. Thus, protein folding is the bridge between the gene and its function, and is central to understanding biology. Deciphering the rules by which proteins fold is also critical for understanding a number of genetic diseases that result either from essential gene products that cannot fold to their native state, or from proteins that misfold to a non-native, aggregation-prone complex, forming toxic oligomers or fibers. The research proposed here seeks to understand the folding problem using proteins of a simplified architecture in which a small cluster of secondary structure units (helix, strand, turn) is repeated in a linear array. The extended, modular architecture of repeat proteins allows units of structure to be removed and inserted, providing a detailed mapping of how folding energy is distributed along the polypeptide chain. This direct mapping of the energy landscape allows long-standing questions about protein folding to be addressed, such as the origin and kinetic consequences of cooperativity and the existence and specification of kinetic pathways. In addition, the structural similarity of the repeated units allows the contributions of different regions to be compared with great clarity. Here we use two different repeat protein architectures, the ankyrin (α/α) and LRR (β/non-β) repeats, to explore the structural origins of cooperativity, the role of cooperativity in folding kinetics, and how bulk cooperativity is manifested when unfolding is promoted by a directed force. To rigorously quantify cooperativity and its structural origins, we will take advantage of a recent discovery by us and by other groups that stable arrays can be built from repeats of identical sequence.
These "consensus arrays" will be analyzed using an "Ising" statistical model, which quantifies intrinsic versus nearest-neighbor energies. Consensus sequence variants will be used to resolve which types of interactions give rise to the extraordinary cooperativity we have seen in these proteins. Once we have variants in hand that resolve local versus long-range interactions, we will be able to probe how cooperativity influences kinetics and transition state ensembles, developing a kinetic Ising model in the process. Kinetic analysis of these proteins will also provide insight into how folding proceeds on a genuinely "flat" landscape. These cooperativity variants will also be used to explore the relationship between solution cooperativity and end-to-end forced unfolding. Comparison to natural (nonconsensus) repeat arrays will provide continued insight into the relationships between sequence, stability, and folding in these simple but ubiquitous proteins. Studies will combine standard equilibrium and stopped-flow folding with collaborative hydrogen exchange mass spectrometry, atomic force microscopy, and optical tweezer methods. PUBLIC HEALTH RELEVANCE: A large number of human diseases, including cancers and Alzheimer's disease, are caused by proteins that cannot fold to their active shapes, or that fold to the wrong shapes, poisoning cells and tissues. The proposed research will use simplified "repeat" proteins to learn the rules of how proteins fold into unique, well-determined structures. These rules will help us to understand the causes of "folding diseases", and will also provide new biomaterials that can be used to diagnose and perhaps ultimately treat human diseases.
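As an illustration of the "Ising" analysis mentioned above, the following is a minimal sketch of a one-dimensional nearest-neighbor model for an array of identical repeats. The proposal does not specify the model's form, so everything here is an assumption: each repeat is treated as either folded or unfolded, a folded repeat carries an intrinsic statistical weight `kappa`, and each pair of adjacent folded repeats carries an interfacial weight `tau`. The function names, the value of `RT`, and the example free energies are hypothetical and are not taken from the proposed work.

```python
import math

RT_KCAL = 0.593  # assumed R*T in kcal/mol near 25 C (illustrative)


def boltzmann_weight(delta_g, rt=RT_KCAL):
    """Convert a free energy (kcal/mol) into a statistical weight."""
    return math.exp(-delta_g / rt)


def partition_function(n_repeats, kappa, tau):
    """Partition function of an n-repeat array of identical repeats.

    kappa: weight for folding one repeat (intrinsic term).
    tau:   weight for each interface between two adjacent folded repeats.
    The fully unfolded array is the reference state with weight 1.
    """
    if n_repeats < 1:
        raise ValueError("n_repeats must be at least 1")
    # Summed weights of all partial chains ending in an unfolded (u)
    # or folded (f) repeat; this is a 2x2 transfer-matrix recursion.
    u, f = 1.0, kappa
    for _ in range(n_repeats - 1):
        u, f = u + f, kappa * (u + tau * f)
    return u + f


def fraction_folded(n_repeats, kappa, tau, step=1e-6):
    """Average fraction of folded repeats, (1/n) * d ln q / d ln kappa.

    The derivative is taken by central finite difference in ln(kappa).
    """
    hi = math.log(partition_function(n_repeats, kappa * math.exp(step), tau))
    lo = math.log(partition_function(n_repeats, kappa * math.exp(-step), tau))
    return (hi - lo) / (2.0 * step * n_repeats)


if __name__ == "__main__":
    # Hypothetical energies: unfavorable intrinsic folding, favorable coupling.
    kappa = boltzmann_weight(+5.0)
    tau = boltzmann_weight(-12.0)
    for n in (2, 4, 8):
        print(n, fraction_folded(n, kappa, tau))
```

In this sketch, setting `tau = 1` removes the coupling, so each repeat folds independently; a large `tau` combined with a small `kappa` makes partly folded arrays rare, which is one simple way to express cooperativity in terms of intrinsic and nearest-neighbor energies.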