The goal of this project is to create a unified technological infrastructure for high-throughput protein expression and purification that will be suitable for large-scale structural biology initiatives. Central to our approach is the use of multiple genetically engineered affinity tags. We are currently trying to establish what combination of affinity tags is most appropriate for our purposes and devise a strategy to utilize them with maximal efficiency. At the same time, we are striving to develop more reliable methods for removing affinity tags, which is another critical element of our strategy. One of the greatest technical obstacles that we face is "the inclusion body problem," i.e., the tendency of proteins to accumulate in an insoluble, inactive form. Because refolding of proteins seems incompatible with high-throughput applications, some means of circumventing the formation of inclusion bodies is needed. Sometimes this can be accomplished by fusing an aggregation-prone polypeptide to a highly soluble partner. To study this phenomenon in greater detail, we compared the ability of three soluble fusion partners, namely maltose-binding protein (MBP), glutathione S-transferase, and thioredoxin, to inhibit the aggregation of six diverse proteins that normally accumulate in an insoluble form. Remarkably, we found that MBP is a much more effective solubilizing agent than the other two fusion partners. Moreover, in some cases we were able to demonstrate that fusion to MBP can promote the proper folding of the attached protein into its biologically active conformation. This chaperone-like quality distinguishes MBP from other affinity domains and greatly enhances its value as a fusion partner. Accordingly, MBP fusion proteins have become a cornerstone of our strategy for protein expression. We are using a variety of experimental approaches to try to understand how MBP influences the folding of its fusion partners.
At the same time, to improve its utility as a fusion partner, we are attempting to endow MBP with an engineered affinity for additional ligands. Affinity tags would probably be used more often if it were not so difficult to remove them. This is usually accomplished by endoproteolysis of a fusion protein at a designed site. The main difficulty with this approach stems from the intrinsically promiscuous activity of proteolytic reagents that are commonly used to cleave fusion proteins. This problem is compounded by the fact that it is very expensive to purchase enough of any of these reagents to cleave fusion proteins on a scale amenable to structural studies. To overcome these problems, we are producing our own supply of TEV protease, the catalytic domain of the nuclear inclusion protease from tobacco etch virus. TEV protease cleaves the amino acid sequence ENLYFQG between Q and G with high specificity; in contrast to factor Xa, enteropeptidase, and thrombin, there have never been any reports of cleavage at noncanonical sites by TEV protease. The production of TEV protease in Escherichia coli has been hampered in the past by low yield and poor solubility, but we have been able to solve both problems by making synonymous codon replacements and producing the protease in the form of an MBP fusion protein. A more troublesome shortcoming of TEV protease is that it cleaves itself at a specific site, yielding a truncated protease with greatly diminished activity. We have been able to rectify this problem as well by introducing amino acid substitutions that prevent autoinactivation without impeding the ability of the protease to cleave canonical target sequences. Further improvements (e.g., modifications that increase stability or alter specificity) may be possible once the structure of TEV protease has been solved or a genetic assay for protease activity in vivo has been developed.
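The cleavage rule stated above (ENLYFQG, cut between Q and G) can be illustrated at the level of sequence strings. The following is only a minimal sketch in Python: the function name cleave_at_tev and the example fusion sequence are hypothetical, and the code models nothing beyond exact matching of the canonical site, so it says nothing about site accessibility or cleavage efficiency in a real fusion protein.

```python
from typing import List

TEV_SITE = "ENLYFQG"  # canonical recognition sequence named in the text


def cleave_at_tev(sequence: str, site: str = TEV_SITE) -> List[str]:
    """Split a one-letter protein sequence between Q and G of each exact site match.

    Hypothetical helper for illustration only; it does plain string matching.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + len(site) - 1  # index of the final G; the cut falls just before it
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, cut)
    fragments.append(sequence[start:])  # remainder after the last cut (or the whole input)
    return fragments


# Made-up example: an N-terminal tag stand-in, the site, then a target stand-in.
fusion = "MKIEEGKL" + TEV_SITE + "SHMTARGET"
tag_fragment, target_fragment = cleave_at_tev(fusion)
print(tag_fragment)
print(target_fragment)
```

In this sketch the first fragment ends in ENLYFQ and the second begins with G, which reflects the fact that cutting between Q and G leaves a single glycine at the N-terminus of the released protein.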
Although most of our effort still revolves around the mechanics of tagging and untagging, we are already thinking about how to blend these elements into a cohesive process for generic protein expression and purification that can be evaluated and modified as needed. One of the early challenges we will face is the physical engineering of a functional multitag. We believe that the unique properties of MBP justify making it the focal point of our strategy. The question is how to incorporate additional tags within the framework of an MBP fusion protein in a manner that will enable them to function as they are intended to while not interfering with each other, especially with MBP's ability to enhance the solubility of its fusion partners. A related issue involves the tagging of the protease component that will be used in the process, to facilitate its separation from the target protein following proteolytic cleavage of the fusion protein. Theoretically, tagging of the target protein and protease can be coordinated to simplify the process and improve efficiency. Once these tools are in place, we will be able to evaluate the strengths and limitations of this affinity tag-based approach for high-throughput protein expression and purification.