ABSTRACT Why do eukaryotic membrane proteins express so poorly in E. coli? This problem is a significant barrier to the study of mammalian membrane protein structures. E. coli is the tried-and-true overexpression system for protein purification and has made structural biology accessible to countless laboratories. Still, only a handful of eukaryotic membrane protein structures have been determined so far, and the major reason is the lack of an economical and easy expression system like E. coli. If expression could be carried out in E. coli, it would improve our ability to investigate mammalian membrane protein structures, especially in light of recent revolutionary developments in single-particle cryo-electron microscopy. There is ample evidence that eukaryotic membrane proteins do express in E. coli, but in most cases the yields are extremely low and it is uncertain whether the protein is functionally folded. However, by tagging targets with GFP at the C-terminus, it is possible to observe protein expression at the single-molecule level directly in E. coli by oblique-angle fluorescence microscopy using sensitive EMCCD detectors. With this approach, we can use single-particle tracking to measure diffusion and thereby determine whether the expressed protein resides in the membrane, the cytoplasm or inclusion bodies. Furthermore, isolation of vesicles from E. coli membranes allows for single-vesicle functional measurements even at low levels of expression. Thus, it is possible to study the remarkably low levels of expression of eukaryotic membrane proteins in E. coli and to interrogate whether the production of functionally folded protein can be optimized through genetic manipulation. One reason eukaryotic sequences may fail is improper coding of co-translational folding, i.e. a hidden genetic code that couples the timing of translation with partitioning and folding in the lipid bilayer. 
On target systems, we will investigate changes in expression and function while comparing: (a) codon usage, including E. coli-optimized vs. native codons and conservation of rare codon clusters, (b) N-terminal protein sequences and (c) conservation of pause sites such as Shine-Dalgarno elements (prokaryotes) or Alu motifs (eukaryotes). For each of these variables, we will generate chimaeras between homologues that express and those that fail in order to identify which elements lead to successful expression. We will examine three distinct membrane protein families that already have structures for both prokaryotic and eukaryotic homologues: (i) the CLC family of Cl-/H+ transporters and Cl- channels, (ii) aquaporin water channels and (iii) the 7TM receptor family of membrane proteins, including GPCRs. Finally, we will design a standalone program that aligns gene sequence, protein sequence and structural elements simultaneously. The end goal of this project is to develop an optimization algorithm that will allow any scientist to take a eukaryotic membrane protein that expresses poorly in E. coli and increase expression to yields that facilitate biochemical studies such as structure determination by cryo-EM.
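As an illustration of comparison (a), the following is a minimal sketch of how rare codon clusters might be located in a coding sequence so that their positions can be compared between native and E. coli-optimized genes. It is not the project's algorithm. The function names, the window size, the threshold and the rare-codon set are all assumptions made for this example; in particular, the codon set is a commonly cited shortlist and would need to be replaced by a measured codon-usage table for the host strain.

```python
# Illustrative shortlist of codons often described as rare in E. coli.
# This set is an assumption for the example, not a measured usage table.
RARE_CODONS = {"AGG", "AGA", "CGA", "CTA", "ATA", "CCC", "GGA", "CGG"}


def split_codons(cds):
    """Split an in-frame coding sequence (DNA or RNA) into DNA codons."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_clusters(cds, window=5, min_rare=3, rare=RARE_CODONS):
    """Slide a window of `window` codons along the sequence.

    Returns a list of (start_codon_index, rare_count) for every window
    that contains at least `min_rare` codons from the `rare` set.
    """
    codons = split_codons(cds)
    flags = [codon in rare for codon in codons]
    hits = []
    for start in range(len(codons) - window + 1):
        count = sum(flags[start:start + window])
        if count >= min_rare:
            hits.append((start, count))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: three rare codons flanked by common ones.
    native = "ATG" + "GCT" * 3 + "AGGAGACTA" + "GCT" * 3 + "TAA"
    print(rare_codon_clusters(native))
```

Running the same scan on a native gene and on its codon-optimized counterpart would show which clusters the optimization removed. That is the kind of difference the chimaera experiments are meant to test.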