Our overall goal is to reduce to practice an innovative method for empirically identifying the optimal codon usage for any gene when the intent is to maximize protein accumulation, in either heterologous or homologous expression systems. In this Phase I project we explain how the method will work and demonstrate its usefulness by identifying highly expressing codon variants of a human Factor IX gene in both plant and mammalian expression systems. The method is based on our observation that mRNA and protein accumulation are highly correlated in our plant transient expression system; the literature supports a similar correlation in mammalian and yeast systems. We recently found that when four divergent synonymous codon variants of one immunoglobulin heavy chain were expressed together in the same plant, the relative abundance of the four mRNAs closely matched their relative abundance when expressed separately. From this we conceived a new method, which we call Massively Parallel Synonymous Codon Variant Screening (MPSCVS), that should allow a very large number of gene variants to be compared in a single experiment. Demonstration of our system will begin with a library of approximately 59,000 synonymous codon variants of Factor IX in an adeno-associated virus (AAV) gene therapy vector. We will use the AAV library to transduce the livers of mice in vivo and will isolate RNA from the mouse livers. We will also subclone an aliquot of the library into a plant expression vector and use it to transform Agrobacterium tumefaciens, which is used for plant (Nicotiana benthamiana) transformation. We will express the library transiently in tobacco leaves, harvest leaf tissue and isolate total RNA. The mRNA from both plants and mouse livers will be used to produce double-stranded cDNA, which will be "counted" by next-generation sequencing to identify the most abundant mRNAs in both systems.
Clones of high-expressing codon variants will then be tested individually for protein and RNA expression in the appropriate expression system. Finally, we will analyze the degree of correlation between RNA and protein expression and determine the overall efficiency of MPSCVS in identifying the best-expressing variants.
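The counting and correlation steps described above can be illustrated with a minimal sketch. It assumes that each sequencing read has already been assigned to a variant identifier; the identifiers, the toy read list, and the function names `relative_abundance`, `top_variants`, and `pearson` are hypothetical and are not part of the MPSCVS protocol itself.

```python
from collections import Counter
from math import sqrt


def relative_abundance(variant_calls):
    """Return {variant_id: fraction of reads} from per-read variant IDs."""
    counts = Counter(variant_calls)
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}


def top_variants(abundance, k):
    """Return the k most abundant variants as (variant_id, fraction) pairs."""
    ranked = sorted(abundance.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy data (hypothetical): one variant ID per sequenced cDNA read.
reads = ["v1"] * 5 + ["v2"] * 3 + ["v3"] * 2
abundance = relative_abundance(reads)
best = top_variants(abundance, 2)
```

In practice the per-variant RNA abundances from the pooled screen would be paired with protein measurements from the individually tested clones and passed to `pearson` to estimate how well mRNA abundance predicts protein accumulation.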