The purpose of this project is to determine the role that repetitive elements (REs) play in the biological outcome of environmental exposures. While it is known that the expression of REs changes in response to environmental agents, the mechanisms by which REs affect the biology of cells and organisms have not been explored in depth. We are specifically interested in the extent to which REs alter the expression of adjacent genes through the formation of fusion transcripts (FTs). We chose RNA-seq to study this problem and have developed a robust analytical pipeline to detect FTs. We analyzed a data set associated with cocaine exposure, selected because our collaborator, Dr. Eric Nestler, previously demonstrated that the expression of REs is altered in the brains of mice treated with cocaine. We also reasoned that identifying cocaine-responsive FTs could provide a link between FTs and environmental exposures.
In the past fiscal year we finalized our analytical strategy to identify FTs genome-wide using RNA-seq data. The work describing the strategy was recently published (Wang et al., 2016) and is now available to the scientific community. To follow up on this initial work, we have focused on establishing whether FTs have biological functions that may impact organismal physiology. To this end we have concentrated on two genes that express FTs: Pgc1alpha and Arhgef10. Pgc1alpha, or peroxisome proliferator-activated receptor gamma coactivator 1-alpha, is a master metabolic regulator that was identified in 2004 as a transcriptional co-activator of the mitochondrial biogenesis program. Arhgef10 is a Rho guanine nucleotide exchange factor that ultimately regulates the actin cytoskeleton in a way that can influence cellular morphology, migration, and cytokinesis.
We cloned the canonical and FT isoforms of both genes (with and without a C-terminal tag) with the goal of producing infectious viral particles to express each isoform in cultured cells and in the brains of mice. Both the cells and the mice will be phenotyped to evaluate whether the FT isoforms have specific biological functions. For Arhgef10, our initial analysis (RNA-seq and quantitative PCR) showed that cocaine exposure produced a 40% increase in the level of one of the two FT isoforms identified for this gene. Preliminary results indicated that stereotaxic injection of viral particles expressing one Arhgef10 FT into the brains of male mice can blunt cocaine reward behavior. We will repeat these in vivo experiments using the additional FT constructs and controls. We will then use the cell culture models to characterize whether the FTs give rise to new protein isoforms with biological functions that may differ from those of the canonical isoform of the gene.
For Pgc1alpha, our analytical strategy identified two repeat-containing isoforms in the mouse brain. The first involves a simple sequence repeat (SSR), located about 500 kb upstream of the canonical promoter, that splices to exon 2 of the gene. The second involves the same SSR, which splices to a SINE (short interspersed nuclear element) about 250 kb downstream of it, which in turn splices to exon 2. Analysis of various publicly available RNA-seq data sets revealed that both of these new FT isoforms are brain specific. Moreover, within the brain, the SSR-SINE-exon 2 isoform seems to be the only isoform of Pgc1alpha expressed in neurons, while the SSR-exon 2 isoform is expressed only in oligodendrocytes. We also analyzed publicly available ribosome profiling data sets and found evidence that the SSR-containing isoforms are actively translated in the brain.
Additional support that these new FTs make proteins comes from our finding that the cloned SSR-SINE-exon 2 isoform carrying a myc tag gave rise to a protein of the expected size, as judged by Western blot. We recently generated mice that specifically lack the FT isoforms of Pgc1alpha in the brain by deleting either the SSR or the SINE using CRISPR/Cas9 technology. While the SSR deletion should eliminate both brain isoforms, the SINE deletion should eliminate only the neuronal form. Because neither of these isoforms is expressed in other tissues, we expect to be able to dissect the role of these FTs in the brain. We do not yet know whether the FT-derived proteins, which we predict will have different N-termini, will have the same biological function as the proteins derived from the canonical isoforms of Pgc1alpha. Interestingly, data in the literature suggest that the N-terminal region of PGC1alpha regulates the choice of downstream targets. Therefore, we are testing whether the proteins encoded by the FTs have different downstream targets. We predict that the protein expressed from the SSR-SINE-exon 2 FT replaces the 16 amino acids encoded by exon 1 with 6 encoded by the downstream SINE element. We also predict that the protein expressed from the SSR-exon 2 FT has an N-terminus of 29 residues encoded within the SSR. These possibilities will first be tested in our cell culture model overexpressing these proteins and then confirmed in the animal models.