A complete and accurate collection of all expressed genes is critical for genomic and proteomic research on any genome. However, despite significant efforts to clone all expressed human genes, approximately 20% of the predicted protein-coding genes are incomplete or missing from reference cDNA collections. The instability or toxicity of many protein-coding genes in E. coli is likely to play a major role in this problem. The goal of this proposal is to develop linear and circular shuttle vectors to clone these otherwise "unclonable" genes in E. coli and express them in mammalian cells. The vectors will be transcriptionally silent in E. coli but will allow active expression in mammalian cells. A new linear vector that shows high stability for cloning otherwise unstable DNAs will be modified to prevent fortuitous expression in E. coli. It will also be coupled with signals that direct mammalian transcription and translation. A major benefit of this project is that it will allow researchers to obtain and express a wide variety of genes from the human and other genomes that are currently "unclonable". Such genes are likely to have large coding regions (e.g., >10 kb), unusual base composition, or repetitive sequences. Significantly, these vectors will allow analysis of unstable regions of the human genome, which are the cause of numerous heritable diseases. Future derivatives will include systems specifically for bacterial expression of large, toxic, or unstable cDNAs.

PUBLIC HEALTH RELEVANCE: A complete and accurate collection of expressed genes is critical for genomic and proteomic research on any genome. Despite significant efforts to discover and clone all expressed human genes, approximately 20% of the potential protein-coding regions are incomplete or missing from reference cDNA collections. The inability to clone many protein-coding genes in E. coli is likely to play a major role in this problem.
The goal of this proposal is to develop linear and circular shuttle vectors to clone otherwise "unclonable" genes for expression in mammalian cells. The vectors will be transcriptionally silent in E. coli but active in mammalian cells. A new linear vector that shows high stability for cloning otherwise unstable DNAs will be modified to prevent fortuitous expression in E. coli. It will also be coupled with signals that direct mammalian transcription and translation. A major benefit of this project is that it will allow researchers to clone genes that have large coding regions (e.g., >10 kb), unusual base composition, or repetitive sequences. Future research will include the development of similar vectors specifically for prokaryotic expression of large or toxic cDNAs.