The catalog of human splice variant mRNAs is believed to be far from complete, with many undiscovered isoforms likely to play an important role in health and disease. The current practice of hunting for splice variants by screening cDNA libraries one at a time is both time-consuming and expensive. Higher-throughput and more cost-effective screening methods would help accelerate this important enterprise. The aim of this project is to create a valuable research tool for the rapid identification of alternative splice isoforms. The Library Sampler described in this proposal is a collection of high-quality cDNA libraries organized in two panels of 384 low-complexity pools that can be used to identify splice variants, polymorphisms and mutants from a broad range of normal tissues, tumors and cancer cell lines. The main advantage of the proposed work is that it allows a simple PCR approach to efficiently screen a large diversity of human sources represented by over 9 million cDNA clones. Because of the low complexity of the cDNA pools, the PCR products can be sequenced directly and used in sub-cloning experiments, giving the researcher an immediate answer about the set of mRNA isoforms corresponding to the gene of interest. A graphical web interface will be developed using a comprehensive eukaryotic splice variant database to assist scientists with their experimental design, and a set of novel splice isoforms from a list of selected genes will be identified to demonstrate the value of this new discovery tool. The Library Sampler will provide the research community with a powerful system to quickly identify all variant transcripts expressed from a single gene. Cataloging all the mRNA isoforms will be fundamental to understanding the complexity of the human proteome, and since many of these variants have been associated with genetic diseases, their discovery will offer new opportunities for drug development and molecular diagnostics.
In October 2004 the International Human Genome Sequencing Consortium reduced the estimated number of human protein-coding genes to only 20,000 to 25,000, a surprisingly low number for our species, suggesting that gene regulation is far more important than gene number. It is well known that most of our genes produce more than one protein by alternative splicing of their primary transcript, but the identification of all these splice variant forms is far from complete. The proposed Library Sampler provides a powerful research tool to rapidly identify these splice isoforms from a broad range of normal tissues, tumors and cancer cell lines, and because many splice isoforms have been associated with human genetic diseases, their discovery will offer new opportunities for drug development and molecular diagnostics.