Cancer continues to exact a massive socio-economic burden on the United States and the world community, and early detection and targeted therapy remain the dual core goals of cancer research. Cancer results from accumulated somatic and/or germline mutations, and several genomic aberrations have emerged as successful diagnostic and prognostic markers and therapeutic targets, such as the BCR-ABL1 gene fusion in chronic myeloid leukemia (CML), PDGFR mutation in gastrointestinal stromal tumors, ERBB2 amplification in breast cancers, and EGFR mutations in lung cancers. To systematically discover key genetic aberrations in cancers, genome-wide sequencing of candidate genes has been undertaken (1-5), and based on these global molecular analyses it has been argued that key cancer genes act in concert with a battery of other diverse genomic aberrations (6). The recent advent of next-generation sequencing platforms (7) has made it possible to address the ambitious goal of delineating the landscape of cancer genome aberrations (8, 9). While whole-genome sequencing throughput continues to improve, cancer genomic studies, which often require analysis of scores, if not hundreds, of samples, still face practical bottlenecks. The cancer genome is often highly aneuploid (aberrant chromosome numbers) or polyploid (aberrant sets of chromosomes), and almost always highly rearranged, with several areas of gains and losses (10). Adequately analyzing these complex sequences requires extra-deep coverage of the genome, which is not yet routinely feasible (or economical) over large sample sizes. Therefore, we have considered a complementary approach that focuses on the 'expressed' component of the genome, namely the transcriptome.
Sequencing the transcriptome provides in-depth coverage of the genomic coding sequences and serves as a direct readout of gene expression, alternatively spliced isoforms, chimeric transcripts, and mutations, thus enriching the data for 'functional' aberrations. We recently applied transcriptome sequencing to discover multiple novel gene fusions and RNA chimeras in prostate cancer, including a recurrent chimera, SLC45A3-ELK4, in a subset of prostate cancer tissues (11). Subsequently, in a proof-of-concept study, we improved our technique by developing 'paired-end transcriptome sequencing' to systematically identify gene fusions and chimeric transcripts in cancers (12). We are now focused on applying transcriptome sequencing to discover recurrent gene fusions and other transcript aberrations in pancreatic cancer, the fourth most common cause of cancer-related deaths in the US, with the worst prognosis of all major malignancies (5-year survival < 3%), making it a major public health concern and an exquisitely challenging biomedical research problem (13). The aim of this proposal is to discover novel cancer-specific, recurrent gene fusions and other signature genetic/transcriptomic aberrations in pancreatic cancer that could be characterized further to develop early diagnostic markers and therapeutic targets. The specific aims are to (1) generate high-throughput transcriptome sequencing data from pancreatic cancer cell lines, pancreatic adenocarcinomas, and matching normal tissues; (2) bioinformatically identify cancer-specific, recurrent gene fusions, chimeric transcripts, nonsynonymous coding mutations, and gene expression signatures in pancreatic cancer, validate candidate aberrations, and screen larger sample cohorts to determine recurrence; and (3) functionally validate novel, recurrent, or potentially driver aberrations with clinical implications.
Overall, we envision discovering a pathognomonic gene fusion or other transcript aberrations in pancreatic cancers, akin to BCR-ABL1 in CML or TMPRSS2-ERG in prostate cancers, and providing a general roadmap for similar discoveries in other common cancers. PUBLIC HEALTH RELEVANCE: Characterization of key genetic aberrations in cancers is central to the development of early diagnostic markers and effective therapeutic targets. High-throughput whole-genome sequencing applications represent the most powerful tools to address this problem but are limited by logistical and analytical considerations. Therefore, this proposal seeks funding to adapt high-throughput next-generation sequencing into a complementary approach, transcriptome sequencing analysis, to identify novel, recurrent gene fusions and other transcript aberrations in cancer that can be further characterized to develop early diagnostic markers and novel therapeutic candidates. As a test case, we propose to analyze pancreatic cancer, which is the fourth most common cause of cancer-related deaths in the US, with the worst prognosis of all major malignancies (5-year survival < 3%).