One in five cancer cases worldwide is caused by infection (International Agency for Research on Cancer, 2002), yet only seven viruses have been established to cause human cancers. Our laboratory discovered and characterized two of these agents, KS herpesvirus (KSHV/HHV8) and Merkel cell polyomavirus (MCV), both through highly directed genomic searches. These findings initiated new fields in cancer biology and provided new bases for improved diagnosis, treatment and prevention of cancers caused by these viruses. Traditional approaches to cancer genetics have focused almost exclusively on somatic cell mutations, and large-scale sequencing studies are likely to miss viruses that might be causing some human tumors. Over the past four years, seven new human polyomaviruses (including MCV, known to cause Merkel cell carcinoma (MCC)) have been discovered that encode T antigen oncoproteins and are credible candidate tumor viruses. Although most adults are chronically infected with these viruses, none were found through cancer genome anatomy project sequencing studies. The NCI Director's Provocative Question Initiative #12 arises from our recent MCV-related MCC studies showing that MCV, a common commensal skin infection, initiates tumors after specific mutations to the viral genome. This is a new mechanism for carcinogenesis that will not necessarily be found through cancer cell genome sequencing projects unless a highly directed search for nonhuman viral sequences is performed in an orderly fashion. To identify human cancer viruses, we developed digital transcriptome subtraction (DTS), a deep sequencing approach that depends on generation of high-fidelity sequence databases and provides quantitative data on tumor cell transcription. Although standard deep sequencing has been used by others to search for viruses, successful analysis is highly dependent on specific technical skills and assumptions.
In 2008, we used DTS to discover MCV mRNA sequences in MCC, allowing subsequent full viral genome sequencing and confirmatory studies. In this application, we maximize the likelihood of identifying novel cancer virus(es) in hematologic malignancies through both directed and unbiased approaches. Our directed approach will survey a large panel of hematologic malignancies for the presence of the new human polyomaviruses. Using information gained from the biology of MCV, we will determine whether tumor-specific viral mutation patterns are present that differentiate causal from incidental viral infections. Our unbiased approach will use DTS to examine highly selected, gold-standard cases of EBV-negative post-transplant lymphoproliferative disorder (EN-PTLD) for the presence of novel viral transcripts. High PHRED-equivalent sequencing of EN-PTLD tumors to the <1 transcript per million (TPM) level will be performed to identify novel viral transcripts. These data will be compared to DTS on EBV-positive PTLD and CD19+ peripheral B cells to 1) confirm virus transcript detection and 2) determine differential cellular gene expression patterns between EBV-negative and EBV-positive disease. We will also initiate an exploratory collaboration with the Pacific Northwest National Laboratory to develop the next generation of tumor virus discovery technology using unsupervised LC-MS/MS proteomics of whole EN-PTLD tissue samples. Comparisons of DTS (transcript) and LC-MS/MS (peptide) data, obtained from the same tissue samples, will allow a more precise subtraction of human sequence data to identify novel viruses present in tumors. With the completion of these aims, we anticipate being able to answer whether one of the six new human polyomaviruses contributes to hematolymphoid malignancies, and we will determine whether or not EN-PTLD harbors a novel virus.
This systematic approach provides the highest probability of finding a new human cancer virus, and we will develop new technological approaches that can be widely used in future searches for infections in human cancer.