Infection contributes to ~20% of human cancers worldwide. The list of known carcinogenic infectious agents, however, is surprisingly short, and no new human tumor viruses have been discovered over the past decade. Current tumor virus discovery methods are not comprehensive and are likely to miss new families of agents, so negative studies using these techniques do not rule out the presence of a previously unknown tumor virus. We propose modifying long Serial Analysis of Gene Expression (SAGE) as an unbiased means of virus discovery using in silico digital transcript subtraction (DTS). We show here a practical method to perform transcriptome-wide in silico subtraction of short transcript tags, allowing discrimination between human and nonhuman sequences. Once a candidate tumor virus sequence is found, it can be used as a starting point for viral genome walking and characterization. To demonstrate the feasibility of this method, we performed pilot studies of DTS on a tumor cell line latently infected with Kaposi's sarcoma-associated herpesvirus (KSHV). DTS rapidly and uniquely identified 5 KSHV transcripts de novo, comprising 0.44% of the total cell transcriptome. The technique was quantitatively reproduced by spiking KSHV-infected cell line RNA into uninfected human tumor tissue RNA. We also identified practical cut-off levels that distinguish most human polymorphisms from viral SAGE tags using DTS. Finally, we performed pilot DTS on 3 squamous cell conjunctival carcinoma (SCCC) tumors, an immunodeficiency-related malignancy. In silico subtraction of 108,000 SAGE tags generated 46 candidate sequences, including 12 high-probability tags that are being evaluated as possible SCCC agent sequences. We show that this technique is surprisingly robust to RNA degradation, so it can be used on rare or archival materials in which partial RNA degradation has occurred.
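The in silico subtraction step described above can be illustrated with a minimal sketch. It is not the pipeline used in the pilot studies: the function name `subtract_tags`, the parameters `min_count` and `max_mismatch`, and the toy tag sequences are all hypothetical, and the near-match filter is only one plausible way to set aside tags that differ from a human tag by a polymorphism or sequencing error. The actual cut-off levels are not given here.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatched positions between two equal-length tags."""
    return sum(x != y for x, y in zip(a, b))


def subtract_tags(observed_tags, human_tags, min_count=2, max_mismatch=1):
    """Return {tag: count} for tags that do not match the human reference.

    min_count and max_mismatch are illustrative cut-offs (assumptions),
    not the thresholds determined in the pilot studies.
    """
    counts = Counter(observed_tags)
    candidates = {}
    for tag, n in counts.items():
        if n < min_count:
            continue  # rare tags are more likely to be sequencing errors
        if tag in human_tags:
            continue  # exact human match: subtract
        near_human = any(
            len(h) == len(tag) and hamming(tag, h) <= max_mismatch
            for h in human_tags
        )
        if near_human:
            continue  # likely a human polymorphism: subtract
        candidates[tag] = n
    return candidates


# Toy data: two "human" reference tags and a small observed library.
human = {"CATGAAACCCGGGTTTA", "CATGTTTGGGCCCAAAT"}
observed = (
    ["CATGAAACCCGGGTTTA"] * 5    # exact human tag
    + ["CATGAAACCCGGGTTTC"] * 2  # one mismatch from a human tag
    + ["CATGGACGTACGTAGCA"] * 3  # nonhuman candidate
    + ["CATGCCCCCCCCCCCCC"]      # singleton, below min_count
)
print(subtract_tags(observed, human))
```

In this toy library only the tag seen three times with no close human match survives subtraction. A production version would query an indexed human transcript and genome database rather than scanning a Python set.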
We seek phase II R33 funding to perform DTS on 4 SCCC samples, extending our pilot studies into a full analysis of SCCC at the 5-10 transcripts per million level. This will allow us to either identify or exclude a likely tumor virus causing this immunodeficiency-related cancer. This work will also complete development of DTS technology, allowing us to fully optimize its performance for application to other suspected infectious tumors.
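To give a sense of what sensitivity at the 5-10 transcripts per million level implies for sequencing depth, the sketch below estimates the chance of sampling a transcript at a given abundance. It rests on assumptions not stated in the proposal: tags are treated as independent draws from the transcriptome, counts are approximated as Poisson, and the function names `detection_probability` and `tags_needed` are hypothetical.

```python
import math


def detection_probability(abundance_tpm, n_tags, min_hits=1):
    """Approximate P(at least min_hits tags) for a transcript at abundance_tpm.

    Assumes independent sampling of tags; uses a Poisson approximation.
    """
    lam = abundance_tpm * 1e-6 * n_tags  # expected number of tags observed
    p_below = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(min_hits)
    )
    return 1.0 - p_below


def tags_needed(abundance_tpm, target=0.95):
    """Tags required to observe the transcript at least once with P >= target."""
    return math.ceil(-math.log(1.0 - target) / (abundance_tpm * 1e-6))


# Pilot-scale depth (108,000 tags) versus a transcript at 10 per million.
print(round(detection_probability(10, 108_000), 2))
print(tags_needed(10), tags_needed(5))
```

Under these assumptions, a library of 108,000 tags has roughly a two-in-three chance of containing at least one tag from a transcript present at 10 per million, and on the order of 300,000 to 600,000 tags would be needed for a 95% chance at 10 and 5 per million respectively.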