The overarching goal of this program is to develop a database in which all the data generated are easily accessible to the scientific community. The database must be complete for its value to future scientific research to be realized. Two components underlie the database development: assessing the impact of the range of current (already 2nd generation) and newly emerging sequencing technologies, and applying them to bona fide tissue samples at a large enough scale that the results have sufficient power to support conclusions. The data generated by the project's completion will provide much-needed information for evaluating these technologies and will inform future directions for TARGET and large-scale genomic cancer research overall. This project is done in collaboration with the NCI OHAM. The tumor type selected was diffuse large B-cell lymphoma (DLBCL). There are 92 cases of DLBCL from the general population. The processing, quality control, and nucleic acid isolation for all DLBCL cases are covered by these funds. The characterization undertaken is sequence-based: whole-genome sequencing (of tumor and normal DNA) and transcriptome sequencing (RNA). This scope of sequencing allows the discovery of mutations in both coding and non-coding genomic regions, as well as the determination of gene expression profiles and genomic alterations (including translocations, insertions, and deletions). The 2nd generation sequencing technology used for this project is highly cost-effective, high-throughput, and efficient.