The development of leukemia has been observed in 5 out of 19 SCID-X1 patients treated by stem cell gene therapy. In all 5 patients, leukemia was caused by insertional mutagenesis, i.e., the retrovirus-mediated activation of nearby proto-oncogenes. Thus, the careful study of integration sites and of the contribution of individual clones to repopulation will be of crucial importance for all gene therapy studies, and the FDA has mandated careful monitoring of retroviral integration sites in all clinical gene therapy studies. The Vector Integration and Tracking Core D will provide all projects with a centralized facility to efficiently identify foamy virus (FV) vector integration sites in complex biological DNA samples from in vitro or in vivo studies. We will use the recently developed, improved non-restriction (nr)LAM-PCR method. The Core will be able to process a variety of mouse, dog, or human samples to generate DNA for shuttle vector or PCR-based methods. For nrLAM-PCR, as little as 10 ng of DNA can be used to carry out FV vector integration site amplification and extended processing in preparation for sequencing. Samples processed by Core D will be compatible with both Sanger sequencing for pilot experiments and newer sequencing methodologies, such as pyrosequencing, for deep sequencing to identify all of the amplifiable FV vector integration sites in a given sample. Core D will also help all projects design primers specific to FV vector integration sites, allowing DNA-based real-time (RT)-PCR tracking to assess the contribution of individual clones that are deemed important for further investigation. In addition, Core D will provide a centralized facility to analyze integration sites via a Common Gateway Interface (CGI)-Perl web server. Current versions of the human (hg19), dog (canFam2), and mouse (mm9) genomes will be supported.
The Core will also support investigators with Perl programs that correlate FV vector integration sites with data from published databases, including proto-oncogene transcription start sites (TSS), microarray data, and over-represented gene classes. The bioinformatics component will also generate random datasets for all three genomes (human, mouse, and dog) to evaluate over-represented gene classes near vector proviruses.
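The comparison of observed integration sites against random datasets could look roughly like the sketch below. It is written in Python rather than the Perl the Core describes, and every name in it (`nearest_tss_distance`, `fraction_near_tss`, `random_sites`, `empirical_p_value`), the window size, and the toy coordinates are illustrative assumptions, not the Core's actual programs or data. It asks one simple question: is the fraction of integration sites lying within a fixed window of a TSS higher than in size-matched sets of uniformly random genomic positions?

```python
import bisect
import random


def nearest_tss_distance(position, sorted_tss):
    """Distance in bp from `position` to the closest TSS.

    `sorted_tss` must be a non-empty, ascending list of TSS coordinates
    on the same chromosome as `position`.
    """
    i = bisect.bisect_left(sorted_tss, position)
    candidates = []
    if i < len(sorted_tss):
        candidates.append(sorted_tss[i] - position)      # next TSS at or after
    if i > 0:
        candidates.append(position - sorted_tss[i - 1])  # previous TSS
    return min(candidates)


def fraction_near_tss(sites, tss_by_chrom, window):
    """Fraction of (chrom, pos) sites within `window` bp of any TSS."""
    if not sites:
        return 0.0
    hits = 0
    for chrom, pos in sites:
        tss = tss_by_chrom.get(chrom)
        if tss and nearest_tss_distance(pos, tss) <= window:
            hits += 1
    return hits / len(sites)


def random_sites(chrom_sizes, n, rng):
    """Draw n uniform genomic positions, weighting chromosomes by length."""
    chroms = list(chrom_sizes)
    weights = [chrom_sizes[c] for c in chroms]
    picks = rng.choices(chroms, weights=weights, k=n)
    return [(c, rng.randrange(chrom_sizes[c])) for c in picks]


def empirical_p_value(observed_sites, tss_by_chrom, chrom_sizes,
                      window, n_sets, seed=0):
    """Compare the observed near-TSS fraction with random control sets.

    Returns (observed_fraction, p), where p is the add-one empirical
    probability that a random set is at least as TSS-enriched.
    """
    rng = random.Random(seed)
    observed = fraction_near_tss(observed_sites, tss_by_chrom, window)
    at_least = 0
    for _ in range(n_sets):
        control = random_sites(chrom_sizes, len(observed_sites), rng)
        if fraction_near_tss(control, tss_by_chrom, window) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_sets + 1)


if __name__ == "__main__":
    # Toy values only: not real hg19, canFam2, or mm9 coordinates.
    toy_tss = {"chr1": [1000, 5000]}
    toy_sizes = {"chr1": 1_000_000}
    toy_sites = [("chr1", 1010), ("chr1", 4990), ("chr1", 5050)]
    print(empirical_p_value(toy_sites, toy_tss, toy_sizes,
                            window=100, n_sets=99))
```

Uniform sampling is the simplest possible control. A real analysis would likely need to exclude assembly gaps and unmappable regions and to use the actual chromosome sizes and TSS annotations of the supported assemblies.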