Adeno-associated viruses (AAVs) are naturally defective parvoviruses that are being developed as delivery systems (vectors) for gene therapies to treat diabetes and many other chronic diseases, including obesity, congenital blindness, cystic fibrosis, hemophilia B, and glycogen storage diseases. Adeno-associated virus type 2 (AAV2) preferentially integrates its DNA at a 4 kilobase (kb) region of human chromosome 19, referred to as AAVS1. The preferential integration of AAV2 DNA into the chromosome 19 locus shares some features with the reproducible locus-to-locus chromosomal translocations that are believed to be involved in the progression of some forms of cancer. Site-specific integration at AAVS1 requires the AAV2 replication (Rep) proteins and specific sequences within AAV2 and AAVS1. AAVS1 contains a 16 base pair (bp) Rep binding site (RBS) and a closely spaced Rep nicking site (also referred to as a terminal resolution site, or trs). A short (33 bp) AAVS1 sequence that includes the RBS and trs is sufficient to target AAV2 integration into an episome in the presence of Rep proteins. Most AAV2-AAVS1 junctions map to the 145 base inverted terminal repeats (ITRs) present at each end of the 4.7 kb, single-stranded DNA genome of AAV2. The ITRs are necessary for AAV2 replication and packaging. Each ITR also contains an RBS, but only the 3' ITR contains a trs nicking site. Both elements are required for replication and packaging. AAV2 has a relatively low frequency of integration, probably because, unlike in retroviruses, integration is not an obligatory part of the AAV2 life cycle. We and others have noted that the majority of AAV2-AAVS1 junctions occur at short regions of homology between AAV2 and AAVS1.
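As an illustration only, short regions of homology between two sequences can be located by listing the k-mers they share. The following Python sketch is not the method used in this work: the function name `shared_kmers`, the choice of k, and the two toy sequences are all hypothetical, and the toy sequences are not real AAV2 or AAVS1 sequences. It compares one strand only and ignores reverse complements and mismatches.

```python
def shared_kmers(seq_a, seq_b, k):
    """Return {kmer: (positions_in_a, positions_in_b)} for every
    length-k substring that occurs in both sequences (0-based positions)."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()

    # Index every k-mer of the first sequence by its start positions.
    index_a = {}
    for i in range(len(seq_a) - k + 1):
        index_a.setdefault(seq_a[i:i + k], []).append(i)

    # Scan the second sequence and record k-mers that also occur in the first.
    hits = {}
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        if kmer in index_a:
            positions_a, positions_b = hits.setdefault(kmer, (index_a[kmer], []))
            positions_b.append(j)
    return hits


# Toy placeholder sequences (hypothetical, for demonstration only).
toy_virus = "ACGTTGCA"
toy_host = "TTGCAGGA"
print(shared_kmers(toy_virus, toy_host, 4))
# → {'TTGC': ([3], [0]), 'TGCA': ([4], [1])}
```

In this toy case the two overlapping 4-mers together mark a 5-base shared stretch (TTGCA). A real analysis would also need to consider the opposite strand and tolerate imperfect matches.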
Others have also shown that single-stranded, packaged recombinant AAV2 genomes containing long stretches (>600 bases) of homology with a target host gene correct host gene point mutations more efficiently than the same genomes introduced as double-stranded plasmids. The mechanism of this correction is presumed to involve homologous recombination. We therefore hypothesized that increasing sequence homology between AAV2 and AAVS1 might increase either the frequency or the site-specificity of AAV2 integration. Our first approach was to insert DNA sequences from AAVS1 into the wild-type AAV2 genome. Because the amount of sequence that can be added is limited by the packaging capacity of AAV2, we started with a modest insert of 49 bases. Our second approach was to expand an existing region of homology by replacing the RBS/nicking site region of the AAV2 ITRs with the corresponding region from AAVS1. This second strategy does not increase the size of the AAV2 genome, but there was concern that the AAVS1 sequence might not contain all of the sequence elements required for AAV2 replication, packaging, and integration. Although we did not detect an increase in integration specificity, our results do indicate that the RBS and nicking site elements from AAVS1 and AAV2 are functionally interchangeable. Sequencing of integration junctions showed the joining of the modified ITRs to AAVS1 sequences. We observed comparable amounts of DNase-resistant (packaged) AAV2 genomes with wild-type ITRs and with both ITRs substituted with AAVS1 sequences. The size range of nuclease-resistant virus DNA was similar for modified and wild-type AAV2. To our knowledge, this is the first time that host DNA sequences, other than those of a clearly integrated provirus, have been able to substitute for a viral origin of replication. A fundamental question in virology concerns the origins of viral DNA sequences.
The RBS/trs combination at the MBS85 gene (the AAVS1 locus in humans) has also been detected in mice and African green monkeys. Although it cannot be formally ruled out that this sequence is the remnant of an AAV2 integration event that occurred prior to the rodent-primate evolutionary divergence, a more intriguing possibility is that the AAV2 origin of replication is derived from this genomic sequence. These results also suggest a level of sequence flexibility that could promote rapid evolutionary divergence of AAVs. Our work also provides the first positive evidence for a packaging signal at the AAV2 nicking site. Work from our group and others indicates that the nicking site lies within a stable secondary structure that can form only after the two DNA strands are separated. This strand separation is believed to be driven by the site-specific helicase activity of the AAV2 Rep proteins, which targets the nearby Rep binding site (RBS). Another lab has shown that packaging is greatly impaired (over 50-fold) when the 3' end, but not the 5' end, of the single-stranded genome is missing 18 bases from the inboard end of the ITR. Based on previous work by our group and others, an ITR with this deletion would be unable to form the stable secondary structure at the nicking site. In our mutated ITR, the 11 base sequence from AAVS1 (chromosome 19), which essentially replaces the 18 bases deleted by the previously mentioned group, has only 2 non-adjacent bases of sequence identity with the wild-type AAV2 sequence. It is therefore a reasonable inference that the stable secondary structure, the only other known commonality between the two sequences, is part of the packaging signal.