Comparative analysis of the amino acid sequences of the initiator proteins and of the nucleotide sequences of the replication origins for rolling circle DNA replication (RCR) was performed in order to predict unidentified functional sites, including the amino acid residues that covalently link to DNA, and to assess the relationships between replicons exploiting RCR. RCR is one of the basic mechanisms of DNA replication, exploited by numerous small single-stranded (ss) and double-stranded DNA replicons from eubacteria and archaebacteria, and by at least two groups of small eukaryotic viruses, parvoviruses and geminiviruses. Detailed comparisons of the amino acid sequences of RCR initiator proteins led to the identification of a motif that consists of the sequence HisHydrHisHydrHydrHydr (where Hydr is a bulky hydrophobic residue) and is conserved in two vast classes of proteins, one of which is involved in RCR proper (Rep proteins), and the other in mobilization (conjugal transfer) of plasmid DNA (Mob proteins). Based on analogies with metalloenzymes, it is hypothesized that the two conserved His residues in this motif may be involved in the metal ion coordination required for the activity of the Rep and Mob proteins. Rep proteins contained two additional conserved motifs, one located upstream and the other downstream of the "two His" motif. The C-terminal motif encompassed the Tyr residue(s) forming the covalent link with nicked DNA. Mob proteins were characterized by the opposite orientation of the conserved motifs, with the (putative) DNA-linking Tyr located near their N-termini. Both the Rep and Mob protein classes further split into several distinct families. Although it was not possible to find a motif or pattern that would be unique for the entire Rep or Mob class, unique patterns were derived for large subsets of the proteins of each class.
These observations allowed the prediction of the amino acid residues involved in DNA nicking, which is required for the initiation of RCR or conjugal transfer of ssDNA, in Rep and Mob proteins encoded by a number of replicons of highly diverse size, structure and origin. It is conjectured that recombination has played a major part in the dissemination of genes encoding related Rep or Mob proteins among the replicons exploiting RCR. It is speculated that the eukaryotic small ssDNA replicons that encode proteins with the conserved RCR motifs and replicate via RCR-related mechanisms, i.e., geminiviruses and parvoviruses, may have evolved from eubacterial replicons. The project's significance lies in the prediction of the functional sites in a number of widely studied proteins and the demonstration of apparent evolutionary links between a number of diverse eubacterial, archaebacterial, and eukaryotic replicons. Experiments designed to test the predictions of the functional sites in the parvovirus replication initiation proteins are in progress in D. Tattersall's lab.
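
As an illustration of the "two His" motif described above (HisHydrHisHydrHydrHydr), the following minimal sketch scans a protein sequence for that pattern with a regular expression. It is not the method used in this project. The set of residues counted as "bulky hydrophobic" (I, L, V, M, F, Y, W) is an assumption, since the text does not enumerate them, and the function name and the example sequences are hypothetical.

```python
import re

# ASSUMPTION: residues treated as "bulky hydrophobic" (Hydr).
# The source text does not list them; adjust this set as needed.
HYDR = "ILVMFYW"

# His-Hydr-His-Hydr-Hydr-Hydr in one-letter amino acid code.
# The lookahead lets the scan report overlapping candidates as well.
TWO_HIS_MOTIF = re.compile(rf"(?=(H[{HYDR}]H[{HYDR}]{{3}}))")


def find_two_his_motif(sequence):
    """Return (0-based start, matched segment) for each motif candidate."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TWO_HIS_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequences, made up for illustration only.
    print(find_two_his_motif("MKTAHLHVLIGG"))
    print(find_two_his_motif("MKTAHLHGLIGG"))
```

A hit from such a scan is only a candidate: a pattern this short can occur by chance, so it would need to be checked against the flanking conserved motifs and the position of the putative DNA-linking Tyr.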