The provenance of several components of major uniquely eukaryotic molecular machines is increasingly being traced back to prokaryotic biological conflict systems. L. Aravind and his team demonstrated that the N-terminal single-stranded DNA-binding domain from the anti-restriction protein ArdC, deployed by bacterial mobile elements against their host, was independently acquired twice by eukaryotes, giving rise to the DNA-binding domains of XPC/Rad4 and the Tc-38-like proteins in the stem kinetoplastid. In both instances, the ArdC-N domain was tandemly duplicated, forming an extensive DNA-binding interface. In XPC/Rad4, the ArdC-N domains (BHDs) also fused to the inactive transglutaminase domain of a peptide-N-glycanase ultimately derived from an archaeal conflict system. They also delineated several parallel acquisitions from conjugative elements and bacteriophages that gave rise to key components of the kinetoplast DNA (kDNA) replication apparatus. These findings resolve two outstanding questions in eukaryote biology: (1) the origin of the unique DNA lesion-recognition component of nucleotide excision repair (NER) and (2) the origin of the unusual, plasmid-like features of kDNA.

The evolution of release factors, which catalyze the hydrolysis of the final peptidyl-tRNA bond and the release of the polypeptide from the ribosome, has been a longstanding paradox. While the components of the translation apparatus are generally well conserved across extant life, structurally unrelated release factor peptidyl hydrolases (RF-PHs) emerged in the stems of the bacterial and archaeo-eukaryotic lineages. L. Aravind and his team analyzed the diversification of RF-PH domains within the broader evolutionary framework of the translation apparatus. From this, they reconstructed the possible state of translation termination in the Last Universal Common Ancestor, which may have involved tRNA-like terminators.
Further, the evolutionary trajectories of the several auxiliary release factors in ribosome quality control (RQC) and rescue pathways point to multiple independent solutions to this problem and to frequent transfers between superkingdoms, including the recently characterized ArfT, which is more widely distributed across life than previously appreciated. The eukaryotic RQC system was pieced together from components of disparate provenance, including the long-sought-after Vms1/ANKZF1 RF-PH of bacterial origin. They also uncovered an underappreciated evolutionary driver of innovation in rescue pathways: effectors deployed in biological conflicts that target the ribosome. At least three rescue pathways (centered on the prfH/RFH, baeRF-1, and C12orf65 RF-PH domains) likely arose in response to such conflicts.

Numerous, diverse, highly variable defense and offense genetic systems are encoded in most bacterial genomes and are involved in various forms of conflict among competing microbes or with their eukaryotic hosts. In collaboration with Dr. Eugene Koonin's group, Dr. L. Aravind focused on the offense and self-versus-nonself discrimination systems encoded by archaeal genomes, which so far have remained largely uncharacterized and unannotated. Specifically, they analyzed archaeal genomic loci encoding polymorphic and related toxin systems and ribosomally synthesized antimicrobial peptides. Using sensitive methods for sequence comparison and the guilt-by-association approach, they identified such systems in 141 archaeal genomes. These toxins can be classified into four major groups based on the structure of the components involved in toxin delivery. The toxin domains are often shared within and between these systems. They revisited the halocin families and substantially expanded the halocin C8 family, which was identified in diverse archaeal genomes and also in certain bacteria.
Finally, they employed features of protein sequences and genomic locus organization characteristic of archaeocins and polymorphic toxins to identify candidates for analogous, but not necessarily homologous, systems among uncharacterized protein families. This work confidently predicts that more than 1,600 archaeal proteins, currently annotated as hypothetical in public databases, are components of conflict and self-versus-nonself discrimination systems. Diverse and highly variable systems involved in biological conflicts and self-versus-nonself discrimination are ubiquitous in bacteria but much less studied in archaea. They performed comprehensive comparative genomic analyses of the archaeal systems that share components with analogous bacterial systems and proposed an approach to identify new systems that could be involved in these functions. They predicted polymorphic toxin systems in 141 archaeal genomes and identified new, archaea-specific toxin and immunity protein families. These systems are widely represented in archaea and are predicted to play major roles in interactions between species and in intermicrobial conflicts. This work is expected to stimulate experimental research to advance the understanding of poorly characterized major aspects of archaeal biology.

A diverse collection of enzymes comprising the protocatechuate dioxygenases (PCADs) has been characterized in several extradiol aromatic compound degradation pathways. Structural studies have shown a relationship between PCADs and the more broadly distributed, functionally enigmatic Memo domain, which is linked to several human diseases. To better understand the evolution of this PCAD-Memo protein superfamily, L. Aravind and his team explored its structural and functional determinants to establish a unified evolutionary framework, identifying 15 clearly delineable families, including previously underappreciated diversity in five Memo clade families.
They placed the superfamily's origin within the greater radiation of the nucleoside phosphorylase/hydrolase-peptide/amidohydrolase fold, prior to the last universal common ancestor of all extant organisms. In addition to identifying active-site residues across the superfamily, they described three distinct, structurally variable regions emanating from the core scaffold, which often house conserved residues specific to individual families. These regions were predicted to contribute to the active-site pocket, potentially influencing substrate specificity and allosteric regulation. They also identified several previously undescribed conserved genome contexts, providing insight into potentially novel substrates in PCAD clade families. They extended the known conserved contextual associations for the Memo clade beyond the previously described associations with the AMMECR1 domain and a radical S-adenosylmethionine family domain. These observations point to two distinct yet potentially overlapping contexts in which the elusive molecular function of the Memo domain could finally be resolved, linking it to nucleotide base and aliphatic isoprenoid modification. In total, this report sheds light on the functions of large swaths of the experimentally uncharacterized PCAD-Memo families.

mRNAs are regulated by nucleotide modifications that influence their cellular fate. Two of the most abundant modified nucleotides are N6-methyladenosine (m6A), found within mRNAs, and N6,2'-O-dimethyladenosine (m6Am), which is found at the first transcribed nucleotide. Distinguishing these modifications in mapping studies has been difficult. L. Aravind identified PCIF1 as the methyltransferase that generates m6Am. Working with Dr. Eric Greer's group at Harvard University, he biochemically characterized PCIF1. They found that PCIF1 binds to, and is dependent on, the m7G cap. By depleting PCIF1, they generated transcriptome-wide maps that distinguish m6Am and m6A.
They showed that m6A and m6Am misannotations arise from mRNA isoforms with alternative transcription start sites (TSSs). This explains the biological significance of this RNA modification.