INTRODUCTION - Mammalian L1 (LINE-1) elements replicate (retrotranspose) by copying their RNA transcripts into genomic DNA. The ~6 kb human L1 element has four regions: the 5' untranslated region (UTR) has a regulatory function; open reading frame (ORF) 1 encodes an RNA-binding protein with nucleic acid chaperone activity; ORF2 encodes a DNA endonuclease and reverse transcriptase; and the 3' UTR contains a conserved G-rich polypurine motif. L1 replication produces mostly (~2/3) defective L1 copies, usually 5' truncated, that are retained in the genome and evolve as pseudogenes. L1 can also retrotranspose the RNA transcripts of short interspersed repeated DNA elements (SINEs) and the processed transcripts of nuclear genes. L1 activity has persisted in mammals since before their radiation 100 Myr ago and has generated ~40% of the mass of all mammalian genomes examined to date. L1 and SINE sequences can recombine and thereby cause genetic rearrangements, and upon insertion they can inactivate genes or alter their regulation. Although L1 activity has clearly had a profound effect on the structure and function of modern genomes, little is known about how L1 activity is regulated, to what extent L1 and its host interact, or even whether L1 activity affects the fitness of its host.
RECENT FINDINGS: THE CURRENTLY ACTIVE HUMAN L1 HAS REDUCED THE GENETIC FITNESS OF HUMANS - We showed earlier that the human-specific L1Hs family (also called L1Pa1, or Ta) evolved from the ancestral L1Pa2 family soon after the human/chimpanzee divergence, about 6 Myr ago. Ta then gave rise to the Ta1 subfamily ~1 Myr ago, which has since been the dominant L1 family in the human lineage. Ta1-generated insertions can cause genetic defects in present-day humans. However, these are relatively rare, and it was not known whether Ta1 inserts, or any other property of Ta1 elements, were deleterious enough to reduce the fitness of humans.
Natural selection against a segregating allele provides nearly irrefutable evidence that the allele is deleterious. We found that negative selection has acted on full-length Ta1 elements, but not on truncated ones or on the Ta1-generated SINE (Alu) insertions. Thus, one or more properties unique to full-length Ta1 elements constitute a genetic burden for modern humans. The full-length Ta1 elements became more deleterious as the expansion of Ta1 proceeded. Because this expansion is ongoing, the Ta1 subfamily almost certainly continues to decrease the fitness of modern humans.
HUMAN POPULATION GENETIC STRUCTURE AND DIVERSITY INFERRED FROM POLYMORPHIC L1 INSERTIONS - Because the human genome database is built from a small number of individuals, it cannot represent a complete census of Ta1 insertions. Therefore, we cloned the Ta1-containing loci from individuals of four ethnic groups: African pygmy, Caucasian Druze, Chinese, and Melanesian. We recovered 90% of the ~260 (haploid) Ta1 inserts in each of the four individuals; 40% of these are not present in the human genome database, and 93% of those absent from the database are polymorphic, compared with 51% for inserts in the database. Thus, we could add enough novel polymorphic L1 alleles to those already available from the human genome database to robustly analyze human population structure and origins using L1 insertions alone. We supplied the new L1-containing alleles to the Batzer and Jorde groups, who carried out the population analysis on 317 individuals from 21 populations in sub-Saharan Africa, East Asia, Europe, and the Indian subcontinent. They analyzed these data in parallel with a set of 100 polymorphic Alu insertion loci previously genotyped in the same individuals. The two data sets yield congruent results that support the recent African origin model of human ancestry. A genetic clustering algorithm detects clusters of individuals corresponding to continental regions.
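The per-individual Ta1 insert counts quoted above can be turned into approximate absolute numbers. The sketch below is only illustrative arithmetic: it assumes that each percentage applies to the group named immediately before it (90% of ~260, then 40% of the recovered inserts, then 93% of those absent from the database), and the function and key names are hypothetical, not part of the original analysis.

```python
# Illustrative arithmetic only; the nesting of the percentages is an assumption.

def ta1_insert_breakdown(total_inserts=260, recovered_frac=0.90,
                         novel_frac=0.40, novel_polymorphic_frac=0.93):
    """Return approximate per-individual Ta1 insert counts as floats."""
    # Inserts recovered by cloning, out of the ~260 haploid Ta1 inserts
    recovered = total_inserts * recovered_frac
    # Recovered inserts that are absent from the human genome database
    novel = recovered * novel_frac
    # Database-absent inserts that are polymorphic
    novel_polymorphic = novel * novel_polymorphic_frac
    return {
        "recovered": recovered,
        "novel": novel,
        "novel_polymorphic": novel_polymorphic,
    }

counts = ta1_insert_breakdown()
for name, value in counts.items():
    print(f"{name}: ~{value:.0f}")
```

Under these assumptions, each individual contributes on the order of 230 recovered inserts, of which roughly 90 are absent from the database and most of those are polymorphic. These are rounded estimates derived from approximate percentages, not reported counts.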
The number of loci sampled is critical: with fewer than 50 typical loci, structure cannot be reliably discerned in these populations. Including geographically intermediate populations (from India) reduces the distinctness of the clusters. The results indicate that human genetic variation is neither perfectly correlated with geographic distance (purely clinal) nor independent of distance (purely clustered), but a combination of both, i.e., stepped clinal.
THE FUNCTIONAL AND STRUCTURAL CONSEQUENCES OF THE EVOLUTIONARY CHANGES IN ORF1P - We are correlating the effects of evolutionary changes in ORF1p with (A) its function in a cell culture retrotransposition assay and (B) its biochemical properties, i.e., RNA binding, multimer formation, and nucleic acid chaperone activity, each of which is essential for retrotransposition. To do this, we constructed an ancestral version of ORF1p (L1Pa5) that predates the evolutionary changes of the modern version (L1Pa1), as well as mosaic ORF1p proteins containing modern and ancestral regions. Because the structures of these proteins will be of considerable interest, especially as ORF1p does not belong to any known protein family, we are collaborating with Fred Dyda and Allison Hickman, structural biologists in NIDDK, to determine them.
INTERACTION BETWEEN L1 AND ITS HOST: (a) RNA transport factors - Others showed that the nuclear RNA export factor (NXF1, or Tap1), a host protein that mediates nuclear export of non-spliced RNAs, a class that would include L1 and retroviral RNAs, binds to L1 RNA. The dominance of L1 elements over all other retrotransposons could result from L1 pre-empting NXF1. We are therefore now trying to confirm and extend these important and provocative findings.
(b) L1 ORF1 protein (ORF1p) - We previously showed that the coiled-coil motif of ORF1p underwent episodes of adaptive evolution early in hominid evolution. Adaptive evolution often implies an interacting system (e.g., a virus and its host).
Therefore, we used ORF1p as bait in a yeast two-hybrid screen for interacting mammalian proteins. We selected nine proteins from our initial screen for further study. Seven of them did not interact with the ancestral (i.e., pre-adapted) version of the ORF1p coiled-coil domain. Seven of the nine proteins either contain RNA-binding motifs or are known to bind RNA, the primary substrate for L1-mediated retrotransposition. We are now confirming these results with a variety of in vivo and in vitro techniques.