Progress in understanding the organization and functioning of the human genome has been blocked by the lack of functional assays for sequences that make up a large fraction of the nuclear DNA. A long-term objective of these studies is to reveal the function of the long, interspersed, transcriptionally highly active KpnI sequences and of the highly repeated alphoid DNA sequences, and their effects on gene expression, in normal and malignant human cells. Approaches and methods have now been developed for such studies. The KpnI family appears to be a vast family of pseudogenes or reverse transcripts of a polymerase II gene that encodes polypeptides and has been conserved in vertebrate evolution. Thousands of copies are present in the human genome, often at strategic locations near human genes and human endogenous retroviruses. The 6 kilobase consensus sequence contains long open reading frames capable of encoding hundreds of amino acids. To identify the apparently important protein product(s), three of these reading frames will be inserted into E. coli high-level gene expression vectors. The purified polypeptides will be used to raise polyclonal and monoclonal antibodies, which can then be used to identify, localize, and characterize the product of this family in normal and cancerous cells. Highly repeated alphoid sequences of human and other primate genomes are capable of suppressing expression from gene expression vectors to which they are linked. The suppression depends on repeat length, orientation, and position with respect to the gene coding sequences. The molecular mechanisms accounting for this suppression will be elucidated by studies of RNA transcription and processing, vector topology, methylation, and intranuclear chromatin packaging of gene expression constructs.
Among the variables to be analyzed are two strong eukaryotic promoter-enhancers (SV40 and HTLV-1), different repeated sequences (alphoid DNAs, satellite DNAs, and KpnI families), and different genes (CAT and beta-globin). The hypothesis that the function of highly repeated sequences is the suppression of homologous recombination will be tested using plasmid constructs containing highly repeated sequences linked to deletion mutant genes that confer antibiotic resistance on mammalian cells. A methodology has been devised by which the effects of the repeated sequences on recombination events can be detected and scored before and after chromosomal integration in mammalian cells.