DESCRIPTION: This application proposes to use several cryptographic methods to decipher the information contained in the nucleotide sequences of the human genome. The specific aims of the proposal are to: (1) define the statement syntax of the language of gene regulation; (2) compile a dictionary for the genomic language; and (3) compile a thesaurus for the genomic language. Words are not used randomly in a language. Cryptographers and linguists exploit this nonrandom word use to break codes or understand dead languages by defining the lexicon of the code or the language. This application will define the nucleotide sequences that are important for transcription and translation of genes by assembling a lexicon of nucleotide sequences that are over- or under-represented in the genome. The application will also define a thesaurus for the genomic language using traffic analysis. A thesaurus provides information about the context of words and the relationships among words and their meanings. Once important nucleotide sequences are added to the genome lexicon, this research will use traffic analysis to infer the meaning of nucleotide sequences from the context in which the sequences are used in different genes. This analysis relies on a high degree of redundancy and conservation in the use of nucleotide sequences. The cryptographic analysis will be applied to seven different viral genomes: Ad2, SV40, HPV18, MVM, EBV, HSV1, and CMV. The genes within the viral genomes fall into four classes based on their time of expression. The use of nucleotide sequences within the four time classes will be analyzed for common sequence motifs that help to define the lexicon and thesaurus used by the viral genomes. Two specific questions will be addressed under each specific aim of the proposal: (1) What is the chronological order in which sequence enhancers function throughout the viral lytic cycle, which would define a chronological order of gene expression? and (2) Given a generic statement syntax for gene expression, how does that syntax vary as genes are expressed at different times during a given viral lytic cycle?
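
The lexicon-building step described above (collecting nucleotide sequences that are over- or under-represented) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposal's method: the function names, the fixed word length `k`, the null model (bases drawn independently at their observed frequencies), and the `over`/`under` thresholds are all hypothetical choices made for the example.

```python
from collections import Counter
from math import prod
from typing import Dict


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (one strand only)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def representation_ratios(seq: str, k: int) -> Dict[str, float]:
    """Return observed/expected count for each k-mer that occurs in seq.

    Assumption: the expected count comes from a simple null model in
    which bases are independent and occur at their observed frequencies.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return {}
    base_freq = {b: c / len(seq) for b, c in Counter(seq).items()}
    ratios = {}
    for kmer, observed in kmer_counts(seq, k).items():
        expected = n_windows * prod(base_freq[b] for b in kmer)
        ratios[kmer] = observed / expected
    return ratios


def lexicon(seq: str, k: int, over: float = 2.0, under: float = 0.5) -> Dict[str, float]:
    """Keep k-mers whose ratio is >= over or <= under (hypothetical cutoffs).

    Note: k-mers that never occur are not reported by this sketch, so the
    most strongly under-represented words would need separate handling.
    """
    return {
        kmer: ratio
        for kmer, ratio in representation_ratios(seq, k).items()
        if ratio >= over or ratio <= under
    }


if __name__ == "__main__":
    toy_genome = "ACGT" * 5  # toy input, not real viral sequence
    for kmer, ratio in sorted(lexicon(toy_genome, 2).items()):
        print(kmer, round(ratio, 2))
```

In practice, the same ratios could be computed separately for the genes in each of the four temporal classes, so that words enriched in one class but not in the others stand out. A real analysis would likely also need both strands, a range of word lengths, and a significance test rather than fixed cutoffs.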