Only 3% of the human genome is expected to code for proteins. The remaining 97% has been called "junk DNA". As highlighted in research news articles, several recent results indicate that there may be hidden treasure in this "junk". The highlighted findings include the discovery of regulatory signals in minisatellites, novel 3' untranslated region (3'UTR) RNA binding motifs in C. elegans lin-14 genes, and novel RNA secondary structure regulatory motifs in the 5'UTRs of tRNA synthetase genes of Gram-positive bacteria. The novelty of these findings leaves open two important questions. Are these "grotesque deviants" or "first emissaries"? And if they are emissaries, how can their world be discovered? If these recent findings are emissaries, then their identification and characterization will have a major impact on the next phase of the Human Genome Project and on the numerous health benefits expected to be derived from it. We recently described a novel method for the detection of subtle sequence signals and its application to protein sequence alignment (Lawrence, et al., 1993). The strength of this method rests on sampling models based on the physicochemical characteristics of macromolecules and complexes. We have previously applied the predecessor of this method to the identification of gene regulation elements, but we have only begun to exploit its full potential. The main goal of this research is to adapt these methods for the identification and characterization of novel sequence signals in the non-coding regions of genomes. Specifically, we plan to adapt these methods through three developments: 1) sampling models that focus on characteristics of DNA interactions in complex contexts; 2) sampling models based on the energetics of RNA/RNA interaction; and 3) a set of universally applicable enhancements.
To achieve these ends, we will develop and distribute a software system for the identification and characterization of subtle sequence signals in non-coding regions of genomes.