The long-term objectives of this research are to detect and interpret significant patterns of nonrandomness in sets of DNA nucleotide sequences and protein amino acid sequences. Patterns will be sought within single sequences and across related coding and/or noncoding sequences, whether from gene families in the same species or from the same gene in several species. Significance will be evaluated with probability theory where the system is tractable and by random permutation sampling where it is not. Detected patterns will be used to construct rigorous alignments of similar sequences. These alignments will in turn be used to identify patterns in nucleotide substitutions and their correlation with DNA structure and function; to find conserved "consensus" sequences relevant to the control of duplication, transcription or translation of DNA and to the function of the proteins they determine; and to construct phylogenetic trees. Since DNA determines the function of the whole organism as conditioned by the environment, a better understanding of its structure and function can provide clues to the treatment of genetic disease, of genetically conditioned predisposition to disease, and of diseases such as cancer that are suspected to originate in DNA malfunction.
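
As an illustration of the random permutation sampling mentioned above, the following is a minimal sketch and not the project's actual procedure. It assumes a simple test statistic (the overlapping count of a short motif) and a null model that shuffles the sequence while preserving its base composition; the function names, the motif, and the example sequence are all hypothetical.

```python
import random


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def permutation_p_value(seq, motif, n_perm=1000, seed=0):
    """Estimate how unusual the observed motif count is under shuffling.

    Assumed null model: the same letters in random order, so base
    composition is preserved but positional structure is destroyed.
    Returns (observed_count, one-sided empirical p-value).
    """
    rng = random.Random(seed)
    observed = count_overlapping(seq, motif)
    letters = list(seq)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(letters)
        if count_overlapping("".join(letters), motif) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    example = "CGCGCGCGCGATATATATAT"  # hypothetical toy sequence
    obs, p = permutation_p_value(example, "CG")
    print(f"observed count = {obs}, empirical p = {p:.4f}")
```

A small p-value here suggests that the motif occurs more often than base composition alone would explain. A realistic analysis would likely need a richer null model, for example one that also preserves dinucleotide frequencies, and a correction for testing many motifs.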