The purpose of this project is the application of information theory to basic and clinical research on the relationships between DNA and protein sequences. We have collaborated with Dr. T. Schneider of NCI and others to perform DNA splice site analyses. This includes collaborative development of processing algorithms for the information content of macromolecular sequences. It also involves communication of data, processing methods, and results among researchers in diverse fields.

An effort to analyze mutations associated with Duchenne muscular dystrophy sequences continues with Dr. Therese Tuohy of the Department of Human Genetics of the University of Utah.

Splice site analyses of some splice variants in the ASPM gene were done for Dr. Vladimir Larionov (NCI/CCR) with Dr. Barry Zeeberg (NCI/CCR). This gene has two well-accepted alleles, and possibly a fairly large number of others that have not been reported. Analysis showed that splicing events almost always occurred at sites predicted to be strong to moderate by our local analysis. Although this gene is rich in potential sites, determining the sites of likely splicing events is well within our capabilities. However, choosing the most likely sets of exons and calculating the probabilities of alleles has been much more difficult. We are trying to determine the probabilities using slightly more global constraints, as discussed in the companion report, "Information Analysis of DNA, RNA, and Protein Sequences."

We were asked to analyze a DNA sequence, carrying an exonic mutation of the SLC26A4 gene, that was suspected to be involved in splicing. According to the standard dinucleotide models, this mutation should have no effect on splicing. The specified mutation had no effect on splicing according to our model either. However, the sequence supplied differed slightly from the GenBank standard for this gene in the preceding intron.
This difference amounted to a mutation that could change a strong acceptor site into a complicated combination of competing acceptors. If this turns out to be true, it may be an excellent example of the problem of assuming that the important mutations are in the exons, which are regarded as directly affecting the generated proteins, when in fact the more important mutations affecting splicing lie in the introns.
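
As an illustration only, the kind of local, information-based scoring of a candidate splice site mentioned above can be sketched in a few lines of Python. This report does not give the actual model, so the sketch assumes the commonly used individual-information form, in which each base b at position l of a site contributes 2 + log2 f(b, l) bits, where f(b, l) is the frequency of that base at that position in a set of aligned known sites. The aligned sequences, the function names, and the pseudocount (used here in place of a proper small-sample correction) are all hypothetical and are not data or code from this project.

```python
import math
from collections import Counter

BASES = "ACGT"

def weight_matrix(aligned_sites, pseudocount=1.0):
    """Build per-position weights 2 + log2 f(b, l) from aligned sites.

    The pseudocount is an assumption of this sketch; it keeps unseen
    bases from producing log2(0).
    """
    length = len(aligned_sites[0])
    total = len(aligned_sites) + pseudocount * len(BASES)
    matrix = []
    for pos in range(length):
        counts = Counter(site[pos] for site in aligned_sites)
        matrix.append({
            base: 2.0 + math.log2((counts[base] + pseudocount) / total)
            for base in BASES
        })
    return matrix

def individual_information(matrix, site):
    """Sum the per-position weights for one candidate site, in bits."""
    return sum(column[base] for column, base in zip(matrix, site))

# Hypothetical aligned acceptor-like sequences (toy data, not real sites).
toy_sites = ["TTTCAGG", "CTTCAGG", "TCTCAGA", "TTCCAGG", "CCTTAGG", "TTTTAGA"]
toy_matrix = weight_matrix(toy_sites)

wild_type = "TTTCAGG"
mutant = "TTTCACG"  # hypothetical single-base change in the conserved AG

wt_bits = individual_information(toy_matrix, wild_type)
mut_bits = individual_information(toy_matrix, mutant)
print(f"wild type: {wt_bits:.2f} bits, mutant: {mut_bits:.2f} bits")
```

In this toy setting a single base change lowers the score of the site. Scoring every position in a window around a mutation in the same way would show whether a change weakens an existing site or creates competing ones, which is the kind of effect described above for the intron preceding the SLC26A4 exon.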