cDNA microarrays now make it possible to measure, rapidly and efficiently, the transcript levels of virtually all genes expressed in a biological sample, and they have become a widely used tool. However, transforming these data into meaningful biological insight is impeded both by the level of noise in the images from which the data are extracted and by the size and complexity of the data. We propose to develop S+cDNA, a toolkit on the S-PLUS platform for analyzing gene expression data, that addresses both problems. We will implement a novel statistical model that takes explicit account of the major non-biological sources of variability in the measurements. We will also develop software for gene shaving, a newly developed algorithm designed specifically for the analysis of cDNA array data that identifies groups of genes with similar behavior. Combined with the functionality already available in S-PLUS, S+cDNA will provide a powerful toolkit for analyzing gene expression data. PROPOSED COMMERCIAL APPLICATIONS: Implemented as an add-on module on the S-PLUS platform, S+cDNA will be a powerful toolkit for analyzing gene expression data. As such, it will produce revenue streams in all areas of biomedical research, including universities, research institutions, and pharmaceutical and biotech companies.