Gene expression measurement using cDNA and oligo arrays continues to be a popular and useful technology for genomic analysis. High-throughput methods for measuring protein concentrations are also increasing in popularity. One of the more challenging problems results from the large volume of data generated in these experiments. Image capture, processing, interpretation and quantification remain important fundamental issues. Quality control and experimental design must be carefully addressed. Many difficult statistical, image processing and bioinformatics issues remain and are addressed in this project.

The availability of affordable, high-quality software has been one of the bottlenecks in the analysis of microarray data. We have continued development of the "MSCL Analyst's Toolbox" to address this need. Built upon the NIH-licensed commercial statistical package JMP, this toolbox allows investigators to download Affymetrix microarray data from a central database, normalize and transform the data, inspect it for a variety of outliers or defects, perform a variety of statistical tests to select relevant genes affected in the experiment, and then visualize and classify various patterns of gene expression. Because the Toolbox is written in open-source scripts, its statistical tests can be modified as needed to conform to novel or unique experimental designs. In collaboration with over thirty investigators in CC, NHLBI, NIDCR and other ICs, this tool has been applied to several dozen microarray studies. One-day and two-day Toolbox training workshops were presented on the NIH campus. A two-day JMP scripting class was sponsored for NIH users.

In a major NIH-wide project, we maintain NIHLIMS, a database for storage, retrieval and analysis of Affymetrix oligo-based microarrays.
As part of this collaboration, we are creating a data analysis pipeline and bioinformatics toolset, including both commercial and freely available software. The database currently stores information from over 2000 microarrays. Our downloadable toolset (MSCL Analyst's Toolbox) is now mature, widely tested and applied in numerous studies. Working with investigators in NCI, CC, NHLBI, NINDS, NIAID, NHGRI, NICHD, NIA, NIDDK and NIDA, we have developed, customized and applied this software for the analysis of microarray-based studies. We also maintain a quarterly-updated set of annotation files for use with Affymetrix data, in a format for convenient download and use by our collaborators.

For several years, our group has functioned as the "analysis core" for a high-volume microarray laboratory in CCMD/CC. All microarray studies by this group now pass through our analysis pipeline. Recently, we became the analysis core for the microarray core facility of NHLBI, roughly doubling the throughput of microarray studies into our database and pipeline.

In a series of studies with investigators in NIDCR, we are currently analyzing gene expression in human monocytes before and after differentiation into dendritic cells, under stimulation by lipopolysaccharides derived from bacteria of interest to dental research. The goal is to reveal any basic differences in host response to different organisms, which may be useful diagnostically or therapeutically.

In an ongoing proteomics initiative, we collaborated with investigators in CCMD/CC in the analysis of mass spectrometry (SELDI) data to identify potential biomarkers of biological processes, diseases or syndromes. In one study of AIDS patients, a potential biomarker of Pneumocystis (PC) pneumonia was identified in SELDI spectra of bronchoalveolar lavage fluid samples.
In an independent follow-up study, the biomarker successfully predicted PC status with about 85% accuracy. We have improved our software for applying statistical tests to entire sets of mass spectra, to determine objectively which peaks yield discriminatory information. We have also recently incorporated modern statistical algorithms, including Random Forests, to ascertain whether combinations of peaks could improve diagnostic discrimination over single peaks.
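
The idea of testing every peak across a set of spectra can be illustrated with a minimal Python sketch. It is not the project's software (which is built on JMP scripts). The choice of Welch's t-statistic, the function names `welch_t` and `rank_peaks`, the assumption that peaks are already aligned across samples, and the toy intensities are all hypothetical and chosen only for illustration. Multi-peak methods such as Random Forests are not shown.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances).

    Assumes each group has at least two values and nonzero pooled spread.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_peaks(spectra, labels):
    """Score every peak by a two-group test and sort by |t|, largest first.

    spectra: list of per-sample intensity lists, one value per aligned peak
    labels:  list of booleans, True for the positive group
    """
    n_peaks = len(spectra[0])
    scores = []
    for j in range(n_peaks):
        pos = [s[j] for s, y in zip(spectra, labels) if y]
        neg = [s[j] for s, y in zip(spectra, labels) if not y]
        scores.append((j, welch_t(pos, neg)))
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Made-up intensities: 6 samples x 3 peaks; only peak 0 separates the groups.
spectra = [
    [10.0, 5.0, 3.0],
    [12.0, 5.5, 2.5],
    [11.0, 4.5, 3.5],
    [20.0, 5.2, 3.1],
    [22.0, 4.8, 2.9],
    [21.0, 5.1, 3.2],
]
labels = [False, False, False, True, True, True]

ranked = rank_peaks(spectra, labels)
for peak_index, t_stat in ranked:
    print(f"peak {peak_index}: t = {t_stat:.2f}")
```

In this toy data the first peak receives by far the largest statistic. With real spectra containing many peaks, the per-peak results would also need an adjustment for multiple comparisons before any peak is called discriminatory.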