The broad, long-term objective of this proposal is to develop user-friendly, intelligent software that will enable biologists to analyze their multivariate flow cytometry data more efficiently and more effectively. Flow cytometers can now collect eight or more measurements per cell, and biochemists and cell biologists have devised markers to take advantage of this multiparameter capability. Software that lets biologists analyze these multivariate data easily is only now beginning to be developed. This proposal addresses the needs of investigators doing basic cell biology work and of clinicians needing rapid and thorough analyses of their multivariate flow cytometry data. Specific Aim 1 is focused on the development of device-independent software for interactive, exploratory analysis of multivariate flow cytometry data. The software will use a mouse-and-menu user interface so that the user will not have to read bulky manuals. The program will include two- and three-dimensional dynamic color graphics. It will incorporate a statistical expert system called GRAPH HELPER to assist the user in choosing the best set of displays to examine. In Specific Aim 2, an expert system will be developed that incorporates the knowledge of a statistician and that of a biologist to assist in the automatic processing of many multivariate flow cytometry datasets. This ANALYST expert system will contain separate knowledge bases for statistical expertise and biological expertise. The statistician knowledge base will include rules and facts about hypothesis testing, cluster analysis, dimensionality reduction, probability modeling, and two- and three-dimensional graphical display techniques. The biologist knowledge base will at first include rules and facts governing human differentiation antigen clusters and the fluorescent monoclonal antibodies used to detect them. It will later include knowledge about other specific biological systems.
Specific Aim 3 is focused on the investigation of several approaches to partitioning multivariate flow cytometry data. Parametric clustering schemes using the K-means algorithm and the Mahalanobis distance metric will be investigated, along with several nonparametric approaches. Various techniques will be examined for efficiently initializing the clustering process and for determining the correct number of clusters. Classification and Regression Tree (CART) algorithms will be investigated for partitioning multivariate flow cytometry data after an initial clustering. Clustering methods will be developed to run on the LANL Connection Machine so that data from large numbers of cells can be included in a cluster analysis during an interactive session with the user.
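
As an illustration of the parametric clustering named in Specific Aim 3, the following is a minimal sketch of K-means in which each point is assigned to the cluster with the smallest Mahalanobis distance. It is not the proposal's implementation. Everything in it is an assumption made for illustration: the function names, the restriction to two measurements per cell (so the covariance matrix can be inverted in closed form), the small ridge term added to the variances, and the choice to start with identity covariances so that the first pass is ordinary Euclidean K-means.

```python
import random


def mean2(points):
    """Centroid of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def inv_cov2(points, center, ridge=1e-6):
    """Inverse of the 2x2 covariance about `center`.

    Returned as (a, b, c), standing for the symmetric matrix [[a, b], [b, c]].
    The ridge (an assumption of this sketch) keeps the matrix invertible.
    """
    n = len(points)
    sxx = sum((p[0] - center[0]) ** 2 for p in points) / n + ridge
    syy = sum((p[1] - center[1]) ** 2 for p in points) / n + ridge
    sxy = sum((p[0] - center[0]) * (p[1] - center[1]) for p in points) / n
    det = sxx * syy - sxy * sxy
    return (syy / det, -sxy / det, sxx / det)


def mahalanobis_sq(p, center, inv):
    """Squared Mahalanobis distance from p to center under inverse covariance inv."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    a, b, c = inv
    return a * dx * dx + 2 * b * dx * dy + c * dy * dy


def kmeans_mahalanobis(points, k, iterations=20, init=None, seed=0):
    """Cluster 2-D points; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = list(init) if init is not None else rng.sample(points, k)
    invs = [(1.0, 0.0, 1.0)] * k  # identity: the first pass is Euclidean
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest cluster in Mahalanobis distance.
        labels = [
            min(range(k), key=lambda j: mahalanobis_sq(p, centers[j], invs[j]))
            for p in points
        ]
        # Update step: re-estimate each cluster's mean and covariance.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if len(members) < 2:
                continue  # keep old parameters for empty or single-point clusters
            centers[j] = mean2(members)
            invs[j] = inv_cov2(members, centers[j])
    return labels, centers
```

Calling `kmeans_mahalanobis(points, k=2)` on a list of `(x, y)` pairs returns one label per point and the final cluster centers. The optional `init` argument accepts starting centers, which is where a scheme for efficiently starting the clustering process would plug in. Because this plain Mahalanobis assignment does not penalize clusters with large covariance, a broad cluster can absorb its neighbors when clusters overlap, so the starting centers matter.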