The proposed Clinical Cytometry Analysis Software Project described in this grant application is designed to create a new, more efficient and effective way of analyzing cells for the presence of cancer, HIV/AIDS and other disease, using a fully automated software system. Using modern data mining techniques (pattern recognition, feature recognition, image analysis) we will design software which will analyze data (the cell samples from patients) at a much faster rate and with fewer false positives and negatives than the manual method now in use. Objectives: Assemble and validate algorithms in software that can automatically classify regions of interest in flow cytometry data. We will demonstrate that the particular populations required by our use cases can be validly, rigorously and repeatably identified automatically. Develop and validate graphical and statistical results that satisfy FDA requirements for medical device software, simplify regulatory compliance by the clinical user, and automatically deliver analysis results to diagnostic expert systems and/or LIMS systems. Satisfy the translational medicine goals outlined in the NIH Roadmap. This software will bring the clinician streamlined testing currently only available in research labs. Methods: Four use cases have been selected, one employing synthetic data and three clinical data; Leukemia/Lymphoma test, Analysis of longitudinal Graft vs. Host Disease (GvHD) in bone marrow transplant specimens for predictive markers and HIV/AIDS - Gag-specific T cell cytokine response profile assay. For each we have access to a substantial body of existing data, analyzed by experts. Beginning with the autogating routines in our own FlowJo software, we will test and expand the application of Magnetic gating, Probability Clustering, Subtractive Cluster Analysis, Artificial Neural and Support Vector Machines (SVM). Using a sampling of human operators to establish a control range, we will test each of these five techniques against the four use cases in cooperation with our collaborators. Events in the manually classified samples are given a weighted score based on the frequency with which they are included by all the operators. A single operator's score or the gating algorithm's score is compared with the cumulative score of the expert group and a match rating is computed. Additional validation techniques include combinatory validation on internal measures with respect to Pareto optimality, and Predictive Power/Stability self consistency checks using resampled or perturbed data measured with external indices such as the adjusted Rand index and the Variation of Information index. PUBLIC HEALTH RELEVANCE: By eliminating the operator's time, we estimate that the cost of clinical flow cytometry analysis can be reduced to half the current figure while delivering the results much faster. By eliminating the subjectivity and human error of manually created regions and reducing the range of variability of the so created, there would result fewer false positives and false negatives, improving the clinical outcome for those patients needing therapy but undetected by current methods. An order of magnitude increase in speed means faster therapeutic intervention. A less expensive test improves outcome by making the test accessible to more patients. [unreadable] [unreadable] [unreadable]