A feature frequency analysis program has been written that tabulates the frequency of each feature within groups of organisms, together with the uniqueness of each feature to each group. A pilot program was written to search a master list of codable items; the user then assembles and edits a file containing the subset of interest. The results demonstrated feasibility, and the full system will be written once the data base format has been optimized. The traditional algorithms for cluster analysis of microbial data require two large matrices to be held in core storage during program execution. On the DEC-10, this limits cluster analysis to fewer than 600 strains at a time. Further, costs rise exponentially with the number of strains. Modification of the existing program yielded a 90% cost reduction. Pre-sorting strains according to percentage of positive reactions does not change intra-cluster relationships. Mathematical modeling showed that a sorted data set can be partitioned into two overlapping segments, analyzed separately, and the results joined with little loss of meaning. This procedure reduces costs and allows analysis of larger data sets.
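
The sort, partition, and join procedure described above can be illustrated with a minimal Python sketch. It is not the program referred to in the abstract. The simple matching coefficient, the single-linkage rule at a fixed threshold, the overlap width, and every function name are assumptions chosen for illustration only. The point it shows is that pairwise comparisons are confined to each segment, and that clusters from the two segments are joined wherever they share a strain from the overlap.

```python
from itertools import combinations


def percent_positive(profile):
    """Percentage of positive (1) reactions in a binary test profile."""
    return 100.0 * sum(profile) / len(profile)


def simple_matching(a, b):
    """Fraction of tests on which two strains agree (assumed coefficient)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_segment(names, profiles, threshold):
    """Single-linkage clustering of one segment at a fixed threshold
    (an assumed stand-in for the clustering method actually used)."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    # Only pairs inside this segment are ever compared.
    for a, b in combinations(names, 2):
        if simple_matching(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


def join_clusters(first, second):
    """Merge clusters from two segments that share any strain."""
    merged = [set(c) for c in first]
    for cluster in second:
        cluster = set(cluster)
        remaining = []
        for m in merged:
            if m & cluster:
                cluster |= m  # absorb every cluster that overlaps
            else:
                remaining.append(m)
        remaining.append(cluster)
        merged = remaining
    return merged


def segmented_clusters(profiles, threshold=0.8, overlap=2):
    """Sort strains by % positive, split into two overlapping segments,
    cluster each separately, then join the results."""
    ordered = sorted(profiles, key=lambda n: percent_positive(profiles[n]))
    mid = len(ordered) // 2
    lower = ordered[: mid + overlap]
    upper = ordered[max(mid - overlap, 0):]
    return join_clusters(
        cluster_segment(lower, profiles, threshold),
        cluster_segment(upper, profiles, threshold),
    )
```

Calling `segmented_clusters` with a dictionary that maps strain names to lists of 0/1 test results returns a list of sets of strain names. With n strains and an overlap of k on each side of the midpoint, each segment holds about n/2 + k strains, so the number of pairs compared per segment is roughly a quarter of the number for the full data set.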