One of the most common ways of studying a vertebrate immune system is to statistically compare populations of antigen receptors (either immunoglobulins or T-cell receptors, TCRs) derived from different tissues under various experimental or clinical conditions (for instance, when monitoring chemotherapy patients). The problem is difficult and does not fit readily into standard statistical frameworks because of (i) the extremely diverse antigen receptor repertoires maintained by the immune system and (ii) technological limitations on data collection. In particular, when applied to antigen receptor studies, the traditional statistical methods of species richness and diversity inference, such as those used in ecology, often seriously underestimate the true richness and diversity of TCR repertoires. This contributes to the relatively poor understanding of the biological traits of such repertoires, despite great advances in modern molecular technology for TCR data collection. The proposed research project is an interdisciplinary undertaking by a team of researchers with backgrounds in applied mathematics, statistics, bioinformatics, and experimental immunology. The project's goal is to (i) systematically review the existing statistical methods for analyzing antigen receptor data and (ii) propose new, more efficient ones. In broad terms, an antigen receptor dataset may be characterized as a k-way table of n observations, with many cells of low counts and with an unknown total number of cells (population richness). To analyze such tables, we propose to develop a comprehensive approach applicable to data obtained from standard biological assays, such as flow cytometry, spectratyping, and DNA sequencing, under hierarchical multinomial and Poisson models for count data. The newly proposed methods will be evaluated against traditional ones using simulations as well as data from cancer studies in TCR-min mice, which have a specially limited TCR repertoire.
The statistical methodology that proves most successful will be implemented in public-domain software to be made available through the CRAN and caBIG archives.
PUBLIC HEALTH RELEVANCE: The proposed research will develop analytical and computational tools for analyzing antigen receptor data. Proper analysis of such data is one of the fundamental issues in studying vertebrate immune responses, and the methods developed in this proposal will therefore have broad applications to immunological studies in general. The tools developed in this proposal will help us better understand the nature and various functions of T-cells, which will lead to the development of more effective approaches to immunotherapy and cancer treatment.