PROJECT SUMMARY/ABSTRACT
The ability of our immune system to respond effectively to pathogenic challenge or vaccination depends on a diverse repertoire of immunoglobulin (Ig) receptors expressed by B lymphocytes. Each B cell receptor (BCR) is unique, having been assembled during lymphocyte development by recombination of germline-encoded V(D)J genes. During the course of an immune response, B cells that initially bind antigen with low affinity through their BCR are modified through cycles of somatic hypermutation (SHM) and affinity-dependent selection to produce high-affinity memory and plasma cells. This affinity maturation is a critical component of T cell-dependent adaptive immune responses. It helps guard against rapidly mutating pathogens and forms the basis for many vaccines, but dysregulation can result in autoimmunity and other diseases. Next-generation sequencing (NGS) technologies have revolutionized our ability to carry out large-scale adaptive immune receptor repertoire sequencing (AIRR-Seq) experiments. AIRR-Seq is increasingly being applied to profile BCR repertoires and gain insights into immune responses in healthy individuals and in those with a range of conditions, including autoimmunity, infection, allergy, cancer and aging. As NGS technologies improve, these experiments are producing ever larger datasets, with tens to hundreds of millions of BCR sequences. Although promising, repertoire-scale data present fundamental challenges for analysis, requiring the development of new techniques and the rethinking of existing methods that do not scale to the large number of sequences being generated. This proposal describes the development of a series of novel computational methods to explore the central hypothesis that B cell clonal relationships and lineage structures can be computationally derived from repertoire sequencing data and used to define B cell migration and differentiation networks in health and disease.
Specifically, computational methods will be developed to: (Aim 1) identify clonally-related sequences and improve V(D)J gene assignment through determining the Ig locus haplotype, (Aim 2) reconstruct clonal lineages, and use these to learn B cell migration and differentiation networks, and (Aim 3) analyze sequences to predict repertoire properties and sequence motifs that are associated with antigen binding or clinically-relevant outcomes. These methods will be validated through a combination of simulation-based studies, as well as testing on new experimental data from both human (myasthenia gravis) and murine (endogenous retrovirus emergence) systems. All methods will be integrated and made available through our widely-used, open-source Immcantation framework, which provides a start-to-finish analytical ecosystem for AIRR-Seq analysis. Together, these methods provide a window into the micro-evolutionary dynamics that drive adaptive immunity and the dysregulation that occurs in disease.