The composition of the immunoglobulin repertoire has been studied for a long time, but progress has been greatly limited by the conventional protocol of measuring rearrangement frequency by PCR, cloning and sequencing of individual rearrangements. Of necessity, therefore, the database from which conclusions were drawn was very restricted, and in fact we know very little about the rearrangement pattern of Vh genes throughout the locus in pro-B cells. The recent advent of next generation sequencing (deep sequencing, massively parallel high throughput sequencing) technology provides the unprecedented ability to rapidly obtain millions of sequences in a single experiment. Analysis of the sequences is then done bioinformatically. Using this new technology, we are now in a position to perform deep sequencing of the entire IgH repertoire in pro-B cells, and thus, for the first time, to accurately determine the relative usage of each individual V, D and J gene in the initial repertoire. We will optimize the conditions for next generation sequencing using cDNA and DNA on the Roche 454 Genome Sequencer FLX system. The extent of non-randomness in many aspects of the generation of the primary Ig repertoire can be elucidated with the millions of sequences that we will obtain. Since the C57BL/6 genome is now completely sequenced, we know the precise location of all V genes within the loci, as well as the sequences of all of their flanking DNA, including RSSs and promoters. We will determine which Vh genes are overutilized and which are underutilized in this primary repertoire, and importantly, we will use this information to elucidate the factors that influence unequal V gene usage and control accessibility for rearrangement. We hypothesize that we may find regions in which groups of neighboring V genes all rearrange at higher, or all at lower, frequencies than average.
If so, the relative location of those V genes in the 3-dimensional structure of the locus during the compaction and looping that takes place during rearrangement could enhance or inhibit rearrangement, depending on whether the V genes are closer to or further from the base of the loops that are created during locus compaction. Therefore, we will compare rearrangement frequencies to the locations of CTCF/cohesin sites. Alternatively, or in addition, such hot spots and cold spots could be the result of epigenetic regulation. Both of these hypotheses will be explored. ChIP-seq is beginning to be done for transcription factors, and we will obtain some ChIP-seq data for transcription factors and epigenetic modifications in Aim 2. We will compare these global data on transcription factor binding and the epigenetic landscape to the global data on Vh gene usage to determine whether frequently rearranging Vh genes have certain transcription factors or architectural proteins bound nearby, or certain epigenetic profiles. Through this analysis of binding sites for transcription factors that may influence accessibility, and therefore rearrangement frequency, we will gain novel insights into the mechanisms controlling accessibility of different portions of the Vh locus to rearrangement.
PUBLIC HEALTH RELEVANCE: Development of the optimal protocols and the bioinformatic tools to analyze VDJ sequences from high throughput sequencing platforms will be of general use to all investigators interested in any area of repertoire analysis, whether of immunoglobulin or TCR.
Importantly, once analyses have been made of the normal repertoire, this technology can be expanded to examine potential perturbations of the repertoire in disease states such as autoimmunity (e.g., lupus, rheumatoid arthritis, diabetes), or to follow the fate of certain clonotypes (identified by CDR3 and V gene usage) following immunization or infection with a variety of pathogens/pathogenic antigens. Furthermore, misregulation of V(D)J rearrangement can result in translocations leading to lymphomas and leukemias, and so the data obtained through this next generation sequencing will permit us to more fully understand the tight regulation of the V(D)J recombination process.