PROJECT SUMMARY
The immune system is directly or indirectly involved in many aspects of human health and disease. However, methods to accurately determine the specific molecular targets of human immune responses are lacking. We have pioneered the use of Phage ImmunoPrecipitation Sequencing ("PhIP-Seq"), a massively multiplexed antibody profiling technology that uses libraries of bacteriophage-displayed peptides. These peptides are encoded by libraries of long, high-quality synthetic DNA oligonucleotides, and PhIP-Seq experiments are read out by high-throughput DNA sequencing. Favorable features of the technology, including sample throughput and per-sample cost, uniquely position PhIP-Seq to become an indispensable tool for driving future biomedical discoveries. The types of libraries that can be encoded using synthetic DNA are limited by our current design approach. For example, we have encoded the human proteome and the human virome as ~250K and ~100K peptide libraries, respectively. These libraries can be used to study autoantibody responses or the role of viral infection in complex diseases. Much larger collections of proteins, however, cannot currently be encoded because of cost constraints. Aim 1 of this project is devoted to an innovative "k-mer" based design strategy that will enable representation of more complex protein spaces, such as the collective proteome of the human gut microbiota. PhIP-Seq produces a unique type of data, which cannot be properly analyzed using previously developed or repurposed software. In Aim 2 of this project, we seek to develop methods and software based on modern approaches in statistical sampling theory, including Empirical and Fully Bayesian approaches, for the detection of antibody-peptide binding interactions. In addition, we propose to develop a critical set of experimental annotation standards that will help to ensure that findings associated with PhIP-Seq studies are reproducible.
The most commonly employed PhIP-Seq experimental designs involve longitudinal and/or group-wise comparisons. In Aim 3, we propose to develop open source Bioconductor and "Shiny App" software packages that implement typical analytical pipelines, which non-programmers can adapt to the analysis of their specific experiments. These pipelines will provide epitope-level analyses and, importantly, will account for antibody cross-reactivity among similar protein sequences. Three PhIP-Seq studies will be performed to illustrate the new design and analysis software tools: a study of type 1 diabetes, a study of inflammatory bowel disease, and a study of Alzheimer's disease. The resulting data will be made available to the community for re-analysis and data exploration.