The human antibody repertoire is a rich source of diagnostic information and has yielded hundreds of antibody-detecting tests for infectious and autoimmune diseases. Advances in sequencing technology and bioinformatics have now created an opportunity to analyze entire human antibody repertoires and thereby create new and improved diagnostic tests. Although antibody repertoires are recognized to contain substantial untapped information, tools to analyze and mine them are lacking. The objective of this project is to build and validate a reference database for the archival and analysis of the large datasets that arise from applying next-generation sequencing (NGS) to peptide display libraries enriched for binding to the entire serum antibody repertoire. Hundreds of specimens from clinically characterized individuals with autoimmune and oncological diseases will be analyzed, and the resulting datasets will be collated into a human antibody specificity database. The database will archive the millions of antibody-binding peptides identified for each specimen, together with the specimen's clinical information. Data will be displayed as customizable histogram plots of the fold enrichment of any peptide motif or pattern across all specimens in the database, or across user-defined subsets with distinct clinical characteristics. The database will enable de novo discovery of motifs specific to any group of specimens, as well as statistical optimization of known degenerate motifs. Motifs exhibiting specificity for particular diseases will be identified and experimentally validated. When possible, motifs will be associated with putative environmental organisms and antigens. This project will establish a foundation for the development of NGS-based serological tests as a future tool for precision medicine.
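The per-specimen fold enrichment of a motif, as described above, could be computed along the following lines. This is a minimal sketch, not the project's method: the abstract does not define fold enrichment, so the code assumes it is the fraction of a specimen's peptides matching the motif divided by the same fraction in a baseline peptide set (for example, the unenriched library). The function names, the `pseudocount` parameter, the use of regular expressions to express degenerate motifs, and the toy peptides are all hypothetical.

```python
import re
from typing import Dict, List


def motif_frequency(peptides: List[str], pattern: str) -> float:
    """Return the fraction of peptides containing at least one motif match."""
    if not peptides:
        return 0.0
    regex = re.compile(pattern)
    hits = sum(1 for peptide in peptides if regex.search(peptide))
    return hits / len(peptides)


def fold_enrichment(
    specimens: Dict[str, List[str]],
    baseline: List[str],
    pattern: str,
    pseudocount: float = 1e-6,  # assumption: avoids division by zero
) -> Dict[str, float]:
    """Return motif frequency in each specimen relative to the baseline set."""
    base = motif_frequency(baseline, pattern) + pseudocount
    return {
        specimen_id: (motif_frequency(peptides, pattern) + pseudocount) / base
        for specimen_id, peptides in specimens.items()
    }


# Toy data (hypothetical): "K.PW" is a degenerate motif with one wildcard position.
baseline_peptides = ["AAAA", "KAPW", "CCCC", "DDDD"]
specimen_peptides = {
    "specimen_A": ["KAPW", "KGPW"],
    "specimen_B": ["AAAA", "CCCC"],
}

enrichment = fold_enrichment(specimen_peptides, baseline_peptides, r"K.PW")
for specimen_id, value in sorted(enrichment.items()):
    print(f"{specimen_id}: {value:.3f}")
```

The resulting dictionary maps each specimen to one value, which is the quantity a histogram across specimens or clinical subsets would plot. A real analysis would also need to account for sequencing depth and duplicate reads, which this sketch ignores.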