Core B, the Microbiome Data Management Core (MDMC), will serve as a central data repository for Projects 1, 2 and 3 as well as Core A. This facility will be housed at TIGR and directed by Jeremy Selengut of TIGR's Bioinformatics Department. The diverse data types and large volume of data that will be generated by the different components of this program necessitate a central facility for data storage and access. This type of program, with a distributed sequencing effort, a shared pool of data for collaborative analysis, and a common database to flexibly represent not only the data but also the growing conceptual model of the system under study, is likely to be an archetype of future human microbiome and other environmental metagenomics research. The plan we outline uses technologies and methods with which we are highly experienced, combined and shaped into a new system optimized for this type of programmatic effort. There are three aims.
Aim 1: Establish a Core Database. We will establish a database repository to support the sample collection effort and the metagenomic and pan-genomic analyses for the microbiome project. This system will support the types of data critical to the success of this program project (e.g., subsets of de-identified patient metadata and 16S rRNA, pan-genomic, and fecal community microbiome gene, transcriptome and metabolite datasets). We will provision for multiple users and institutions to operate on the database, and develop straightforward electronic submission and retrieval mechanisms for the MDMC database.
Aim 2: Formalize Data Exchange. The core will ensure that all electronic data are robustly encoded in a data exchange file format to effectively support the project. We will supply an Application Programmer Interface (API) that will allow all contributors to deliver data to the MDMC over the web. We will support the API with documentation, open source code, training support, and validation scripts for all data required by the project.
Aim 3: Maintain Communication Between the MDMC and Projects 1, 2 and 3. Ensuring that the data management system meets the needs of the scientists distributed among the other projects in this proposal is paramount. This will be accomplished through the development of documentation, a help desk system, direct contacts between key staff, a system of email updates, reports to the project website, and attendance at all regularly scheduled meetings.