There is a fundamental gap in understanding how intra- and inter-personal variation in the structure of the human microbiome affects human phenotypes. The continued existence of this gap is problematic because it impedes the ability to relate this variation to changes in host health. Part of the problem is an over-reliance on microbiome-wide metrics of similarity instead of population-based metrics. This is similar to using microarray technology to compare overall differences in E. coli gene expression between exponential and stationary phase without addressing the change in expression of individual genes. Yet a quantitative framework to aid in the analysis of population data from cross-sectional and longitudinal studies is lacking. The long-term goal is to understand the mechanisms that shape the structure and function of the human microbiome. The objective of this proposal is to develop robust computational tools that are optimized to analyze large sequence collections, yet are accessible to the typical investigator who is not an expert in bioinformatics. Specifically, this proposal will fulfill the stated need to develop computational tools that enable HMP scientists to determine whether "variation in the microbiome at a site can be related to human phenotypes, such as disease." The rationale for this proposal is the imminent release of data from a number of HMP Demonstration Projects that are pursuing cross-sectional and longitudinal sampling but have realized that they are limited in their ability to identify statistically robust linkages between specific changes in the microbiome and human phenotypes.
Building upon extensive previous experience and interactions with HMP investigators, the objective will be achieved by pursuing three specific aims: 1) implement and disseminate computational tools in the mothur software package; 2) develop tools to correlate inter-subject variation in the microbiome with variation in health; and 3) develop tools to connect the dynamics of the microbiome with changes in health. Each of the tools developed in the proposed research will be validated using simulated data and evaluated using HMP-generated sequence data. This research is innovative because it builds upon an already strong collection of tools for describing a community's "parts list" within the popular mothur software package and will create a robust set of statistical tools for assessing temporal variation and how that variation is related to health. The proposed research is significant because it will advance the goals of the HMP by relating community- and population-level dynamics to changes in human health. PUBLIC HEALTH RELEVANCE: The proposed research is relevant to public health because tools that link changes in the microbiome to health will allow scientists to identify microbial populations within the microbiome that are responsible for diseases such as obesity, bacterial vaginosis, irritable bowel disorders, and cancer. Therefore, the proposed research is relevant to the part of the NIH's mission related to fostering innovative research strategies that improve the nation's ability to prevent and treat disease.