Viruses are the most abundant, most diverse, and least understood biological entities on Earth. Humans contain at least several trillion viruses, largely bacteriophage (phage) that infect the bacteria in the gastrointestinal tract. Because viruses use their host cells to reproduce, they can affect host populations. Gut phage possess genes that are integral to the biological functions of both their bacterial hosts and the humans they inhabit. Although they potentially affect digestion, pathogenicity and immune function, very little is currently known about this critical community. The goal of this research project is to develop novel high-throughput methods to rapidly reveal and characterize the diversity of this biological dark matter. In a preliminary study of fecal viral communities from four pairs of twins and their mothers, we showed that 80% of viral genomes have little similarity to known sequences. Viral genes evolve rapidly, making homology-based searches for sequence function difficult. The focus of our research will be to characterize this unknown community using existing techniques, including the elucidation of the phenotype and structure of 100 unknown viral genes. In Aim 1, the genetic and functional diversity of the viral metagenomes will be characterized. Viral genes that are present in multiple people or stable through time will be examined, with a focus on those likely to have metabolic functions affecting the host. In Aim 2, proteomics of samples enriched for viral capsid and structural proteins will be used to link those protein sequences with the DNA sequences, and artificial neural networks will be used to search for similar sequences in the metagenomes. Aim 3 focuses on characterizing the function of metabolic genes that may affect the host. The 100 selected sequences will be expressed in E. coli, their effect on phenotype will be assayed with metabolic arrays and mass spectrometry of metabolites, and their 3D crystal structures will be determined in order to search for structural homology rather than sequence homology. The link between the presence of a predicted metabolic gene and human host health traits, such as obesity, will be evaluated. Combined, these approaches will greatly increase our knowledge of the nature of these unexplored viral communities. An understanding of uncharacterized viral gene function in the human gut will not only shed light on the twin-and-mother communities we are studying directly, but will also address questions about the known correlations between human health and gut viruses. This research will likely improve our fundamental understanding of the interactions of viruses, bacteria, human metabolism and immunity. Discoveries related to the progression of autoimmune diseases, metabolic disorders, allergies and many other conditions may become possible.