BIOINFORMATICS: DEVELOPMENT OF COMPUTATIONAL METHODS FOR ROBUST FUNCTIONAL ANALYSIS

Project Leader: O. Troyanskaya (CompSci/LSI)

Our Center provides a highly collaborative environment and the necessary computing and experimental infrastructure for the development and global dissemination of bioinformatics methods that help us answer diverse systems-level questions, ranging from genome-scale evolutionary dynamics to functional annotation to pathway modeling. These methods range from addressing nucleotide variation in the genome, to its functional characterization, to inferring regulatory networks.

Subproject 1: Global identification of genome sequence variation (Kruglyak and Botstein)

Progress Report: Three years ago, the Botstein and Kruglyak laboratories developed SNPScanner, a method for comprehensively assessing nucleotide variation on a global scale. The basis of this approach was to compare hybridization data from two yeast strains with diverged genomes: the reference genome strain S288c and a wild isolate, RM11. High-quality sequence information was available for these two strains through the original yeast genome sequencing project [199] and the RM11 sequencing project [200]. This provided ~25,000 SNPs with which to train an algorithm that then predicted the presence of SNPs in a genome based on hybridization data obtained from a single microarray. The description of this method and illustrations of its applications were reported in 2006 [22]. This unbiased approach to detecting nucleotide variation on a global scale in yeast and bacteria has increasingly been adopted by other researchers in both academic and industrial institutions around the world. Three distinct applications from the Lewis-Sigler Institute for Integrative Genomics, illustrating the variety of questions that can be addressed with this technology, are described below.
Genome-wide analysis of nucleotide-level variation in commonly used Saccharomyces cerevisiae strains [39]

We analyzed the commonly used laboratory strains using the tiling arrays and the SNPScanner algorithm [22]. Some strains are mosaics of each other, whereas others clearly share no recent genetic ancestors; for many kinds of experiments, this makes a very big difference. We have also used this information to design microarrays that have probes in universally nonpolymorphic regions, which has facilitated many kinds of studies with distantly related species, notably in the Junior Project Laboratory experiments using Saccharomyces bayanus (see below).

Influence of genotype and nutrition on survival and metabolism of starving yeast [76]

The pace of genetic studies is typically limited by the ease of determining the causative allele. This limitation is greatly exacerbated in cases in which the phenotype is either subtle or time-consuming and laborious to assay. In this study, the tiling array and SNPScanner mutation detection system allowed us to identify, on a single tiling microarray, dozens of genes through mutations whose only phenotype is suppression of a subtle phenotype.

The repertoire and dynamics of evolutionary adaptations to controlled nutrient-limited environments in yeast [201]

In this study we used the SNPScanner algorithm to comprehensively detect mutations that had arisen in yeast subjected to experimental evolution under a variety of nutrient-limiting conditions. This enabled a study of the complete spectrum of mutations associated with adaptation and a retrospective analysis of the allele frequencies during the evolution experiment. This represented the first example of comprehensive mutation identification in an experimentally evolved eukaryotic organism, a problem that only became tractable with the development of SNPScanner.
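The hybridization-based detection underlying these studies can be illustrated with a deliberately simplified sketch. It is not the SNPScanner model, which was trained on ~25,000 known SNPs [22]. It only shows the general idea that a mismatch lowers the signal of every overlapping probe that covers the variant base. The function name, the probe layout, the log-ratio threshold, and the minimum probe count are all hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


def call_candidate_snps(probes, ref_intensity, sample_intensity,
                        threshold=-1.0, min_probes=3):
    """Flag positions whose covering probes show reduced hybridization.

    probes           -- list of (start, length) probe coordinates (hypothetical layout)
    ref_intensity    -- intensity of each probe for the reference strain
    sample_intensity -- intensity of each probe for the strain under test
    threshold        -- mean log2(sample/ref) at or below which a position is flagged
    min_probes       -- minimum number of covering probes required to make a call

    Toy heuristic only: the published method uses a trained model,
    not a fixed cutoff on the mean log ratio.
    """
    total = defaultdict(float)  # summed log2 ratios per position
    count = defaultdict(int)    # number of probes covering each position

    for (start, length), ref, sample in zip(probes, ref_intensity, sample_intensity):
        log_ratio = math.log2(sample / ref)
        for pos in range(start, start + length):
            total[pos] += log_ratio
            count[pos] += 1

    return sorted(
        pos for pos in total
        if count[pos] >= min_probes and total[pos] / count[pos] <= threshold
    )
```

With overlapping probes, the positions where every covering probe loses signal form a short window around the variant, so the resolution of this toy approach is limited by the spacing between probe start sites.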