We propose a training program that will prepare an effective independent investigator in computational genomics. The candidate has a PhD in biology from the University of Cambridge and will extend his skills in both computational and wet-lab methods through a two-year program of organized mentorship and training, and a structured five-year research program. This program will promote the command of machine learning as applied to functional genomics data. Dr. William Noble will mentor the candidate's scientific development. Dr. Noble is a recognized leader in computational biology and machine learning. He holds a dual appointment as Associate Professor in Genome Sciences and Com- puter Science and Engineering, and has trained numerous postdoctoral fellows and graduate students. Dr. Jeff Bilmes, Associate Professor of Electrical Engineering, will contribute to the mentoring effort, and a committee of experienced genome and computational biologists will advise on science and the candidate's career goals. Research will focus on the analysis of multiple tracks of data from high-throughput sequencing assays, such as the ChIP-seq data produced by the ENCODE Project. These experiments allow us to obtain a more complete picture of the structure of human chromatin, revealing the behavior of transcription factors, the organization of epigenetic modifications, and the locations of accessible DNA across the entire genome at up to single-base resolution. A current challenge is to discover joint patterns across multiple tracks of these functional genomics results simultaneously. This project will (1) develop computational methods for identifying such patterns, providing new ways of finding both well-understood genomic features and novel functional elements, (2) apply those methods to characterize the similarities and differences among different biological samples, establishing a better understanding of chromatin, the bounds of its variation, and its role in human disease, and (3) validate computational findings with laboratory experiments. The project will use a dynamic Bayesian network (DBN), a type of probabilistic graphical model, to represent the statistical dependencies between observed data, such as sequencing tag density, on an inferred hidden state sequence. The Department of Genome Sciences of the University Of Washington School Of Medicine provides an ideal setting for training a new independent investigator with an extensive program of formal and informal education for postdoctoral scientists, opportunities for collaboration with researchers with expertise in diverse areas, and modern computational and laboratory resources. This environment maximizes the potential for the candidate to obtain the training and perform the research necessary to establish himself as a skilled investigator with an independent research program. PUBLIC HEALTH RELEVANCE: The major outcome of this work will be a trained scientist with the skills to run an independent research pro- gram integrating computational methods and genome biology. Additionally, the research will result in improved methodology and software resources for analyzing functional genomics data, and a better understanding of how chromatin state affects molecular biology and human disease.