Our long-term goal is to investigate the mechanisms that cells use to carry out signal transduction, through computational modeling and the integration of different types of information. Recent advances in large-scale genomic and proteomic techniques have generated enormous amounts of data and have drawn attention toward a comprehensive understanding of signal transduction at the systems level. Because the data generated by these high-throughput techniques are incomplete and noisy, novel computational approaches that can incorporate multiple data sources to analyze genome-wide signal transduction networks are needed to take full advantage of the rapid accumulation of data. In this study, we will develop robust Bayesian methods to integrate diverse data types for signaling network inference, primarily in the budding yeast Saccharomyces cerevisiae. The proposed methodology will be applied to identify protein chaperone complexes and to investigate the roles of chaperones in mediating signaling pathways. The results of our methods will be subject to further experimental validation. The central hypothesis of the application is that, by developing Bayesian methods and applying them to heterogeneous large-scale data sources, we can successfully infer signaling networks at a genome-wide scale. We plan to test this hypothesis and accomplish the overall objective of this application by pursuing the following specific aims: 1) Develop Bayesian methods to identify protein complexes based on large-scale protein interaction data; 2) Discover the associations among proteins and infer signal transduction networks by integrating information from diverse sources, including microarray gene expression data, protein interaction data, and protein phosphorylation data; 3) Develop user-friendly computer software to implement the proposed methods. The software will be developed, tested, and distributed to the scientific community free of charge.
The proposed work will represent the first major effort to extract information from diverse large-scale datasets through data integration in order to infer signal transduction networks at a genome-wide scale. The proposed approach can make the inference of signal transduction networks faster and easier, and can produce hypotheses that guide experimental design, leading to more informative experiments. The research will contribute to the design of efficient signaling network inference methods that integrate heterogeneous data sources. The resulting methods and user-friendly software will provide useful tools for understanding how cells respond to environmental changes and, more importantly, how failure of these responses leads to a variety of diseases.