1. Apply experimental genomics approaches to measure global responses before and after perturbations, and use (and develop if necessary) computational approaches to process and integrate such data to systematically infer networks. Our initial focus is on transcriptomic phenotypes, including those of microRNAs. We are using a combination of RNAseq, microarrays and Fluidigm. RNAseq provides detailed transcriptomic information, including the abundance of non-coding genes, alternative splicing isoforms and rare transcripts; microarrays can be applied to a larger number of conditions/samples because they are less expensive; Fluidigm offers inexpensive profiling of a large number of samples by focusing on a carefully selected panel of transcripts. Some example efforts include:

a. In collaboration with the Germain and Fraser labs, we have assayed the transcriptomic and microRNA responses of BMDMs to dose combinations of TLR ligands. We are particularly interested in integrating these data with signaling-level data to infer the connection between signaling events and transcriptional regulation.

b. In another project we stimulated RAW cells with LPS (a prototypical TLR activator) and generated RNAseq data with deep coverage (160 million reads per sample). Using these data, we have developed computational pipelines for processing and analyzing RNAseq data and have examined basal and LPS-stimulated phenotypes at the splice isoform (both conserved and non-conserved), intergenic, and rare transcript levels. In addition to gene expression, we are also developing experimental and computational approaches to assay chromatin states, including transcription factor binding data. During this past year, by utilizing the deep coverage provided by RNAseq, we have developed computational approaches to characterize the extent of non-conserved unexpected splicing events (USEs), many of which have been postulated to result from noisy splicing.
We found that USEs are prevalent across macrophages and T cells in both resting and stimulated conditions (in collaboration with the Jun Zhu lab on the T cell data). The extent of USEs is highly variable across genes, with certain pathways exhibiting significantly different levels of USEs. We further tested our approach using public data sets obtained from multiple human cell lines and observed similar trends. We also observed that some USEs are conserved across conditions while others are condition specific. We are in the process of utilizing USEs as a means to infer factors and regulatory networks that regulate alternative splicing in cells.

c. We have generated RNAseq data on both coding and non-coding RNAs across diverse macrophage activation conditions. These data are being used to understand the transcriptomic landscape and splicing repertoire of macrophage states; they are also being integrated with other data types, such as ChIP-Seq.

2. Develop single-cell gene expression assays to measure the transcriptome in individual cells before and after perturbations. We have been developing two complementary approaches. In the first, we combine flow cytometry with single-cell PCR (Fluidigm) to assay the expression of 100 genes/proteins. Using human blood-derived macrophages as a model, we have generated data on hundreds of single cells from different stimulation conditions. We are particularly interested in assaying expression heterogeneity under different stimulations and time-points because we would like to compare environmentally induced differences in macrophage activation at both the cell population and individual cell levels. Example questions include: How heterogeneous are the responses? Is the level of heterogeneity different across stimulations? Can transcriptomic heterogeneity be used to inform the underlying network architecture?
In addition to single cells, we are generating data at the bulk as well as tens-of-cells levels to complement and augment the single-cell data. To utilize Fluidigm, we have developed a strategy to design informative, unique marker panels based on multiple public gene expression data sets derived from myeloid as well as other hematopoietic cells. As a complementary approach, we have also been testing experimental strategies to perform RNAseq on single cells. This is challenging partly because of limit-of-detection issues and limited quantitation accuracy, especially for low-abundance transcripts. To address these issues, we have adopted a barcoding strategy that mimics the principle used in digital PCR.

3. We have been using ChIP-Seq and CAGE-Seq approaches to assay chromatin, enhancer and transcription-factor binding states in human macrophages before and after different types of perturbations. These data are being integrated with gene expression data, at both the bulk and single-cell levels, to dissect regulatory networks within and across macrophage activation states.

4. Develop computational data analysis approaches and methods. In addition to computational methods for processing and analyzing individual data types, a key goal is to integrate the data obtained from different perturbation conditions to infer gene network(s). Since in vivo stimulations typically involve combinations of cytokines and molecular patterns, one question we aim to address is whether responses to complex stimuli can be predicted and understood based on responses to the simpler constituent stimuli. For example, by using data from AfCS and data generated by our lab and the Fraser lab, we are developing a computational approach to predict genes and proteins that facilitate cross-talk between signaling subnetworks. This approach can also lead to a better understanding of how networks evolve to process complex information and of general network features for generating phenotypic diversity.
Another focus is on the analysis of single-cell data, particularly how to estimate heterogeneity in the presence of noise and detection limits.
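
To make the heterogeneity question concrete, the following is a minimal illustrative sketch, not the method used in this project. It assumes per-cell expression values on a log2 scale, marks cells below the detection limit as `None`, and assumes that technical noise is additive, independent of the biological signal, and has a variance (`technical_var`) estimated separately, for example from replicate measurements. The function name, its parameters and the example values are all hypothetical. Note that the variance among detected cells is biased by the detection limit itself; a censored-likelihood model would be a more principled treatment.

```python
import statistics


def summarize_heterogeneity(values, technical_var):
    """Summarize cell-to-cell heterogeneity for one gene (hypothetical helper).

    values        -- per-cell log2 expression; None means below detection limit
    technical_var -- assumed variance of technical noise on the same scale
    """
    detected = [v for v in values if v is not None]
    frac_detected = len(detected) / len(values)

    # A sample variance needs at least two detected cells.
    if len(detected) < 2:
        return {
            "frac_detected": frac_detected,
            "observed_var": None,
            "biological_var": None,
        }

    observed_var = statistics.variance(detected)
    # Under additive, independent noise: observed = biological + technical.
    # Clamp at zero because the difference of two estimates can go negative.
    biological_var = max(0.0, observed_var - technical_var)

    return {
        "frac_detected": frac_detected,
        "observed_var": observed_var,
        "biological_var": biological_var,
    }


# Made-up example: eight cells, two of them below the detection limit.
cells = [5.0, 7.0, None, 6.0, 8.0, None, 4.0, 6.0]
result = summarize_heterogeneity(cells, technical_var=0.5)
print(result)
```

Reporting the detected fraction separately from the variance among detected cells keeps the two sources of apparent heterogeneity (on/off detection versus graded expression) from being conflated.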