We have primarily been using THP1 (a human monocytic line), RAW264.7 (a mouse macrophage line), human blood monocytes, and bone marrow-derived macrophages (BMDM) as models, since these cells are reasonably well characterized, with substantial gene expression profiles already available in the public domain. They are also being studied by our collaborators at the Laboratory of Systems Biology (LSB), where synergy can be achieved by integrating information from additional levels (e.g., protein-protein interactions) and time scales (e.g., signaling). Our goal is to assay and understand the regulation of phenotypic diversity and plasticity and to infer gene networks using data from both the cell-population and single-cell levels. Based on prior work on bacteria, yeast, and mammalian cell lines, it is clear that even in clonal cell populations, substantial heterogeneity can be pervasive in both basal and stimulated states. To achieve our goals, we have been developing systematic approaches in several critical areas: 1. Develop and apply physiologically relevant perturbations to assay phenotypic diversity. For network inference purposes, systematic perturbations that can generate expression variations across genes and components in the underlying network are also needed (e.g., systematic genetic alterations). The Alliance for Cellular Signaling (AfCS) has already generated publicly available data on single and multi-ligand Toll-like receptor (TLR) stimulations. Our collaborators at the LSB, including Iain Fraser, Ronald Germain, Aleksandra Nita-Lazar and their groups, have also been applying multiple TLR ligand perturbations to RAW and BMDM cells.
Our own effort centers on developing a set of physiologically relevant environmental perturbations, including monocyte-to-macrophage differentiation signals, combinations and quantitative titrations of cytokines and chemokines, and exposure to other immune cells; we are currently developing and testing these strategies. During the past year, we obtained pilot RNA-seq data from these strategies, and our analysis indicated that complex and combinatorial perturbations can induce distinct genome-wide response signatures compared with perturbation by individual or canonical ligands. We are currently validating these findings and generating further data sets based on diverse perturbations for network reconstruction. 2. Develop experimental and computational protocols and pipelines for measuring genomic phenotypes before and after perturbations. Our initial focus is on gene expression, including that of a class of non-coding RNAs (microRNAs). We are using both RNA-seq and microarrays: the former can provide more detailed information, including the abundance of non-coding genes, alternative splicing isoforms, and rare transcripts, whereas microarrays can be applied to a larger number of conditions and samples because they are less expensive and their analysis methods are significantly more mature than those for RNA-seq. By stimulating RAW cells with LPS, a prototypical TLR activator, we have generated RNA-seq data with deep coverage (85 million reads per sample). Using these data, we have developed computational pipelines for processing and analyzing RNA-seq data and have examined basal and LPS-stimulated phenotypes at the splice isoform (both conserved and non-conserved), intergenic, and rare transcript levels. In addition to gene expression, we also plan to measure, and develop data processing methods for, other types of genomic phenotypes, especially methylation and chromatin states.
During this past year, by utilizing the deep coverage provided by RNA-seq (80+ million reads per sample), we have developed computational approaches to characterize the extent of non-conserved splicing (NCS) events, many of which have been postulated to result from noisy splicing. We found that NCS events are prevalent across macrophages and T cells in both resting and stimulated conditions (our collaborator Jun Zhu provided the T cell data). The extent of NCS is highly variable across genes, with certain pathways exhibiting significantly lower or higher levels of NCS. We further tested our approach using public data sets obtained from multiple human cell lines and observed similar trends. We also observed that some NCS events are conserved across conditions while others are condition-specific. We are in the process of utilizing NCS as a means to infer the factors and regulatory networks that regulate alternative splicing in cells. 3. Develop single-cell phenotyping assays for obtaining gene expression information on a large panel of genes before and after perturbations. We have been developing two complementary approaches. In the first approach, we combine flow cytometry with single-cell PCR (Fluidigm) to assay the expression of hundreds of genes, including transcription factors and microRNAs that are either known to regulate monocyte/macrophage phenotype polarization or are selected based on information from cell population-based experiments (see above). We have developed a strategy to design marker panels based on multiple public gene expression data sets derived from myeloid as well as other hematopoietic cells. As a complementary approach, we are developing strategies to perform RNA-seq on single cells. This is challenging partly because biases introduced by PCR and other processing procedures can affect quantitation accuracy, especially for low-abundance transcripts.
To address these issues, we have adopted a barcoding strategy that mimics the principle used in digital PCR. We have generated sequencing libraries using this strategy and are in the process of analyzing them. 4. Develop integrative computational data analysis approaches and methods. In addition to computational methods for processing and analyzing individual data types, a key goal is to integrate the data obtained from different perturbation conditions to infer gene-gene interaction network(s). Another goal is to understand how innate immune cells process environmental information. Since in vivo stimulations typically involve combinations of cytokines and foreign molecular patterns, one question we aim to address is whether responses to complex stimuli can be predicted and understood based on responses to their simpler constituent stimuli. Using data from AfCS and data generated by us and the Fraser group, we are developing a computational approach to predict genes and proteins that facilitate cross-talk between signaling subnetworks. This approach can also lead to a better understanding of how networks evolve to process complex information and of the general network features that generate phenotypic diversity.
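
As an illustration of the question raised above, whether the response to a complex stimulus can be predicted from the responses to its constituent stimuli, the following is a minimal sketch of one simple baseline. It is not the approach under development. It assumes that responses are expressed as per-gene log2 fold changes and that the null expectation for a combined stimulus is the sum of the single-stimulus responses. The function names, the threshold, the gene labels, and all numeric values are hypothetical placeholders rather than measured data.

```python
from typing import Dict, List


def additive_prediction(single_responses: List[Dict[str, float]]) -> Dict[str, float]:
    """Predict the combined response as the per-gene sum of single-stimulus
    log2 fold changes (assumed additive null model). Genes missing from a
    response are treated as unchanged (0.0)."""
    genes = set().union(*single_responses)
    return {g: sum(r.get(g, 0.0) for r in single_responses) for g in sorted(genes)}


def non_additive_genes(
    predicted: Dict[str, float],
    observed: Dict[str, float],
    threshold: float = 1.0,
) -> Dict[str, float]:
    """Return genes whose observed combined response deviates from the
    additive prediction by more than `threshold` log2 units. The threshold
    is an arbitrary illustrative cutoff, not a validated one."""
    deviations = {g: observed[g] - predicted.get(g, 0.0) for g in observed}
    return {g: d for g, d in deviations.items() if abs(d) > threshold}


# Hypothetical log2 fold changes for two single stimuli and their combination.
ligand_a = {"gene1": 3.0, "gene2": 2.0, "gene3": 1.0}
ligand_b = {"gene1": 0.5, "gene2": 2.5, "gene3": -0.5}
combined_observed = {"gene1": 3.5, "gene2": 7.0, "gene3": 0.5}

predicted = additive_prediction([ligand_a, ligand_b])
candidates = non_additive_genes(predicted, combined_observed)
print(candidates)
```

Under these assumptions, genes flagged as non-additive would be candidates for follow-up as possible points of cross-talk between signaling subnetworks; a real analysis would also need replicate-based statistics rather than a fixed cutoff.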