ABSTRACT A single stem or progenitor cell can give rise to a breathtaking diversity of differentiated cell types, but our understanding of how single cells choose their fate is limited. This is because cells make individual fate decisions regulated by both molecular and environmental factors, and it is challenging to tease these effects apart. Recent advances in the ability to sequence the molecular contents of a given cell (i.e., single-cell RNA-seq) represent a potentially transformative development. However, while these methods can be applied to hundreds or even thousands of cells, they return only a gene expression matrix: the equivalent of a hypothetical study that sequenced thousands of human genomes but recorded no information about each patient. What is missing is the `metadata' of the cell: What is its regulatory and developmental state? Where was it located in situ? Who were its parents and siblings? To understand cellular decision-making, we need to perform an integrated analysis of a cell's transcriptome, environment, and lineage, but we lack the tools to measure these parameters directly and simultaneously. To address this challenge, I hypothesize that the cellular `metadata' is encoded in gene expression, and can therefore be inferred from single-cell RNA-seq datasets. Here, I propose to develop an integrated experimental and computational framework to simultaneously learn the transcriptome and `metadata' from thousands of single cells. I will design strategies to analyze single-cell gene expression and learn a cell's regulatory state, pinpoint its environmental milieu, and reconstruct its lineage relationships. I will apply these methods to systematically decipher the regulation of cell fate during the development of the mammalian immune and nervous systems.
More broadly, if successful, this work will present a general and widely applicable strategy to study how the interaction between molecular and environmental factors governs cell behavior.