Project Summary/Abstract
Research: We aim to use cross-feature correlations in three different contexts in single cell omics to (Aim1) solve critical issues in single cell RNAseq (scRNAseq) cell type identification, (Aim2) discover subtypes of asymmetric cell division (ACD) by creating a new genomics technology, single cell ACD transcriptomics (scACDt), and (Aim3) create an anthology of scRNAseq co-expression networks across human tissues. (Aim1) We have found that status quo cell type identification algorithms (1) cannot identify an immortalized cell line as a single cell type, and (2) have no unbiased mechanism to prevent a user from repeatedly "sub-clustering" populations of interest, which can result in false discoveries. These problems have immediate implications for the analysis of all scRNAseq data and therefore require urgent resolution. We have created an anti-correlation-based algorithm that appears to pass these tests, but we must expand our benchmarking with more simulation studies, more competing algorithms, and real-world datasets. (Aim2) As in Aim1, we anticipate that anti-correlated vectors will define subtypes of ACD. Using an opto-electric nano-fluidic chip, we will track daughter cells by microscopy following cell division and pair them with their transcriptomes by scRNAseq to calculate the asymmetry in mRNA segregation between daughter cells. We have previously performed each of the steps needed to achieve these goals; here we propose to merge these protocols to create a new genomics assay (scACDt). (Aim3) Lastly, we will use cross-feature correlations to build consensus tissue and pan-tissue co-expression networks from publicly available human scRNAseq datasets. This will enable functional annotation of the entire NHGRI GWAS catalogue using graph-theoretic approaches applied to gene-gene correlations.
Career Goals: My future laboratory will use transdisciplinary approaches to develop new genomic technologies and algorithms to uncover the mechanisms by which the genome, integrated with environmental input, gives rise to a diverse array of cell types and expression programs. Through integrated data science, algorithm development, and basic molecular biology, my lab will generate data-driven hypotheses and validate them at the bench. These approaches will broadly impact all of biology rather than a single disease. Lastly, an important goal is to create a socioeconomically and geographically diverse lab environment. The training and aims I propose here will guide me to these goals.
Environment: The Icahn School of Medicine at Mount Sinai (ISMMS) has an established systems biology track record with access to and expertise in massively scalable computation, which will be important for Aim1 and Aim3. Additionally, ISMMS is the only academic institute that owns the Beacon platform and has the expertise to operate this instrument for Aim2. Through our collaborations within the institute, our team at Mount Sinai is uniquely situated to (Aim1) create innovative algorithms to identify cell types from scRNAseq, (Aim2) begin the scACDt field, and (Aim3) create an anthology of scRNAseq co-expression networks across human tissues.