PROJECT SUMMARY

Our overarching goal is to understand how the relationships between genes contribute to functional properties in neurons and how those functions combine to define types of neurons. This is a central question of basic neuroscience, and one that is newly assessable using single-cell RNA sequencing (scRNA-seq). These data provide high-throughput snapshots of gene activity across thousands of cells and thus shed new light on the relationships between genes within and across cells. We propose to exploit these data in conjunction with previously known details about gene function and neuronal identity to learn new features of both. Our research approach is meta-analytic, using data from many different laboratories to obtain a more robust aggregate signal. In addition to serving our direct research interests, the meta-analytic methods we develop are of broad practical relevance to neuroscience laboratories studying many different questions, including diseases of the nervous system. Disseminating our software deliverables in a convenient-to-use form is a central component of each of our research objectives.

The three complementary objectives in this project are to:
1. Learn patterns of gene expression which characterize known cell identity. Building on our previous research showing conserved expression patterns across cell types, we will define shared gene expression patterns, called co-expression, specific to neuronal subpopulations. These shared expression patterns will be used as an assay of cellular identity.
2. Identify novel cell subtypes through changes in the expression relationships between genes. Variation in co-expression is a form of transcriptional rewiring which often indicates a change in function. To find novel neuronal subtypes, we will assess the data for changes in co-expression that reflect a change in functions linked to neuronal identity. We will identify novel transcriptional signatures which replicate across laboratories.
3. Determine consensus methods for customized cell-type learning. Defining wholly unknown expression profiles is likely to benefit from a variety of approaches. To find agreement between those approaches, we will develop an algorithm to efficiently search through gene sets to find those likely to have complementary value. These gene sets will then be assessed across many pre-existing methods, with customized combinations and aggregate output reported and made available through a public web server.