Project Summary/Abstract
Our project will create a computational resource, the Single Neuron Analyzer, to support the neuroscience community's efforts to build a reproducible, comprehensive, data-driven atlas of brain cell types. Laboratories in the BRAIN Initiative Cell Census Network and others are generating large-scale molecular datasets from multiple regions of the mouse and human brain using single cell sequencing technology. These datasets include single cell and single nucleus transcriptomes (RNA-Seq), as well as single nucleus DNA methylomes (mC-Seq) and chromatin accessibility profiles (ATAC-Seq). Each data type provides complementary information about the molecular identity of brain cells: transcriptomes directly measure gene expression, while epigenomic data indicate both gene expression levels and the activity of intergenic regulatory regions such as enhancers. However, there is no computational resource for integrating data from these modalities or for statistically validating the reliability and reproducibility of the cell types defined from each dataset. The Single Neuron Analyzer will work within the framework of the Single Cell Portal, which provides horizontally scalable, high-performance solutions that let researchers keep pace with the growing size of datasets as single cell sequencing technology advances. In Aim 1, we will use machine learning and cross-validation to study the reproducibility of cell types defined by researchers based on one or more datasets. The Single Neuron Analyzer will allow users to compute a quantitative score, the area under the receiver operating characteristic curve (AUROC), which measures the degree to which cell type labels can be predicted from independent data such as experimental replicates or complementary molecular assays.
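As an illustration of the Aim 1 score, the following minimal sketch trains on one replicate and computes the AUROC for a single cell type on an independent replicate. The nearest-centroid classifier and the function names `auroc` and `cell_type_auroc` are assumptions made for this example only; the abstract does not specify which machine learning model will be used.

```python
import numpy as np


def auroc(scores, labels):
    """AUROC from the Mann-Whitney U statistic; tied scores get average ranks."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative cells")
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=float)
    i = 0
    while i < scores.size:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank
        i = j + 1
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def cell_type_auroc(train_x, train_labels, test_x, test_labels, cell_type):
    """Score how well one cell type's labels transfer to independent data.

    Hypothetical classifier: a cell scores higher the closer it lies to the
    training centroid of `cell_type` relative to the centroid of all other cells.
    """
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    in_type = np.asarray(train_labels) == cell_type
    centroid_in = train_x[in_type].mean(axis=0)
    centroid_out = train_x[~in_type].mean(axis=0)
    score = (np.linalg.norm(test_x - centroid_out, axis=1)
             - np.linalg.norm(test_x - centroid_in, axis=1))
    return auroc(score, np.asarray(test_labels) == cell_type)
```

An AUROC near 1 would indicate that the cell type label is recovered almost perfectly in the held-out data, while a value near 0.5 would indicate no better than chance.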
Aim 2 will build a data integration system that can jointly analyze single cells profiled by different technologies and modalities, including transcriptomic and epigenomic data. We will take advantage of the reliable correlation of gene expression with low gene body DNA methylation and high chromatin accessibility to link cells measured in one modality with their closest matching neighbors in the other two modalities. The resulting neighbor graphs will be used to impute the missing data, followed by joint cluster analysis and low-dimensional projection of the integrated dataset. Following joint analysis, the system will provide a variety of visualizations and downloadable reports about key markers for each cell type. By combining transcriptomic and epigenomic information, the system will predict cell-type-specific genes as well as putative enhancers. The Single Neuron Analyzer will offer researchers across the neuroscience community a resource for rigorous multi-modal molecular analysis of neuronal cell types, helping to advance the goal of comprehensively understanding the brain's cellular parts list.
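The neighbor-based imputation described in Aim 2 can be sketched as follows for one pair of modalities. This is a simplified illustration under stated assumptions, not the project's pipeline: the function names, the choice of Pearson correlation across shared genes, and the plain average over `k` neighbors are all hypothetical. Gene body methylation is sign-flipped before matching because low methylation is expected to accompany high expression.

```python
import numpy as np


def zscore_rows(x):
    """Center and scale each cell (row) across genes; constant rows stay at zero."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)
    sd = x.std(axis=1, keepdims=True)
    return x / np.where(sd == 0, 1.0, sd)


def impute_expression(rna, mc, k=2):
    """Impute expression for methylome cells from their k nearest RNA cells.

    rna: (n_rna_cells, n_genes) expression matrix
    mc:  (n_mc_cells, n_genes) gene body methylation for the same genes
    Returns (neighbors, imputed), where neighbors[i] holds the indices of the
    k RNA cells best matching methylome cell i.
    """
    rna = np.asarray(rna, dtype=float)
    rna_z = zscore_rows(rna)
    # Flip the sign so that low methylation lines up with high expression.
    mc_z = zscore_rows(-np.asarray(mc, dtype=float))
    # Pearson correlation between every methylome cell and every RNA cell.
    corr = mc_z @ rna_z.T / rna.shape[1]
    neighbors = np.argsort(-corr, axis=1)[:, :k]
    # Average the neighbors' expression profiles to fill in the missing modality.
    imputed = rna[neighbors].mean(axis=1)
    return neighbors, imputed
```

The same pattern would apply to chromatin accessibility without the sign flip, since accessibility is expected to correlate positively with expression.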