In autism, early-age biomarkers are scarce. Research is urgently needed to identify markers that precede symptom onset, convey prognostic information, or indicate disorder subtypes. Our proposed functional genomics study of early development in ASD addresses many of these biomarker goals and is an essential early step in this discovery process. Robust biomarkers have been elusive, presumably because ASD is a heterogeneous developmental disorder with thousands of speculated risk genes and potential non-genetic immune factors. We hypothesize that pathway-based transcriptomic biomarkers may be informative, as shown by our recent proof-of-concept study in which leukocyte-based gene expression provided an early diagnostic ASD classifier. Those findings are plausible because many high-confidence ASD genes (e.g., transcription factors and signaling genes) and networks are as strongly expressed in leukocytes as in brain. Furthermore, hypothesized immune disruptions in ASD should also be reflected in leukocytes, especially since microglia, a type of leukocyte, are an established site of molecular and cellular brain pathology in ASD. In our proposed study, we will use 1,500 RNA-Seq datasets from 1,000 toddlers with ASD, typical development, or atypical development to identify biomolecular pathway biomarkers for early detection, prognosis, clinical progression, and clinical subtyping. We will further study how these biomarkers relate to ASD gene defects and to expression patterns in early neural development. Aim 1 will analyze RNA-Seq data from 1,000 1-2 year olds using data-driven and knowledge-based network approaches to identify early diagnostic biomarkers that distinguish the ASD group (n=390) from the non-ASD groups (n=610). Diagnostic biomarkers will include pathways and co-expression networks to address the heterogeneity across ASD subjects.
Aim 2 will identify prognostic RNA-Seq expression patterns in the 390 ASD 1-2 year olds by analyzing gene expression levels to reveal pathways that predict good versus poor social and language outcomes at ages 3-4 years. Aim 2 will also analyze longitudinal expression data from ASD (n=300) and typically developing (n=200) toddlers to identify transcriptomic trajectories that underlie clinical progression from 1-2 years to 3-4 years in these different clinical outcome subgroups. Aim 3 will examine how variation in developmental functional genomic patterns relates to variation in social and language abilities across diagnostic categories (n=1,000) and within ASD (n=390) using dimensionality reduction and feature-selecting regression. Regression methods suited to multicollinear data will be used to combine the multivariate trends revealed by dimensionality reduction with the predictive power of regression. Aim 4 will link key transcriptomic effects from Aims 1 to 3 to genetic variants in high-confidence and probable ASD genes that are linked to disrupted cellular pathways in our ASD subjects. Deleterious variants in those genes will be tested in hematopoietic and neural stem cells by using CRISPR-Cas9 to introduce loss-of-function mutations. RNA-Seq will then be used to assay the impact on ASD-relevant cellular pathways.