Project Summary
B-progenitor acute lymphoblastic leukemia (B-ALL) remains a leading cause of childhood cancer death. Advances in RNA sequencing (RNA-seq) technology have identified many recurrent chimeric genes, which has led to refined classification of B-ALL and tailored therapies. Still, around 10-30% of B-ALL cases cannot be classified into the established subtypes and are termed "B-other"; these patients receive general chemotherapy, and the outcome for many is poor. This study will apply integrative genomic data analysis to identify novel B-ALL subtypes, with a focus on B-other cases. Building on the experience and skills from my prior work, I will analyze RNA-seq data from over 2,000 childhood and adult ALL cases and define novel subtypes based on distinct gene expression profiles and shared genetic alterations. Cases lacking driver lesions detectable by RNA-seq will be subjected to whole genome sequencing (WGS) to identify genetic alterations. Remaining unclassified cases with genetic alterations in non-coding regions will be studied with functional genomic data (ChIP-seq and ATAC-seq) to provide mechanistic annotation. Furthermore, functional experiments will be performed to explore the role of the newly identified subtype-defining genetic alterations. In the pilot study, I analyzed 1,988 RNA-seq samples and defined 23 distinct B-ALL subtypes, 8 of which are novel. In addition to subtypes defined by gene rearrangements, I observed that point mutations in key transcription factors can define novel subtypes, including PAX5 P80R (n=44) and IKZF1 N159Y (n=8). In this proposal, I will expand the sample size and interrogate the remaining B-other cases with WGS to define the residual novel subtypes. Through this study, I will provide a definitive set of B-ALL subtypes and maximize the potential for defining new ones from B-other cases.
As an exemplar of a subtype defined by a single point mutation, PAX5 P80R will be studied thoroughly in this proposal. Specifically, I will perform ChIP-seq for PAX5 and for key activating and repressing chromatin marks to identify PAX5 P80R-specific binding sites, and combine these data with chromatin accessibility information from ATAC-seq. Using the CRISPR/Cas9 knock-in Pax5 P80R mouse model, I will perform single-cell sequencing of preleukemic and leukemic B cells to elucidate the relationship between genetic alterations and deregulated genes at the cellular level. Moreover, MEGF10 (Multiple Epidermal Growth Factor-Like Domains Protein 10), a gene markedly overexpressed in the PAX5 P80R group, will be explored in in vitro and ex vivo models to test its role in cellular localization and leukemogenesis. Knock-down or knock-out of MEGF10 by RNAi or CRISPR will be applied in human P80R xenografts to test whether MEGF10 could be a target for tailored therapy. The mentored phase of this proposal will take place at St. Jude Children's Research Hospital under Dr. Charles Mullighan and will complete the aim of characterizing novel B-ALL subtypes. The independent phase will focus on functional studies of PAX5 P80R or other under-studied subtypes. The institutional resources, academic environment, and planned courses outlined in my proposal will ensure my successful transition to independence.