Project Summary

The goal of this project is to functionally annotate genetic variants that affect the regulation of mRNA processing, an aim that extends and complements the current focus of ENCODE data analysis. Recent efforts have been highly successful in cataloging genetic variants in disease genomes and across populations. The next great challenge is to identify causal variants and elucidate their potential function in biological and disease processes. To this end, research efforts have focused on variants located in protein-coding, promoter, and splice site regions because of their apparent impact on gene expression. However, many of the newly identified disease-associated variants reside in other non-coding regions, such as introns, where they may exert regulatory effects on the associated gene. The mechanisms of these variants have been hard to decipher. Many of them are expected to act at the post-transcriptional level and thus affect mRNA processing. In humans, mRNA processing is extremely versatile yet tightly regulated, with most genes involved in at least one alternative processing pathway. Despite this importance, how to accurately identify functional genetic variants in these processes remains a key question in the field. To address this question, the large collection of ENCODE expression and protein-binding data represents an invaluable resource. We will develop novel analysis strategies that make full use of ENCODE and other publicly available data sets, complemented by further bioinformatic prediction and experimental validation. This work will enable detection of genetically regulated mRNA processing events at a previously unattained level and provide new means to tackle the pressing task of functionally annotating genetic variants.