We are analyzing the mouse genome with new analytical tools and statistical methods to compare the results of several next-generation sequencing (NGS) experiments. Data from ChIP-seq, microarray, and RNA-seq experiments were included in the analysis to further assess the role of the HMGN1 and HMGN2 proteins in chromatin organization and gene expression. We developed analysis pipelines for ChIP-seq experiments on DNA sequences bound by HMGN1 and HMGN2 in wild-type and double-knockout mice. As a result of these collaborations, we now have an efficient, adjustable pipeline that analyzes many NGS datasets in a reasonable time and lets us readily interrogate the data to develop biological interpretations and devise new questions. We have analyzed epigenetic marks across the mouse genome in a variety of cell types to assess the changes that result from HMGN protein deficiency in the double knockouts, particularly in enhancer and super-enhancer regions in mouse embryonic fibroblasts, mouse embryonic stem cells, and mouse induced pluripotent stem cells. We have also shown that the dynamic nature of the chromatin epigenetic landscape plays a key role in the establishment and maintenance of cell identity, yet the factors that affect the dynamics of the epigenome are not fully known. We find that the ubiquitous nucleosome-binding proteins HMGN1 and HMGN2 preferentially colocalize with epigenetic marks of active chromatin and with cell-type-specific enhancers.

In a collaboration, we performed ChIP followed by massively parallel sequencing (ChIP-seq) against Mediator subunits from the head (Med17), middle/tail (Med14), tail (Med15 and Med2), and CDK (Cdk8) modules in budding yeast. To better distinguish low levels of association from experimental noise or artifacts accompanying ChIP or library amplification prior to sequencing, we compared ChIP-seq profiles from wild-type yeast with those from med17 temperature-sensitive (ts) yeast after 45 min at 37 °C.
In yeast harboring this mutation, the head module is disrupted at 37 °C and mRNA transcription is greatly reduced genome-wide within 30 minutes. Furthermore, comparing ChIP against Mediator subunits and Pol II in wild-type and med17 ts yeast allowed us to detect decreased association of Mediator and Pol II even at constitutively active promoters with relatively low levels of Mediator association, while the relatively short temperature shift reduces the likelihood of indirect effects. We also compared the association of Mediator subunits and Pol II in wild-type and med3 med15 yeast, which lack two of the three subunits of the tail module triad Med2/Med3/Med15, thus providing insight into the genome-wide function of the tail module in Mediator recruitment. These experiments are currently being extended with a new set of mutants to further understand the activities of the Mediator complex in gene regulation. We show that the Mediator co-activator complex regulates Ty1 retromobility by controlling the balance between the Ty1i and Ty1 promoters.

To better understand intron retention in RNA-seq data, we have developed a new software application, TPMCalculator, to quantify the mRNA abundance of genomic features. We have applied this software to the TCGA cancer genomic data and continue to interpret the results.

Together, these analyses made clear that the data need to be processed in a more organized way. Managing NGS data produced by different technologies, such as RNA-seq, ChIP-seq, ATAC-seq, and DNA methylation profiling, is complex and demands advanced bioinformatics skills. For example, pre-processing quality control and sample selection based on principal component analysis (PCA) are tasks that should be readily available to researchers producing sequencing samples.
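
As an illustration of the kind of PCA-based sample check mentioned above, the following is a minimal sketch and not the implementation used in this project. It assumes a samples-by-features abundance matrix, a log2(x + 1) transform, and a simple median-distance rule for flagging samples; the function names `pca_sample_scores` and `flag_outlier_samples` and the cutoff `k` are hypothetical choices made for this example.

```python
import numpy as np


def pca_sample_scores(counts, n_components=2):
    """Project samples onto their leading principal components.

    counts: array-like of shape (n_samples, n_features), for example
    per-gene abundance values (assumed layout for this sketch).
    Returns (scores, explained_variance_ratio).
    """
    # log2(x + 1) dampens the influence of highly abundant features
    x = np.log2(np.asarray(counts, dtype=float) + 1.0)
    # Center each feature so the SVD yields principal components
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    variance = s ** 2
    ratio = variance[:n_components] / variance.sum()
    return scores, ratio


def flag_outlier_samples(scores, k=3.0):
    """Flag samples far from the median position in PC space.

    Hypothetical rule: distance > median + k * scaled MAD of distances.
    """
    dist = np.linalg.norm(scores - np.median(scores, axis=0), axis=1)
    med = np.median(dist)
    mad = np.median(np.abs(dist - med))
    return dist > med + k * 1.4826 * mad
```

In practice, samples flagged this way would be inspected rather than dropped automatically, since a distant sample may reflect real biology rather than a failed library.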
In this work we present an open-source, containerized framework for the integration and management of RNA-seq, ChIP-seq, and ATAC-seq data that runs on most workstations. The framework offers a user-friendly interface for executing the basic steps of data analysis, allowing researchers to evaluate newly produced samples quickly and straightforwardly. It comprises a set of NGS data analysis workflows and pipelines in CWL format, a Python Django back end for data management, and a set of Jupyter notebooks as the user interface. Analysis reports with tables, figures, and plots are generated automatically from the data files, with the detail and resolution needed for scientific publication. We are in the process of finalizing this project for publication.