ABSTRACT: BIOINFORMATICS CORE

Current genetic and genomic technologies produce large volumes of data, and distinguishing relevant from irrelevant genomic variants remains challenging. There is a need for new, widely applicable informatics methods that can integrate and interpret genome-scale information in the context of functional networks, thus providing insight into the specific molecular processes affected by the mutations that drive human disease. This application is based on the hypotheses that [A] monogenic etiologies are responsible for congenital diaphragmatic hernia (CDH) segregating in families, with varying degrees of penetrance, [B] de novo mutations with large effect sizes are responsible for a fraction of sporadic, mostly complex, CDH cases, and [C] rare risk variants contributing to CDH can be discovered in genetic data from singletons. Statistical genetics will inform our discovery of causative variants, for example through burden tests that compare de novo variants in CDH cases against a large sequenced control group: the unaffected siblings of children with sporadic autism, whose data are publicly available through the Simons Simplex Collection (SSC). For this purpose, we have adapted analysis pipelines to identify and annotate variants, incorporating appropriate bioinformatics tools. Innovative network analyses based on protein-protein interaction (PPI) and gene co-expression are an essential complement both to genetic studies of variants in humans with rare diseases such as CDH (Project I) and to the optimal selection of those variants for analysis in model organisms (Projects II and III). Here we detail some of the specific tools, methods, and approaches the core will provide to interrogate the large data sets that will be collected over the course of the project.
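As an illustration of the case-versus-control burden comparison mentioned above, the following is a minimal sketch, not the core's actual pipeline. It assumes the comparison is reduced to counting individuals who carry at least one qualifying de novo variant in cases and in controls, and applies a one-sided Fisher exact test to those counts. The function name `fisher_one_sided` and all counts in the usage lines are hypothetical; published de novo burden analyses often use mutation-rate (Poisson) models instead.

```python
from math import comb


def fisher_one_sided(case_hits, case_total, control_hits, control_total):
    """One-sided Fisher exact p-value for enrichment of carriers among cases.

    Returns the hypergeometric probability of seeing at least `case_hits`
    carriers among the cases, given the fixed row and column totals.
    """
    total = case_total + control_total      # all sequenced individuals
    hits = case_hits + control_hits         # all carriers, cases plus controls
    denom = comb(total, case_total)         # ways to choose which individuals are cases
    upper = min(hits, case_total)           # largest possible carrier count in cases
    numer = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return numer / denom


# Hypothetical counts, for illustration only (not data from this project):
# 12 carriers among 300 cases versus 20 carriers among 1800 controls.
p_value = fisher_one_sided(12, 300, 20, 1800)
print(f"one-sided p-value: {p_value:.3g}")
```

The exact test is used here because carrier counts for rare variants are typically small, where asymptotic approximations can be unreliable; it is one reasonable choice among several.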