A flexible facility for genetic computation, including both ongoing software development and expansion of hardware resources, is essential for high-throughput sequencing operations to succeed. Because the technology is new and still evolving rapidly, new platforms are introduced two to three times per year, each requiring some adjustment of the data pipeline. Stable artifacts must be monitored and recorded in order to eliminate most false calls; annotated coding regions and splice junctions must be updated periodically; putative discrepancies that change coding sense must be distinguished from those that do not; and finally, all calls must be validated or excluded from consideration. Once authentic causative or incidental mutations are established, the data must be ported to a permanent repository without the introduction of error by human operators. This repository, built during the first period of funding, is Mutagenetix. A parallel repository will be established for Drosophila mutations. All of these tasks will fall within the purview of Core E. In addition, Core E will model mutagenesis to allow optimized use of ENU in somatic cell mutagenesis studies, to be carried out in Project 1 and in future projects. Core E will be supported by hardware that includes two large (~3,000-node) Linux cluster computers at the Texas Advanced Computing Center (TACC), a smaller dedicated cluster devoted solely to mapping short reads, and servers that support Mutagenetix (now accessed approximately 3,000 times per month by approximately 1,000 independent users worldwide). RELEVANCE: Core E will use massively parallel computing to model mutagenesis and to find mutations that either impair the host response or create resistance to viral infection. This will contribute to the advancement of human health by pointing to drug targets: proteins whose activity may be blocked or augmented to inhibit viral infections.