NIH S10 equipment proposal: From genomics to natural language processing: A protected environment for research computing in the health sciences. Health sciences researchers are often required to manage, mine, and analyze restricted patient data (Protected Health Information, PHI) to facilitate and advance their research aims. They are often required to do this without access to central information technology expertise or resources to facilitate their research aims. These researchers are often left to their own devices to ?solve? their research compute and data needs and are challenged due to lack of available resources, barriers from central IT, and/or lack of knowledge of available resources. A further challenge is that ?small? data sets? data that researchers could formerly handle on office resources?have morphed and grown into the big data domain through the explosion of technical advances and significant expansion in various research directions. Examples include: genomics research, image analysis, simulation, natural language processing, and mining of EMRs. Therefore, the need exists to develop a framework for managing and processing this data securely and reliably. This S10 equipment proposal is to replace the ?protected environment? (PE) prototype the University of Utah?s Center for High Performance Computing (CHPC) and Department of Biomedical Informatics built six years ago and has operated since. The PE consists of both high performance computing and virtual machine (VM) components and associated storage sufficient to manage, protect and analyze HIPAA protected health information. This environment has been very successful and has grown significantly in scope. CHPC isolated this protected environment in the secured University of Utah Downtown Data Center and setup a network protected logical partition that provided research groups specific access to individual data sets. As the environment and technology developed, CHPC added additional security features such as two-factor authentication for entry and audit/monitoring. Unfortunately, the prototype has reached the point where demand is surpassing capability and all the hardware is aged and off-warranty. To give an idea of users of the virtual machine farm component, the Biomedical Informatics Core (BMIC) REDCap (Research Electronic Data Capture) environment for data collection has over 2,500 users in 1,500 projects supporting over $25M in NIH funding at the University of Utah, including support for more than 25 active NIH R-01 grants. Moreover, the HIPAA compliant protected environment was a key factor that aided passing the recent University of Utah HIPAA audit. The ?protected environment? also helped the University of Utah Health Sciences Center and the BMIC justify the NCATS Center Clinical and Translational Science award (1ULTR001067).