Project Summary

Advances in genomics and data analytics create new opportunities for accurate risk prediction and personalized medical treatment, even for rare cancers, via large-scale data federation across institutions. Yet cancer research is often stymied by a lack of appropriate tools for streamlining the transfer and sharing of clinical patient data. Globus services permit secure data transfer, synchronization, and sharing in distributed environments at large scale. We propose to extend these services so that they can be used securely with protected human data. The extended services will allow federation of clinical patient data in support of accurate cancer risk prediction, personalized treatment, and any other area of cancer research. Globus is widely used: it has over 15,000 users and more than 8,000 accessible storage systems, including at most leading US universities and many sites overseas, and it has transferred more than 165 petabytes and 25 billion files. Adoption of Globus by biomedical researchers has been rapid and is accelerating. Biomedical researchers at ~30 universities, government agencies, and sequencing centers have relied on Globus for streamlined data transfer and sharing. Our "Globus Genomics" (GG) integrated Galaxy-Globus-cloud genomics analysis system has been used by more than 300 researchers at over 25 institutions, across multiple biomedical research domains including cancer, to analyze over 10,000 samples. We will develop a HIPAA Enablement Toolkit that will enable Globus and other software-as-a-service providers (including GG) to manage protected data securely (Aim 1.1). We will extend Globus security features by implementing file name encryption and by encrypting data with user-supplied keys, and we will demonstrate that GG and other services can use these new features to enable elastic, secure, high-performance cancer genomics data analysis (Aim 1.2).
We will integrate Globus with major cloud platforms by developing uniform storage system interfaces (Aim 2.1), engineering high-speed transfers (Aim 2.2), and implementing search, replication, and synchronization (Aim 2.3) on AWS, Google, Microsoft, and OpenStack-based clouds, so that cancer researchers can transfer and share data securely and easily among these and other (e.g., local) computing and storage platforms. The resulting tools will be applicable to any cancer type across the cancer research spectrum. We will validate and disseminate these new technologies first within existing and emerging breast (Aim 3.1), blood (Aim 3.2), and pancreatic (Aim 3.3) cancer research networks, and then more broadly with collaborators across the cancer research continuum (Aim 3.4). We will work closely with collaborators and users to ensure that we meet the needs of a broad cross-section of the cancer research community that requires transfer, sharing, and analysis of large human data sets. We will disseminate our technologies widely through extensive community outreach across multiple channels.