The explosion of biomedical big data (e.g., imaging, clinical records, and omic analyses) that captures multiple levels of complexity has the potential to dramatically accelerate the translation of knowledge from bench to bedside. However, the effective use of these data requires skills in computer science, statistics, and bioinformatics, as well as detailed knowledge of biology and medicine to aid in interpreting the results of the data analysis. Unfortunately, biomedical researchers are not trained in the computational and statistical methods needed to handle high-density biomedical big data. As a result, many biomedical scientists are frustrated by their inability to: (a) analyze big data, (b) utilize the valuable public resources containing big data, and (c) communicate effectively with computer scientists, statisticians, and bioinformaticians. These barriers have significantly hampered the translational application of the large body of big data that has accumulated thus far. To overcome these challenges, this team proposes to create a summer training course that is built upon case studies and is specifically designed for biomedical researchers who are novices in big data analysis. The investigators identified the need for this course in a survey of administrators and researchers at Midwest and Big Ten universities. This course will raise awareness of the potential uses of biomedical big data and will develop skills for locating, accessing, managing, visualizing, analyzing, and integrating various types of publicly available big data.
The proposed big data training program has three goals: (1) introduce the fundamental concepts of big data in biomedical research to raise awareness of the value of this research approach, (2) provide face-to-face instruction that develops the technical competency needed for big data science, and (3) develop educational and data analysis resources using the HUBzero platform to support our face-to-face instruction and provide post-instruction opportunities for reinforcing and expanding technical skills. The course will exploit available big data resources and tools so that biologists can productively explore big data within a short time. The educational program will target graduate students, postdoctoral trainees, physician-scientists, and biomedical scientists who have strong biomedical backgrounds but limited advanced coursework in statistics, bioinformatics, and computer science. This course will be centered at Purdue University, a large public university with recognized strengths in statistics and computer science, with the goal of serving scientists in the Midwest. In addition, the HUBzero platform, a unique technology developed at Purdue, will be used to house computational tools, deliver the educational program, and lower the technical barriers that challenge participants. This approach will complement the classical curricula in biomedical training programs and serve as a foundation for more advanced training. The proposed course is directly responsive to RFA-HG-14-008 because it will enable biomedical researchers to more confidently explore existing biomedical big data, implement their own data collection and analysis plans, and communicate within research teams.