PROJECT SUMMARY
In this dawning era of Big Data, it is vital to recruit and train the next generation of biomedical data scientists. The collection of Big Data in the biomedical sciences is growing rapidly and has the potential to address many of today's pressing medical needs, including personalized medicine, eradication of disease, and curing cancer. Realizing the benefits of Big Data will require a new generation of leaders in (bio)statistical and computational methods who can develop the approaches and tools necessary to unlock the information contained in large heterogeneous datasets. There is a great need for scientists trained in this specialized, highly heterogeneous, and interdisciplinary new field of health Big Data. Thus, the recruitment of talented undergraduates in science, technology, engineering and mathematics (STEM) programs is vital to our ability to tap into the potential that Big Data offers and to meet the challenges that it presents. The University of Michigan Undergraduate Summer Institute: Transforming Analytical Learning in the Era of Big Data will primarily draw on the expertise and experience of faculty from three departments in three schools at the University of Michigan: Biostatistics in the School of Public Health, Computer Science in the School of Engineering, and Statistics in the College of Literature, Science, and the Arts. The faculty instructors and mentors have backgrounds in Statistics, Computer Science, Information Science, Medicine, Population Health, and the Social and Biological Sciences.
They have active research programs in a broad spectrum of methodological areas, including statistical modeling, data mining, natural language processing, statistical and machine learning, large-scale optimization, matrix computation, medical computing, health informatics, high-dimensional statistics, distributed computing, missing data, causal inference, data management and integration, signal processing and medical imaging. The diseases and conditions they study include obesity, diabetes, cardiovascular disease, cancer, neurological disease, kidney disease, injury, macular degeneration and Alzheimer's disease. The areas of biology include neuroscience, genetics, genomics, metabolomics, epigenetics and socio-behavioral science. The undergraduate trainees selected will have strong quantitative skills and a background in STEM. The summer institute will combine coursework, designed to raise the skills and interests of the participants to a level sufficient to consider pursuing graduate studies in Big Data science, with an in-depth mentoring component that will allow the participants to research a specific topic or project utilizing Big Data. We have witnessed tremendous enthusiasm and success with the current summer program on Big Data led by this team, with 164 students trained in the last 4 years (2015-2018), including 90 female students and 30 students from underrepresented minority groups. Fourteen of the participants from the last three years are currently graduate students in Michigan Biostatistics. The ongoing program has gained traction in the national landscape of summer research programs, with a 20% rate of admission and an 80% rate of acceptance among those who are offered this opportunity. The program has consistently received very strong evaluations, and our alumni have become brand ambassadors and advocates for our program.
We plan to build on the success and legacy of this program in the next three-year funding cycle of this grant (2019-2021). The overarching goal of our summer institute in Big Data is to recruit and train the next generation of Big Data scientists using a non-traditional, action-based learning paradigm. This six-week summer institute will recruit a group of approximately 45 undergraduates nationally and internationally, with 20 domestic students supported by the requested SIBS funding mechanism and the others supported by supplementary institutional and foundation support. We propose to expose the trainees to diverse techniques, skills and problems in the field of health Big Data. They will be taught and mentored by a team of interdisciplinary faculty, reflecting the shared intellectual landscape needed for Big Data research. They will engage in mentored research projects in three primary areas of health Big Data: Electronic Health Records/Medical Claims, Genomics and Imaging. Some of the projects will be in the area of cardiovascular precision medicine and will be defined by a team of highly quantitative researchers engaged in cardiovascular research that uses Big Data. The program will conclude with a capstone symposium showcasing the research of the students via poster and oral presentations. There will be lectures by U-M researchers and outside guests, and a professional development workshop to prepare the students for graduate school. We propose an inter-SIBS collaboration with Dordt College summer program trainees, who will attend this concluding symposium. The resources developed for the summer institute, including lectures, assignments, projects, template code and datasets, will be freely available through a wiki page so that this format can be replicated anywhere in the world. This democratic dissemination plan will give undergraduate students across the world access to teaching and training materials in this new field.
We will offer our trainees multiple professional development opportunities and resources for graduate school preparation so that they can reflect on and plan for their path beyond their senior year. All of our proposed activities are organized under our three specific aims: Teaching, Mentoring and Dissemination.