In our rapidly evolving information era, methods for handling large quantities of data obtained in biomedical research have emerged as powerful tools for confronting critical research questions, with significant impacts in diverse domains ranging from genomics to health informatics to environmental research. The NIH's Big Data to Knowledge (BD2K) Training Consortium is expected to empower current and future generations of researchers with a comprehensive understanding of the data science ecosystem: the ability to explore, prepare, analyze, visualize, and interpret Big Data. To these ends, we propose a novel Training Coordinating Center (TCC) to coordinate the diverse activities occurring within the BD2K Training Consortium into a synergistic training effort. The TCC will create an inclusive and collaborative virtual environment, entitled Big Data U, serving trainees from a wide spectrum of educational backgrounds and scientific domains. Big Data U will make personalized educational resources easily accessible and facilitate novel research collaborations through scientific rotations. We will harvest the web to automatically identify, model, and incorporate online resources into an Educational Resource Discovery Index (ERuDIte) and a Big Data U Knowledge Map. This unique system will alleviate the burden of sifting through hundreds of educational resources and searching across multiple research and training program websites, allowing users to easily determine which resources are didactically significant and correspond to the appropriate scientific domain of interest, level of education, and learning objective. Over the long term, our efforts will cultivate a diverse network of data scientists who can propagate their knowledge and experience for generations to come. Our PI and team have a demonstrated commitment to training in biomedical data science.
The University of Southern California is ideally suited to host this NIH BD2K effort, with a strong history of data science training and two recently founded master's programs relevant to Big Data biomedicine. The TCC is the logical extension of our outstanding track record in data science, and we will leverage our comprehensive experience and infrastructure in developing it.