Core G: Biostatistics and Data Science

ABSTRACT

The Biostatistics and Data Science Core (Core G) plays a critical role within the Penn CFAR by ensuring that studies carried out by CFAR investigators employ optimal statistical approaches that result in scientifically rigorous, high-quality, reproducible, impactful discoveries. To do this, Core G provides (a) high-quality expertise in the statistical design and analysis of research studies; (b) support for database development and management critical for CFAR investigations; (c) leadership in evaluating, developing, and disseminating optimal statistical methods for HIV-related research; (d) education, training, and mentorship in statistical methodology to ensure the highest methodological standards; (e) capacity building with the CFAR's international partners in Botswana; and (f) a central nexus for engaging the broad epidemiology, statistics, and informatics faculty in high-priority HIV/AIDS research. Core G is led by Drs. Susan Ellenberg (Director) and Pamela Shaw (Co-Director), along with Dr. Alisa Stephens-Shields (Core Investigator) and Mr. Chris Helker, who oversees data management activities. Core G provides statistical and data management input into studies being developed for funding applications by CFAR investigators and by postdoctoral researchers and students working in their labs, and frequently assumes important leadership roles in funded projects. The Core supports project-specific databases as well as the CFAR Clinical Core Adult/Adolescent Database of >3000 subjects and the linked specimen repository. Mentoring, over and above advice on specific projects, is a major focus: Core G presents seminars on statistical methods in clinical and laboratory research and provides design and analysis support for Developmental Core Pilot Grants.
International capacity-building activities of Core G include leading an initiative, together with University of Botswana (UB) colleagues, to develop a biostatistics curriculum at UB, and extensive involvement in several additional Botswana-based training programs. Looking ahead, Core G will continue its highly successful current activities. In addition, it will work with the Administrative and Clinical Cores to lead the development of expanded CFAR collaborations with the Philadelphia Department of Public Health, increase outreach to engage non-AIDS investigators in the Statistics and Epidemiology communities, and engage the expertise of the newly established and rapidly growing Penn Institute for Biomedical Informatics to expand the use of natural language processing and other emerging informatics technologies by CFAR investigators.