Core B: Data Acquisition and Construction
Our projects' data requirements overlap extensively, and an Aim of this program project is to provide data to catalyze work on aging and innovation, and on person-based studies of innovation, in the broader research community. The data necessary to study these problems are currently scattered across sources and formats and have not been linked, posing a formidable barrier to research. The Data Acquisition and Construction Core will develop, maintain, and distribute a number of integrated, large-scale datasets and tools that will provide infrastructure for the project and will be made freely available, in a user-friendly form and with support, to the scholarly research community (including graduate students and researchers at non-profits and government agencies) in perpetuity. Generating this infrastructure centrally will ensure that it is fully integrated; minimize duplication of effort; ensure quality and uniformity; take greatest advantage of the expertise of program participants; and establish a common set of methods for all users. The availability of this data infrastructure and of established procedures will support a dynamic field devoted to aging and innovation and to person-based studies of innovation. A central component of our work will be the construction of a large-scale, disambiguated, individual-level, longitudinal database on biomedical researchers comprising: (1) publications, (2) patents, (3) grants, (4) citations, (5) biographic data, (6) research institution characteristics and quality rankings, and (7) journal quality.
We will also develop: (1) a longitudinal dataset on research areas, including research effort, drug approvals, and health outcomes, which can stand alone and will also be combined with the individual-level dataset; (2) a set of data extraction and manipulation tools that will facilitate the use of these datasets; (3) estimates of the health and economic impacts of biomedical research; and (4) metrics to identify high-impact and transformative research. The project draws together a team with complementary skills that is uniquely suited to perform this work, along with a sophisticated group of end-users who can refine the data, add complementary components, and maximize usability.
RELEVANCE: The US is increasingly emphasizing innovation, but the aging of our scientific workforce is expected to reduce innovative output. This Core will develop the data infrastructure to support both our work and future work that will provide policy-relevant information about how the aging of our scientific workforce will affect our biomedical innovative output, about the associated health and economic consequences, and about policy responses.