Project Summary

The proposed project responds to the U01 Funding Opportunity Announcement (FOA) for the Big Data to Knowledge (BD2K) Development of Software Tools and Methods for Biomedical Big Data in the topic area of applying metadata. The overall goal is to design, develop, and evaluate an integrated platform for clinical research metadata standardization that leverages both standards-based representation and scalable Semantic Web technologies. The ultimate goal is to advance clinical research data discovery and analytic capabilities for clinical and translational centers and investigators. Clinical and translational research studies increasingly involve the manipulation of large datasets (e.g., patient records and genomic profiles) and the application of complex methods. To derive clinically relevant conclusions from such large datasets, the clinical and translational research community faces significant data integration challenges related to scalability, interactivity, representation standards, sustainability, and robustness. Failure to address these challenges will significantly impede downstream data reuse, sharing, and analysis across the broader scientific community. Detailed Clinical Models (DCMs) are regarded as the basis for retaining computable meaning when data are exchanged between heterogeneous clinical systems. Among the emerging national and international initiatives to standardize DCM modeling are the Clinical Informatics Modeling Initiative (CIMI) and HL7 Fast Healthcare Interoperability Resources (FHIR). FHIR is an emerging HL7 standard; it leverages existing logical and theoretical models to provide a consistent, easy-to-implement, and rigorous mechanism for exchanging data between healthcare applications. However, the tooling currently available for using HL7 FHIR as a global data model to standardize clinical research metadata is very limited.
Such metadata include the data dictionaries associated with clinical research datasets and the variety of underlying data models used in existing integrated data repositories (IDRs) such as Informatics for Integrating Biology and the Bedside (i2b2). The proposed project leverages emerging Semantic Web technologies to provide a scalable, standards-based framework that enables effective and efficient big data integration and semantic sharing. It builds on semantic metadata software and infrastructure developed in our previous projects, including an NIH U24 bioCADDIE (biomedical and healthCAre Data Discovery Index Ecosystem) pilot project (PI: Jiang) that investigates the feasibility of indexing clinical research datasets using HL7 FHIR, and an NCI U01 supplement (PI: Jiang) that creates an open-source IDR (e.g., i2b2) with FHIR-based cancer data services for cancer research. The objective of the proposed project is to consolidate, develop, and evaluate methods and tools for standardizing clinical research metadata and data models using HL7 FHIR. Our specific aims are: 1) Consolidate our bioCADDIE tools for indexing clinical research metadata using HL7 FHIR; 2) Create methods and tools for integrating the i2b2 clinical data repository with HL7 FHIR; 3) Deploy an integrated web portal for community-based metadata harmonization and tool dissemination. The proposed project will produce a suite of methods and tools for clinical research metadata standardization using HL7 FHIR and will facilitate secondary use of clinical research data and applications, ultimately advancing clinical and translational data discovery and analytics.