Cardiovascular disease is the leading cause of death in the United States and other Western countries. Many factors contribute to the risk of cardiovascular disease, and both genetic and environmental factors have been implicated. Many key physiological studies have been performed on multiple rat strains to dissect the mechanisms of cardiovascular disease, including hypertension, myocardial infarction, and peripheral vascular disease. However, the value of these data sets is limited by the inability to integrate the phenotype results with those of similar studies performed on different rat strains or under different environmental conditions. The goal of this proposal is to provide a novel and powerful approach to integrating three large phenotype data sets so that the scientific community can access, visualize, and analyze the data and link physiological traits to the genome. To achieve the integration of these phenotype data sets, we propose three aims: 1. Develop new and adapt existing ontologies for data integration. Using cardiovascular phenotype data generated from three major rat projects (SCOR-Molecular Genetics of Hypertension, the PhysGen Program for Genomic Applications, and the National BioResource Project for the Rat in Japan), ontologies will be developed for the four major experimental parameters common to rat phenotype data: 1) clinical measurement; 2) assay type; 3) experimental conditions; 4) sample rat strain. Emphasis will be on areas related to cardiovascular phenotypes, with overall structures created to allow expansion to other phenotype areas and other model organisms. 2. Develop data structures for integration of data and data repository. A database will be created with appropriate tables and fields for storing data related to each study as well as the four experimental parameters related to phenotype and the actual phenotype values. 
Data in the three existing data sets will be mapped to the appropriate fields and ontologies and loaded into the data repository. 3. Provide public access to the ontologies and the integrated data set. Access to the integrated data set will be provided at the Rat Genome Database (RGD) with appropriate data mining, display, and download tools for users. The ontologies will be made available in OBO format through the RGD FTP site and will be submitted to the National Center for Biomedical Ontology for availability through its BioPortal. Educational tools, including online video and text tutorials, will be developed. Evaluation procedures will include approaches to obtain and integrate input from the scientific community. Together, these ontologies, the data structure and repository, and the public access portal will tie together the essential phenotype data needed to continue analysis of the genomic basis of cardiovascular diseases. PUBLIC HEALTH RELEVANCE: Cardiovascular disease is the leading cause of death in the United States and other Western countries. Combining physiological function with genomic data will allow researchers to more rapidly identify genes involved in cardiovascular disease. The overall goal of this project is to develop ontologies and standardized data formats for physiological data to enable integration of large physiological data sets and to link this integrated information with existing genomic resources, thus advancing our ability to elucidate the genetic basis of disease.
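
To illustrate the kind of record the proposed repository might hold, the following is a minimal sketch, not the project's actual schema. It assumes each phenotype value is annotated with its source study and the four experimental parameters named above (clinical measurement, assay type, experimental condition, strain). The class name, field names, study labels, strain labels, and numeric values are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class PhenotypeRecord:
    """One phenotype value annotated with the four experimental parameters.

    Hypothetical structure; in practice each label would be an ontology term.
    """
    study: str                   # source project (placeholder label)
    clinical_measurement: str    # what was measured
    assay_type: str              # how it was measured
    experimental_condition: str  # under what condition
    strain: str                  # which rat strain (placeholder label)
    value: float
    unit: str


# Placeholder records standing in for data mapped from different studies.
RECORDS = [
    PhenotypeRecord("study_1", "systolic blood pressure", "tail cuff",
                    "control", "STRAIN_A", 120.0, "mmHg"),
    PhenotypeRecord("study_2", "systolic blood pressure", "telemetry",
                    "control", "STRAIN_A", 130.0, "mmHg"),
    PhenotypeRecord("study_2", "systolic blood pressure", "telemetry",
                    "control", "STRAIN_B", 140.0, "mmHg"),
    PhenotypeRecord("study_3", "systolic blood pressure", "telemetry",
                    "high-salt diet", "STRAIN_A", 160.0, "mmHg"),
]


def mean_by_strain(records, clinical_measurement, experimental_condition):
    """Average a measurement per strain across studies, for one condition."""
    grouped = {}
    for record in records:
        if (record.clinical_measurement == clinical_measurement
                and record.experimental_condition == experimental_condition):
            grouped.setdefault(record.strain, []).append(record.value)
    return {strain: mean(values) for strain, values in grouped.items()}


if __name__ == "__main__":
    print(mean_by_strain(RECORDS, "systolic blood pressure", "control"))
```

Because every record shares the same controlled vocabulary for its four parameters, values from different studies can be pooled or compared with a simple filter; this sketch ignores assay type and unit conversion, which a real integration would need to handle.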