Understanding enzyme function is central to many public health issues, including identifying new drug targets for infectious diseases, understanding how mutations in enzyme-encoding genes lead to disease, and identifying enzymes involved in producing natural products of interest. One limitation of genome-wide analyses (e.g., microarrays) and of many biological databases is that they depend on the correct annotation of protein function. In fact, whole systems have been designed to analyze microarray data based mainly on predicted protein function, and although gene annotation has improved, annotation errors still exist. Efforts to improve gene annotation will improve the quality and efficiency of research in the biomedical sciences. This research will generate an ontology to describe how enzymes interact with their substrates and other small molecules (e.g., allosteric regulators). A database will store the functionally important amino acids of specific enzymes and, using the created ontology, record how those amino acids are involved in substrate binding and other types of interactions. The objective of the database is to allow the function of every known important amino acid in an enzyme to be described in a detailed but organized manner. The database will support several kinds of queries: (1) Given a protein sequence with a proposed function, determine which of the amino acids known to be critical for that function are conserved. (2) Identify all of the enzymes that use a specific amino acid to interact with a specific part of a given molecule (e.g., all of the enzymes that use a lysine to interact with the phosphates of ATP). (3) Using basic statistical approaches, answer questions such as "Do a significant number of enzymes that bind ATP use a lysine to interact with the ATP?"
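As a rough illustration of the three query types, the following Python sketch uses a small in-memory list in place of the database. Everything in it is hypothetical: the `Annotation` fields, the enzyme names, the residue positions, and the background probability passed to the statistical test are invented for illustration and do not reflect the actual ontology or schema, which this abstract does not specify.

```python
from dataclasses import dataclass
from math import comb


@dataclass(frozen=True)
class Annotation:
    """One functionally important residue (hypothetical schema)."""
    enzyme: str
    position: int      # 1-based position in the enzyme's reference sequence
    residue: str       # one-letter amino acid code
    ligand: str        # small molecule the residue interacts with
    ligand_part: str   # part of that molecule, e.g. "phosphate"


# Invented example records; real entries would come from curated literature.
ANNOTATIONS = [
    Annotation("kinase_A", 5, "K", "ATP", "phosphate"),
    Annotation("kinase_A", 9, "D", "ATP", "ribose"),
    Annotation("kinase_B", 3, "K", "ATP", "phosphate"),
    Annotation("ligase_C", 7, "R", "ATP", "phosphate"),
]


def conserved_residues(enzyme, aligned_query):
    """Query 1: for each annotated residue of `enzyme`, report whether the
    query sequence has the same amino acid at that position. Assumes the
    query is already aligned position-for-position with the reference."""
    return {
        a.position: (
            a.position <= len(aligned_query)
            and aligned_query[a.position - 1] == a.residue
        )
        for a in ANNOTATIONS
        if a.enzyme == enzyme
    }


def enzymes_using(residue, ligand, ligand_part):
    """Query 2: enzymes that use `residue` to contact `ligand_part` of `ligand`."""
    return sorted({
        a.enzyme
        for a in ANNOTATIONS
        if (a.residue, a.ligand, a.ligand_part) == (residue, ligand, ligand_part)
    })


def binomial_upper_tail(k, n, p):
    """Query 3 helper: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Query 3: of the enzymes that bind ATP, how many use a lysine to do so?
atp_binders = {a.enzyme for a in ANNOTATIONS if a.ligand == "ATP"}
lysine_users = {a.enzyme for a in ANNOTATIONS
                if a.ligand == "ATP" and a.residue == "K"}
# 0.06 is a placeholder null probability, not a measured value.
p_value = binomial_upper_tail(len(lysine_users), len(atp_binders), 0.06)
```

For example, `conserved_residues("kinase_A", "MSTARLLGDE")` would flag position 5 as not conserved, because the query has an arginine where the annotation expects a lysine. The one-sided binomial test is only one simple choice of "basic statistical approach"; a real analysis would need a carefully justified null model and a far larger data set.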
These types of queries have been used to mine data from gene ontologies and have helped to revolutionize biology. The proposed research will apply these powerful principles to biochemistry and enzymology for the first time. In addition, the database system will allow the broader biomedical research community to easily identify critical amino acids in enzymes of interest and understand their importance. PUBLIC HEALTH RELEVANCE: A biochemical understanding of how enzymes function is a key component of public health, because mutations in enzyme-encoding genes can cause disease and enzymes are frequently drug targets. Creation of the Functional amino Acid Navigator (FAN) database will allow complex queries about how enzymes function and, more generally, will save researchers time by collecting information from decades of enzyme research and storing it in an easily queried format.