Of the >50 million protein sequences now in public databases, only a tiny proportion have been experimentally characterized, so molecular function must be assigned almost exclusively by computational methods. Many enzymes can be classified as members of functionally diverse superfamilies (SFs): proteins that descend from a common ancestor but have diverged to catalyze many different chemical reactions, sometimes using highly dissimilar substrates. Because all members of each SF share a subset of active site residues and so look alike in that respect, prediction of their molecular functions is especially difficult and plagued by high levels of misannotation. Moreover, these SFs contain thousands of proteins, challenging our ability to manage the data and information about them, or even to determine which proteins, once experimentally characterized, would best support functional annotation of, or mechanistic insight into, others of unknown function. The overall goals are to enhance understanding of SF structure-function relationships in order to improve computational annotation of many enzymes, to inform experimental design of mechanistic studies and enzyme engineering efforts important for human health, and to achieve a better-informed theory of how nature re-uses ancestral structural templates to evolve many new enzymatic reactions. A major outcome will be expanded computational characterization of the universe of functionally diverse SFs. The aims are: 1. Create innovative approaches and tools to enhance the protein similarity network technology we pioneered to summarize sequence/structure/function relationships in enzyme SFs and to enable their facile and visually interactive exploration at many levels of detail. Major challenges in the use and interpretation of similarity networks will be addressed to establish them as a major tool of genomic enzymology.
To achieve this, we will create a new approach to ensure homogeneous similarity signals across similarity networks; address the complications introduced by the complex domain architectures of some SFs, which will help avoid misinterpretation of their networks; devise a mechanism for mapping relevant functional features to networks to support efficient visual reasoning; and develop tools to address technical challenges in network generation. 2. Apply network technology to infer functional boundaries in enzyme SFs based on active site variation. By incorporating sequences from metagenomic projects into our SF networks, infer their functional properties at a level of detail not yet achieved by current curation efforts. 3. Collaborate with experts working on SFs that pose especially relevant challenges for the development and application of network technology, so that we can learn how best to optimize and deploy it for functional inference for unknowns, identification of new drug targets, and guidance of protein engineering. All results, including similarity networks, alignments, and other data, will be disseminated through our Structure-Function Linkage Database and served via interactive and other analysis tools.
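
As an illustration of the similarity network idea that underlies the aims, the following is a minimal sketch, not the project's actual implementation. It assumes that each pair of sequences has a single similarity score where higher means more similar (for example, a negative log-transformed alignment E-value), that an edge is drawn when the score meets a chosen threshold, and that connected components are read as candidate functional groups. The function names, the toy sequence labels, and the scores are all hypothetical.

```python
from collections import defaultdict


def build_network(scores, threshold):
    """Build an undirected network from pairwise similarity scores.

    scores: dict mapping (seq_a, seq_b) -> similarity score (assumed: higher = more similar).
    An edge is kept only when the score is at or above the threshold.
    Every sequence appears as a node, even if it ends up with no edges.
    """
    adjacency = defaultdict(set)
    for (a, b), score in scores.items():
        adjacency[a]  # touch the keys so isolated sequences remain as nodes
        adjacency[b]
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def clusters(adjacency):
    """Return connected components as sorted lists, found by depth-first search."""
    seen = set()
    groups = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups


# Hypothetical toy data: five sequences with made-up pairwise scores.
TOY_SCORES = {
    ("A", "B"): 80,
    ("B", "C"): 75,
    ("A", "C"): 30,
    ("C", "D"): 20,
    ("D", "E"): 90,
}

if __name__ == "__main__":
    for threshold in (10, 50, 85):
        print(threshold, clusters(build_network(TOY_SCORES, threshold)))
```

In this sketch, raising the threshold splits one large cluster into smaller ones. This is why the choice of threshold, and the consistency of the similarity signal across a network, matter for interpreting such networks.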