Bioinformatics infrastructural activities are crucial to modern biological research. Complete and up-to-date databases of biological knowledge are vital for increasingly information-dependent biological and biotechnological research. With the recent accumulation of genome sequences for many organisms, most notably the draft human sequence, attention has turned to the identification and function of proteins encoded by these genomes. In the Universal Protein Resource (UniProt) project, funded by the NIH, major European and American protein sequence databases have joined forces and developed a central resource for protein sequences and functions, providing a cornerstone for a wide range of scientists active in modern biological research, especially in the field of proteomics. The broad, long-term objectives of this project are to provide, with the Universal Protein Resource, a stable and comprehensive resource for information on proteins, their sequences and their functions; to enable scientists to use UniProt to identify and analyze genes and their products and to make queries across databases containing complementary information; and to provide efficient and unencumbered access to the databases produced by the UniProt Consortium.
The specific aims are to maintain and further develop the UniProt Knowledgebase (UniProtKB) as the central database of curated protein sequences with annotations of sequence and functional information; to maintain and further develop the UniProt Archive (UniParc) and create the UniProtKB entry history server to ensure comprehensive coverage of all protein sequences and their annotation history; to maintain and further develop the UniProt Reference Clusters (UniRef) to provide complete coverage of sequence space while hiding redundant sequences (but not their descriptions) from view; to facilitate the use of these databases by providing user-friendly interfaces, tools for simple and complex queries and for retrieval of large datasets, downloadable database records in a defined, parsable format, and user support services; and to provide the flexibility and adaptability needed to be responsive to the changing needs of the scientific community. The databases produced by the UniProt Consortium will facilitate the development of preventive and curative strategies for health maintenance by allowing researchers to integrate the enormous amount of data from the Human Genome Project and other genome projects, as well as from structural and functional genomics and proteomics projects, to understand the genetic and biological mechanisms causing human disease.