SUMMARY: Over 25,000 researchers in the US and over 50,000 in 120 other countries have used the PredictProtein (PP) Internet server to analyze proteins by homology transfer and by de novo predictions of protein structure and function. Here, we propose technical and scientific solutions that will improve the functionality of PP and its extension portal META-PP. Many technical changes will remain hidden from users and are required to increase the maintainability, scalability, and portability of these servers. New graphical user interfaces are one proposed solution that will visibly impact the service. The scientific solutions address two related tasks pertaining to the prediction of structure and function. The first is to predict the effect of mutations. We propose the development of novel machine learning-based methods to distinguish among mutations that affect structure, mutations that affect function, and mutations that have no apparent phenotype. Our final method will be applied to the screening of SNP data from our experimental colleagues at Columbia, as well as to the prediction of SNP effects in public databases. The second major task is the identification of natively unstructured regions and their functional classification. Proteins that do not adopt regular structures in isolation are an increasingly important research area; they may provide a key to the evolution of complexity from prokaryotes to eukaryotes. We propose the development of machine learning-based methods to identify features specific to this important class of molecules. We also plan to attack the problem from a very different angle by using predictions of interaction densities inside proteins. The resulting novel tools will allow a proteome-wide analysis of the role of these molecules. All methods will be made available through PP.
RELEVANCE: Information about protein structure adds an entire dimension to protein analysis and genome annotation.
This addition is often essential to infer function, even for natively unstructured proteins. The PredictProtein server is unique in its combination and exploitation of evolution, structure, and function; many thousands of theoretical, experimental, and clinical researchers have benefited from this. The long-term goal of the research proposed here is to improve our ability to use the evolutionary record of amino acid substitutions, i.e., to ultimately understand the amino acid "language". The short-term goal is to address two tasks that are closely related to human diseases, namely the distinction between silent and important mutations and the mapping of unstructured proteins onto networks and diseases.