This proposal will provide information, new algorithms, and computational tools for predicting proteolytic events. The ultimate goal is to make accurate proteome-wide predictions of the substrates of any given protease. Our current effort, however, will focus mainly on matrix metalloproteases (MMPs), caspases, and several proprotein convertases (PCs) belonging to the serine protease family, because a large body of experimental information on these proteases is already available at the Sanford-Burnham Medical Research Institute. The approach can be readily extended to any other protease once a statistically significant number of its substrates becomes available for deriving a specificity profile. The unique feature of the proposed prediction method is that it combines sequence-based predictions with other factors: structural features of the substrates, cooperative interactions, and co-localization and co-expression of substrates and proteases. We will also include information about single nucleotide polymorphisms (SNPs) and post-translational modifications (PTMs) affecting residues in the vicinity of the cleavage sites in protein substrates. Both can alter a proteolytic event, either by abolishing cleavage or by creating a new potential cleavage site, and such changes can lead to diseases or syndromes. The proteolytic events, i.e., protease-substrate pairs, will be mapped onto known regulatory networks. All the information collected and all the tools developed will be freely available on the PMAP Web site (www.proteolysis.org) for use by the biomedical research community. Because proteases usually have more than a dozen substrates, and because the substrates often differ between normal physiology and pathology, the impact of this project could be immense. Rather than identifying protease substrates one by one, our predictions will produce well-annotated sets of substrates that are likely to have biological significance.
PUBLIC HEALTH RELEVANCE: Proteolysis is a biological process involving hydrolysis of the peptide bonds in proteins. We propose to design a computational approach for predicting the substrates of proteases in the human proteome that takes into account accurate amino acid sequence specificity together with structural and biological factors. This computational approach will help detect aberrations in the processing, regulation, and degradation of proteins that lead to diseases or syndromes.
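
ILLUSTRATIVE SKETCH: The idea of deriving a specificity profile from known cleavage sites and then combining the sequence-based score with other factors can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the method or code of this project: the function names (build_pssm, scan, combined_score), the uniform background frequencies, the pseudocount, the simple weighted sum, and the weights are all hypothetical, and the aligned windows are made-up toy peptides, not data from PMAP.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(sites, pseudocount=1.0):
    """Build a log-odds specificity profile from aligned cleavage-site windows.

    Assumes equal-length windows and a uniform amino acid background
    (a simplification chosen for this sketch).
    """
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all cleavage-site windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(sites) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(width):
        counts = Counter(s[pos] for s in sites)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm

def score_window(pssm, window):
    """Sum per-position log-odds; unknown residues contribute 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))

def scan(pssm, sequence):
    """Score every window of the profile's width along a protein sequence."""
    width = len(pssm)
    return [
        (i, score_window(pssm, sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    ]

def combined_score(seq_score, accessibility, colocalized,
                   w_seq=1.0, w_acc=1.0, w_loc=1.0):
    """Hypothetical weighted sum of the sequence score and two context factors.

    accessibility: a number in [0, 1] standing in for structural exposure.
    colocalized: True if protease and substrate share a compartment.
    """
    return (w_seq * seq_score
            + w_acc * accessibility
            + w_loc * (1.0 if colocalized else 0.0))

# Toy usage: made-up aligned windows and a made-up sequence.
toy_sites = ["DEVDG", "DEVDS", "DQVDG", "DEVDA"]
toy_pssm = build_pssm(toy_sites)
toy_hits = scan(toy_pssm, "MKTADEVDGLLPQ")
best_position, best_score = max(toy_hits, key=lambda hit: hit[1])
print(best_position, round(best_score, 2))
```

In practice the weights would have to be fitted to experimentally validated substrates, and the contextual factors would come from structural, localization, and expression data rather than from hand-set values as in this sketch.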