Project Summary/Abstract

The goal of the proposed project is to develop core algorithms, techniques, and software libraries that enable scalable, efficient, parallel computing solutions for mass spectrometry (MS) based high-throughput proteomics data sets. To empower the larger proteomics community and experimental biologists, the project seeks to 1) identify a set of core methods that are frequently used by proteomics practitioners; 2) develop efficient and scalable parallel algorithms and implementations for these methods; 3) map these parallel computing techniques to a wide variety of architectures, such as multicores, manycores, distributed clusters, GPUs, and FPGAs; 4) design and implement big data analytic techniques that can be used in our HPC implementations and by other researchers in sequential or parallel algorithms; and 5) design interfaces using the Galaxy framework so that these parallel programs can be used by non-experts and by people who are not familiar with parallel processing. The research will be conducted in collaboration with domain experts in systems biology and proteomics. The specific problems targeted are parallel algorithms for clustering MS data sets, parallel algorithms for database-driven peptide identification from these MS data sets on multicore processors and GPUs, and high-performance algorithms that can interpret these MS data sets in a de novo fashion, without the need for a database. The parallel algorithms will be tested on simulated as well as real experimental data sets and will be made freely available for academic use.