This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

We have collected serum samples from 54 patients diagnosed with breast cancer and 443 patients who have been disease free for several years. Samples were collected from cancer patients before surgery and at monthly intervals for seven months after surgery. Blood from the control set of patients was drawn at different time points. All blood draws were performed in the same facility under the same protocol. Albumin was extracted from the serum samples, and the proteins attached to the albumin were captured. High-resolution orthogonal MALDI-TOF mass spectrometry was performed on these extracted proteins. Technical repeats were performed for all samples, yielding 1750 spectra in total, each containing approximately 500,000 data points. Peaks from the resulting spectra were used to build a cancer vs. normal classifier for the samples. See: Proteomic patterns for classification of ovarian cancer and CTCL serum samples utilizing peak pairs indicative of post-translational modifications. Proteomics. 2007 Nov;7(22):4045-52.

Preprocessing of the spectra plays an important role in correctly identifying differential peaks among samples. There is currently no accepted standard method for preprocessing spectra. In this study we propose to apply two different preprocessing workflows to the entire data set and to build a classifier in order to compare the effect of the preprocessing method on the final outcome. Both preprocessing workflows are implemented in Matlab and are available to the general public.
Although Matlab may not offer the most efficient implementation, ready access to the methods outweighs the small overhead associated with Matlab in this case. The two workflows we will test are:
1. PrepMS, from the Texas A&M Department of Statistics: http://www.stat.tamu.edu/~yuliya/prepMS.html
2. The Matlab demo mass spectrometry preprocessing workflow: http://www.mathworks.com/products/demos/shipping/bioinfo/mspreprodemo.html
The methods of denoising, normalization, and peak detection differ between the two workflows. Upon successful completion of this small comparison study, we will test other published methods for preprocessing.
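
To illustrate why the choice of denoising, normalization, and peak detection can change which peaks reach a classifier, the following is a minimal sketch in Python (not Matlab) that runs two toy preprocessing variants on the same synthetic spectrum. It does not reproduce the algorithms of PrepMS or of the Matlab demo workflow. All function names, the window size, the threshold ratio, and the spectrum itself are assumptions made purely for illustration.

```python
from statistics import median


def synthetic_spectrum(length=61):
    """Toy spectrum: flat baseline of 1.0 plus two triangular peaks (made-up data)."""
    spectrum = [1.0] * length
    for center, height in ((20, 20.0), (40, 12.0)):
        for offset in range(-2, 3):
            spectrum[center + offset] += height * (1 - abs(offset) / 3)
    return spectrum


def _windows(values, window):
    """Yield the neighbourhood of each point, truncated at the spectrum edges."""
    half = window // 2
    for i in range(len(values)):
        yield values[max(0, i - half):i + half + 1]


def smooth_mean(values, window=5):
    """Denoise with a moving average (hypothetical stand-in for variant A)."""
    return [sum(w) / len(w) for w in _windows(values, window)]


def smooth_median(values, window=5):
    """Denoise with a moving median (hypothetical stand-in for variant B)."""
    return [median(w) for w in _windows(values, window)]


def normalize_total(values):
    """Scale intensities so that they sum to 1 (total-signal normalization)."""
    total = sum(values)
    return [v / total for v in values]


def normalize_max(values):
    """Scale intensities so that the largest value is 1."""
    top = max(values)
    return [v / top for v in values]


def detect_peaks(values, min_ratio=3.0):
    """Indices of local maxima exceeding min_ratio times the median intensity."""
    threshold = min_ratio * median(values)
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] >= values[i + 1] and values[i] > threshold]


def run_workflow(spectrum, denoise, normalize):
    """Apply denoising, then normalization, then peak detection."""
    return detect_peaks(normalize(denoise(spectrum)))


if __name__ == "__main__":
    spectrum = synthetic_spectrum()
    peaks_a = run_workflow(spectrum, smooth_mean, normalize_total)
    peaks_b = run_workflow(spectrum, smooth_median, normalize_max)
    print("variant A peaks:", peaks_a)
    print("variant B peaks:", peaks_b)
    print("shared peaks:", sorted(set(peaks_a) & set(peaks_b)))
```

Even on this idealized input the two variants need not report identical peak indices, because the median filter flattens the top of each peak. In a real comparison, the peak lists from each workflow would feed the same classifier so that differences in outcome can be attributed to preprocessing alone.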