Microarray analysis is widely expected to further our understanding of disease phenotypes and to yield robust predictors of clinical outcome based on gene expression signatures. However, as has frequently been demonstrated in the literature, insufficient understanding of key characteristics of clinical microarray data sets, such as their noise structure, often impedes the extraction of robust, biologically and clinically meaningful results. In this proposal we provide preliminary evidence that clinical microarray data sets contain a significant level of systematic bias. We have identified sources of the observed technical bias, such as the overall level of mRNA integrity in a given microarray sample. We have shown that this bias affects the expression levels of many genes in concert, thus causing spurious correlations in clinical data sets and false associations between genes and clinical variables. We are developing a method that corrects for such technical biases in clinical microarray data produced on the commonly used microarray platforms. In Specific Aim 2 we will evaluate the overall impact of systematic bias correction on clinical microarray data sets. We will determine whether bias-corrected clinical microarray measurements show better correlation with independent validation data or during cross-validation. As our preliminary results indicate, the proposed bias correction significantly increases the concordance of gene expression levels with known biological relationships; it is therefore likely to facilitate the extraction of clinically relevant results from microarray data.