Breast cancer remains a leading cause of cancer death in women, and the targeted therapies available for this deadly and heterogeneous disease are sometimes recommended to patients who may not benefit from them. It is therefore essential to augment or refine the standard methods for deciding which patients should receive which therapy, so that the efficacy of targeted therapy can be optimized. The goal of this project is to identify genomic biomarkers of response to trastuzumab-based therapy, which targets HER-2-overexpressing tumors. Over 20% of breast cancers are classified as HER-2 positive by standard methods (immunohistochemistry and fluorescence in situ hybridization), yet up to 33% of these patients show primary resistance to trastuzumab even though their tumors meet the HER-2 overexpression criteria for receiving it. Moreover, recent studies suggest that some patients who are not classified as HER-2 positive may benefit from trastuzumab. Alternative methods based on genomic markers are therefore of utmost importance for refining the selection of patients who will or will not benefit from this potentially life-saving therapy. Using gene expression and copy number variation (CNV) data, we will apply statistical and computational methods to identify signatures of genes and CNVs, separately and together, that predict patients' response to trastuzumab-based therapy. We will also investigate whether the molecular subtypes of breast cancer present among these trastuzumab-treated patients differ clinically in their response, and then determine signatures that characterize these subtypes within their clinically distinct subgroups.
The methods include feature selection on gene expression data to identify a subset of potentially informative genes. Following feature selection, we will use classification methods to identify signatures of features (genes and/or CNVs) that predict binary classes, such as responders (complete or partial response, CR/PR) versus non-responders (stable or progressive disease, SD/PD). These classification methods will also be extended to identify signatures that predict multiple classes, such as the different tumor subtypes. For survival outcomes, Cox proportional hazards models will be used to identify signatures of features that best predict time to disease progression and time to death. We will also explore fuzzy clustering to detect potential individual networks of genes, of CNVs, and of both combined that play a role in response to trastuzumab. From these clusters, we propose generating a Potential-Interaction Weighted Score (PIWS); if found to be independently predictive, it will provide a simple score for identifying patients who should receive trastuzumab therapy.
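As an illustration of the fuzzy clustering step only, the sketch below implements the standard fuzzy c-means algorithm with NumPy on synthetic data. It is not the project's pipeline: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the number of clusters, and the toy matrix (rows standing in for genes or CNVs, columns for patients) are all assumptions made for the example, and the PIWS itself is not computed here because its definition is not given above.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=300, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative sketch, not the project's method).

    X          : (n_features, n_samples) array; each row is one item to cluster.
    n_clusters : number of fuzzy clusters (assumed, chosen by the user).
    m          : fuzzifier > 1; larger values give softer memberships.
    Returns (centers, U), where U[i, j] is the membership of row i in cluster j.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centers: means of the rows weighted by membership ** m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every row to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero

        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U


# Synthetic stand-in for real data: 40 hypothetical features measured in
# 5 hypothetical patients, drawn from two well-separated groups.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 5)),
    rng.normal(3.0, 0.3, size=(20, 5)),
])

centers, U = fuzzy_c_means(X, n_clusters=2)
hard = U.argmax(axis=1)  # most likely cluster for each feature
```

Unlike hard clustering, each feature keeps a graded membership in every cluster (each row of `U` sums to 1), so a gene or CNV can contribute to more than one candidate network. Those memberships are the kind of quantity a weighted score could draw on, although how the PIWS would use them is left open here.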