Tumor biology depends upon intrinsic tumor characteristics, heterotypic cell-cell interactions, and the modulation of both by environmental factors. Computational models that incorporate all of these biological determinants are needed to accurately predict tumor biology and patient prognosis. In this project, we will identify human-mouse conserved biology by integrating multiple data types from human tumors, human and mouse cell lines, and mouse tumor models, and then use these data to build improved outcome predictors for breast cancer patients that can be used to help make treatment decisions. We will combine high-dimensional tumor genomic data (expression, copy number, and microRNA) with information from cell-cell interactions and stress responses for outcome predictions. The mouse tumor models provide a rich resource for the identification of important tumor biology (i.e., modules) that will increase our knowledge base regarding carcinogenesis in both species, and these modules will be objectively tested for prognostic value in humans. We recently developed a risk of relapse predictor based upon a Cox proportional hazards model that showed good discriminatory accuracy across all breast cancer patients. A unique aspect of our model was that it combined gene expression (5 intrinsic subtypes) and clinical variables (tumor size and node status) and was accurate in predicting 7-year relapse probabilities. We will test the predictive value of new genomic modules identified in this project by adding them to this Cox model and determining if they improve outcome predictions. These new modules will be derived from an existing database of ~1000 human breast tumors with gene expression and clinical data and a complementary database of gene expression from 250 mouse mammary tumors from 23 different models.
Even this large comparative resource is inadequate to identify most biologically relevant modules, and therefore these data will be supplemented with new data on microRNAs and tumor DNA copy number changes, and with experimental data on tumor microenvironment and stress responses obtained from in vitro cell line co-cultures and whole mouse studies. An important facet of our analytic method is that it can use data from experiments performed in mice, translate these to humans, and then simultaneously utilize disparate data types (like gene expression and clinical variables) in a single evaluative framework. Emphasizing conserved features across species, the aggregation of gene-level information into modules, and the inclusion of multiple genomic data types with clinical features should provide improvements in predicting patient outcomes, and should advance our understanding of breast cancer biology in ways that may yield new predictive biomarkers. PUBLIC HEALTH RELEVANCE: Predicting breast cancer patient outcomes remains challenging despite many advances in the postgenomic era. This project will develop an analytic framework for integrating and simultaneously evaluating genomic and clinical data types, with the ultimate goal of developing a robust computational predictor for breast cancer patient outcomes. Special emphasis will be placed on better integrating the role of cell-cell interactions and stress responses in predicting the clinical course of disease.
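
The two steps described above, aggregating gene-level information into a module and checking whether adding that module improves outcome prediction, can be illustrated with a minimal sketch. It is not the project's actual method: the function names, the z-score averaging, and all numbers are hypothetical, the risk scores are invented rather than fitted by a Cox model, and discrimination is summarized with a simplified Harrell's concordance index that skips tied follow-up times.

```python
from itertools import combinations
from statistics import mean, pstdev


def module_score(expression, module_genes):
    """Average per-gene z-scores across a module, giving one score per sample.

    expression: dict mapping gene name -> list of values (one per sample).
    This averaging rule is an assumption made for illustration.
    """
    z_rows = []
    for gene in module_genes:
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        # A gene with no variance carries no information; score it as 0.
        z_rows.append([(v - mu) / sd if sd > 0 else 0.0 for v in values])
    return [mean(column) for column in zip(*z_rows)]


def concordance_index(times, events, risks):
    """Simplified Harrell's C for right-censored relapse data.

    A pair is comparable when the patient with the shorter follow-up
    actually relapsed; it is concordant when that patient also has the
    higher predicted risk. Pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier patient was censored, ordering unknown
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy expression data: two genes measured in three samples.
toy_expression = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [2.0, 4.0, 6.0]}
toy_scores = module_score(toy_expression, ["GENE_A", "GENE_B"])

# Hypothetical follow-up data for six patients (1 = relapse, 0 = censored).
times = [2, 4, 5, 7, 9, 10]
events = [1, 1, 0, 1, 0, 0]
# Invented risk scores standing in for a baseline model (subtype plus
# clinical variables) and for the same model with one module added.
baseline_risk = [0.9, 0.4, 0.5, 0.6, 0.2, 0.1]
augmented_risk = [0.9, 0.8, 0.5, 0.6, 0.2, 0.1]

c_baseline = concordance_index(times, events, baseline_risk)
c_augmented = concordance_index(times, events, augmented_risk)
print("module scores:", toy_scores)
print("C-index baseline:", c_baseline)
print("C-index with module:", c_augmented)
```

In a real analysis the risk scores would come from fitted nested Cox models, and any gain in concordance would need to be assessed on held-out patients rather than on the data used for fitting.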