The overall long-term objective of this project is to create a high-resolution database reflecting the global gene expression profile of lymph node negative (LNN) breast tumors. Our hypothesis is that analysis of global gene expression patterns in LNN breast cancers will allow us to identify subsets of tumors with different clinical outcomes. We will specifically compare the gene expression profiles of primary T1 and T2, N0, M0 breast tumors that recurred after surgery during follow-up with those of tumors that did not relapse. To this end, we will use the Serial Analysis of Gene Expression (SAGE) methodology, which generates quantitative and qualitative sequence information on all transcripts expressed by the cancerous cells. These studies constitute a critical component of this Program Project, since the results will provide the foundation for identifying and selecting a reduced set of "signature genes" that will be further analyzed prospectively in large tumor sets by the methodology described in Project 2.

Specific Aim 1: To generate a high-resolution global gene expression database of LNN breast carcinomas. These baseline databases will be used primarily for cross-validation with the Rapid Analysis of Gene Expression (RAGE) methodology (Project 2), a targeted gene expression methodology that will concentrate on low-abundance transcripts and cancer-related genes. The main objective of this complementary SAGE-RAGE approach is to select the primary set of "signature genes".

Specific Aim 2: To develop, adapt and validate a laser capture microdissection (LCM)-based methodology for SAGE analysis. This approach will allow us to analyze separately the major cellular compartments of normal and tumor breast tissue. We will generate "Master Transcription Profiles" for a) normal breast epithelium; b) normal breast stroma; c) LNN breast tumor epithelium; d) LNN breast tumor stroma.

Specific Aim 3: To expand the SAGE databases to larger numbers of LNN breast carcinomas with and without recurrence, in order to obtain a highly comprehensive picture of the transcriptome of lymph node negative breast carcinomas and to identify low copy number transcripts of relevance. The gene expression datasets will be used by Project 3 for the adaptation and evaluation of statistical methodologies (existing and novel). These initial datasets will also be used to test the Web-based data analysis and visualization tools to be developed by Core B.

Specific Aim 4: To identify and clone representative novel candidate target genes whose expression varies significantly between the different groups of lymph node negative breast carcinomas.