The methodological objective of Project 3 is to develop statistical and computational models for the analysis of gene expression profiles and to apply them to data. Existing methodologies (Self-Organizing Maps, Joining Algorithms and others) will first be compared, and the most promising selected, using simulated data sets (Aim 1). An important consideration will be to determine the statistical properties (power, significance) of the different methods under different assumed patterns of gene expression. The applied objective will be to guide the analysis of the gene information gained from Projects 1 and 2 using the SAGE and RAGE techniques (Aim 2). In a first phase, this Project will explore the use of existing methods of cluster analysis to determine molecular signatures of tumor recurrence, based on multiple gene expression patterns, in retrospectively collected tumors with known outcome. The subsequent Phase I analyses of real gene expression data will help determine the best candidate genes for the preliminary set of signature genes. Subsequently, new methods will be developed for modeling and estimation; among others, random trees and Markov chains will be used. Evaluation, comparison and testing of the new algorithms will first be performed using simulated gene expression data sets with known and adjustable numbers of variables. Data on several hundred tumors obtained using the SAGE and RAGE techniques will then be analyzed. A final objective is to develop computational methodologies that could ultimately be delivered as user-friendly software packages, developed with the programming support provided by the Statistics and Data Management Core B (Aim 4).
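
As an illustration of the simulation-based comparison described for Aim 1, the following is a minimal sketch and not the Project's actual methodology. It assumes two hypothetical expression patterns (rising and falling), Gaussian noise, Euclidean distance, average-linkage joining as a stand-in for a "Joining Algorithm", and the Rand index as the measure of agreement with the known group labels. All function names and parameter values are illustrative assumptions.

```python
import random
from itertools import combinations


def simulate_profiles(n_per_group=10, n_conditions=8, noise=0.3, seed=0):
    """Simulate expression profiles for two gene groups with known labels.

    The two underlying patterns (rising, falling) are assumptions made
    purely for illustration.
    """
    rng = random.Random(seed)
    patterns = [
        [i / (n_conditions - 1) for i in range(n_conditions)],
        [1 - i / (n_conditions - 1) for i in range(n_conditions)],
    ]
    profiles, labels = [], []
    for label, pattern in enumerate(patterns):
        for _ in range(n_per_group):
            profiles.append([x + rng.gauss(0, noise) for x in pattern])
            labels.append(label)
    return profiles, labels


def euclidean(a, b):
    """Euclidean distance between two profiles of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(profiles, n_clusters):
    """Agglomerative (joining) clustering with average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until n_clusters remain; returns one cluster id per profile.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(
                euclidean(profiles[i], profiles[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    assigned = [0] * len(profiles)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            assigned[i] = cluster_id
    return assigned


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    # Recovery of the known grouping as the simulated noise level increases.
    for noise in (0.0, 0.3, 1.0):
        profiles, truth = simulate_profiles(noise=noise)
        found = average_linkage(profiles, n_clusters=2)
        print(f"noise={noise}: Rand index = {rand_index(truth, found):.2f}")
```

Repeating such a run over many random seeds, noise levels and assumed expression patterns would give a rough empirical picture of how reliably a given method recovers a known structure, which is the kind of property the comparison in Aim 1 is meant to establish.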