In Phase I, we developed two novel data mining techniques. The first addresses the discovery of bi-modality in the expression levels of certain genes among breast cancer patients and the sequential clustering of large-scale, genetically heterogeneous datasets. We used this method to subdivide two published datasets into biologically and clinically meaningful clusters and to identify novel groups of descriptor genes characteristic of each cluster. Both the techniques and the results differ substantially from those published previously. The second method was designed for functional comparison of a series of microarray expression experiments on closely related genetic backgrounds, such as cancer cell lines or inbred animal strains. We also analyzed the published prognostic "gene signatures" using network and pathway tools. In Phase II, we propose to 1) validate the sequential clustering techniques, expression bi-modality, and the identified sets of descriptor genes in a novel study of microarray gene expression in invasive breast cancers; 2) complete functional analysis of all published processed data (gene signatures and gene sets); 3) build an integrated data mining environment for breast cancer research, dubbed MetaMinerBC. In short, Phase II will comprise a novel microarray gene expression study of invasive breast cancers, completion of the functional analysis of conditional gene signatures, and a data mining suite for breast cancer research. The aims are based on the new data mining methods and expression markers developed in Phase I.