Developed an application for biological theme extraction of microarray patterns based on ontology-related annotation (TEMPORA). The application takes gene expression patterns that have been grouped by a clustering algorithm and tests for over-representation (abundance) of Gene Ontology (GO) biological processes among the genes within each cluster. Then, the application uses latent semantic indexing (a form of natural language processing) to create concepts from the relationship between Medical Subject Heading (MeSH) terms for diseases and the PubMed documents (scientific articles) in which the genes from the GO biological processes of a cluster appear. Finally, a similarity matrix is generated from the scientific articles so that the documents can be grouped by hierarchical clustering. The scientific articles are labeled by gene IDs, GO biological processes and PubMed IDs, so that clusters of documents with associated biological processes can be investigated based on the expression pattern of the genes in a given cluster.

--------------------------------------------------------------------------------------------------

Developed an application for generation of phenotypic prototypes (Modk-prototypes). The application uses k-modes-style clustering of categorical histopathology observations and k-means-style clustering of numeric gene expression and clinical chemistry data to cluster biological samples into groups that share phenotypic responses to stimuli. Clustering the samples with Modk-prototypes on all of the data together performs better than clustering on any single data domain or on pairwise combinations of the data domains.

--------------------------------------------------------------------------------------------------

Developed software for extracting gene expression patterns and identifying co-expressed genes (EPIG).
Through evaluation of the similarity among profiles, the magnitude of variation in gene expression profiles, and profile signal-to-noise ratios, EPIG extracts a set of patterns representing co-expressed genes without a pre-defined seeding of the patterns. In extracting gene expression patterns, EPIG uses a filtering process in which all profiles are initially considered as pattern candidates. Subsequently, EPIG assigns each gene to the pattern with which its profile has the highest similarity. A gene is not assigned to any extracted pattern, and is considered an orphan, if its highest similarity value is lower than a given threshold corresponding to a p-value of 0.0001, which ensures the significance of the co-expression.

--------------------------------------------------------------------------------------------------

Developed an application for gene selection and multiclass prediction of biological samples. The application uses a multiclass kernel to select the most informative genes for predicting samples with a high degree of accuracy.

--------------------------------------------------------------------------------------------------

Performed scanning of the mouse and human genomes to search for DNA sequence patterns, motifs and restriction enzyme locations. Analyzed ChIP-on-chip data for detection of hypermethylation of DNA sites.

--------------------------------------------------------------------------------------------------

Developed the MicroArray Project System (MAPS) database for more customized management of experimental information and data from microarray studies.
Developed customized analytical applications for implementation into the Chemical Effects on Biological Systems (CEBS) database.

--------------------------------------------------------------------------------------------------

Developed software for phase-shift analysis of gene expression (PAGE) data. The PAGE software clusters profiles of gene expression from multiple biological conditions across dose and time series experiments. Gene expression patterns are grouped over intervals of the measurements using phase-shifts to find clusters of genes that share trends in their expression profiles within the dataset. The PAGE method has three phases:
Phase 1: Transform the gene expression pattern matrix into values of -1, 0 and 1 to indicate the direction of expression change for each biological condition at fixed time and dose points. All biological replicates are averaged if provided.
Phase 2: Generate clusters that have similar patterns of expression over consecutive conditions.
Phase 3: Assign a significance score to each bicluster in all clusters and identify the inhibition patterns of each cluster.
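
Phase 1 of the PAGE method can be illustrated with a minimal sketch. This is not the PAGE implementation: it assumes the input is a NumPy array of log2 expression ratios versus control with shape (genes, conditions, replicates), and the function name direction_matrix and the cutoff threshold=1.0 are hypothetical choices made only for illustration. The report does not state how the direction of change is decided.

```python
import numpy as np

def direction_matrix(expr, threshold=1.0):
    """Discretize expression values into -1, 0, 1 (illustrative sketch only).

    expr: array of shape (genes, conditions, replicates), assumed to hold
          log2 ratios versus control.
    threshold: hypothetical cutoff on the absolute averaged log2 ratio.
    """
    expr = np.asarray(expr, dtype=float)
    # Average the biological replicates for each gene and condition.
    mean_expr = expr.mean(axis=2)
    # Start with 0 (no change), then mark up- and down-regulation.
    directions = np.zeros(mean_expr.shape, dtype=int)
    directions[mean_expr >= threshold] = 1
    directions[mean_expr <= -threshold] = -1
    return directions

# Two genes, three conditions, two replicates each (made-up values).
expr = np.array([
    [[1.2, 1.6], [0.1, -0.3], [-2.0, -1.4]],
    [[0.2, 0.4], [1.5, 1.1], [1.9, 2.3]],
])
directions = direction_matrix(expr)
print(directions)
```

Each row of the resulting matrix is a gene and each column a condition, so runs of identical values across consecutive columns are the kind of shared trend that Phase 2 would group into clusters.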