The proposed research aims to support and improve effective access to the biomedical literature by utilizing the rich, highly informative image data within publications, in addition to text. The biomedical literature is expanding at a rate of about 1,000,000 new publications a year. Scientists and physicians, as part of their daily work, go through myriad publications searching for relevant information. The task is even more arduous for scientific database curators (biocurators, in organizations such as FlyBase or UniProt), who have to identify the literature most relevant to the database area, locate within it high-quality evidence concerning genes, proteins, organisms, or disease, and curate the findings within a database entry, with references to the relevant literature. Notably, much of the evidence within publications lies in figures. Accordingly, scientists and database curators use images as indicators of relevance. To assist and expedite the search for information within the literature, automated text-mining tools are being developed; still, several shared tasks and competitive challenges have demonstrated that automated identification of relevant information in biomedical publications is not yet effective enough, and remains a bottleneck for biocuration and for scientific discovery. While image analysis within and outside the biomedical domain is an active research area, most current work on biomedical image processing focuses on retrieval and understanding of images as a primary form of data. Likewise, most efforts on biomedical literature retrieval and mining focus on text alone. Little has been done so far to use the images within publications, which provide important cues as to the relevance of the information embedded in papers.
The hypothesis underlying our proposal is that useful information can be derived directly from images within publications and integrated with text-based methods, leading to improved identification of relevant publications and of informative portions within them. The proposed research comprises an extensive comparative study of highly informative features within images, development and identification of such image features, development of tools that extract such features and information from images, and integration of image-based information into the text-based article-classification process, aiming to determine the publications' relevance to well-defined biomedical needs. The fundamental research tasks we shall address are: A) Identification and comparative study of useful features for image representation, focusing on their utility for specific biomedical needs; B) Classification of biomedical images and biomedical documents based on image data; C) Document classification through integration of text- and image-based classifiers. To ground the research in genuine needs, secure access to ample image data, and ensure broad applicability of the results, we shall work within three diverse areas for which we have secured access to expertise and data: Finding articles about cis-regulatory regions (Cyrene project at Brown University); Evidence for gene expression in the mouse (Jackson Lab's GXD); Experimental evidence for protein-protein interaction (Delaware's Protein Information Resource). The successful completion of the proposed project will provide integrated methods and tools, utilizing both image-based and text-based features, leading to more focused and effective retrieval and mining tools, thus better supporting data-intensive biomedical discovery.
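To make task C concrete, the following is a minimal illustrative sketch of one possible integration scheme, weighted late fusion of classifier outputs. It is not the method the project commits to. The names `DocumentScores`, `image_score`, `fuse`, and `is_relevant`, the choice of the mean for aggregating per-figure scores, the neutral value of 0.5 for documents without figures, and the default weight and threshold are all assumptions made for illustration; the per-document probabilities are presumed to come from separately trained text and image classifiers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DocumentScores:
    """Hypothetical outputs of upstream classifiers for one publication."""
    text_prob: float            # P(relevant) from a text-based classifier
    figure_probs: List[float]   # P(relevant) for each figure in the publication


def image_score(figure_probs: List[float]) -> float:
    """Aggregate per-figure scores into one document-level image score.

    Assumption: the mean is used (the maximum is another plausible choice).
    A publication with no figures receives a neutral score of 0.5.
    """
    if not figure_probs:
        return 0.5
    return sum(figure_probs) / len(figure_probs)


def fuse(doc: DocumentScores, text_weight: float = 0.5) -> float:
    """Weighted late fusion of the text score and the image score."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    return (text_weight * doc.text_prob
            + (1.0 - text_weight) * image_score(doc.figure_probs))


def is_relevant(doc: DocumentScores,
                threshold: float = 0.5,
                text_weight: float = 0.5) -> bool:
    """Label a publication relevant when the fused score reaches the threshold."""
    return fuse(doc, text_weight) >= threshold


if __name__ == "__main__":
    # Text alone is unconvinced (0.4), but the figures look relevant.
    doc = DocumentScores(text_prob=0.4, figure_probs=[0.9, 0.7])
    print(round(fuse(doc), 3), is_relevant(doc))
```

In this sketch a publication that a text classifier alone would reject can still be retrieved when its figures carry strong evidence, which is the kind of effect the hypothesis above anticipates. In practice the weight and threshold would be tuned on curated data from each application area.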