The aim of this proposal is to implement a novel way of processing and accessing the vast detailed knowledge contained within collections of scientific publications on the regulation of transcription initiation in bacterial models. In princple, this model for processing and reading information and new knowledge is applicable to other biological domains, potentially benefiting any area of biomedical knowledge. It is certainly criticl to generate new strategies to cope with the ever-increasing amount of knowledge generated in genomics and in biomedical research at large. Improving the efficiency of the traditional high-quality manual curation of scientific publications will enable us also to expand the type of biological knowledge, beyond mechanisms and their elements in the genome, to start including their connections with larger regulated processes and eventually physiological properties of the cell. We will first implement the necessary technology to improve our curation by means of a computational system that has text mining capabilities for preprocessing the papers before a human expert curator identifies which sentences contain the information that is to be added to the database. Premarked options selected by the curators will accelerate their decisions. The accumulative precise mapping between sentences and curated knowledge will provide training sets for text mining technologies to improve their automatic extraction. The curator practices will become more efficient, enabling us to curate selected high-impact published reviews to place mechanisms into a rich context of their physiological processes and general biology. Another relevant component of our proposal is the improved modeling of regulated processes by means of new concepts in biology that capture larger collections of coregulated genes and their concatenated reactions. Starting from all interactions of a local regulator, coregulated regulators and their domain of action will be incorporated to construct the biobricks of complex decisions, as they are encoded in the genome. These are conceptual containers that capture the organization of knowledge to describe the genetic programming of cellular capabilities. These proposals will be formalized and proposed within an international consortium focused in enriching standard models or ontologies of gene regulation for use by the scientific community. Finally, a portal to navigate across all the sentences of a given corpus of a large number (more than 5,000) of related papers will be implemented. The different avenues of navigation will essentially use two technologies, one dealing with automatically generating simpler sentences from original sentences as input, and the other one with the classification of papers based on their theme or ontology. Their combination will enable a novel navigation reading system. If we achieve our aims, this project will give a proof-of-principle prototype with clearly innovative higher levels of large amounts of integrated knowledge. Future directions may adapt these concepts and methods to the biology of higher organisms, including humans.