Building on 8 years of highly productive work in technology development that included the creation of the Colorado Richly Annotated Full Text corpus (CRAFT), we hypothesize that text mining resources and methods are approaching the level of maturity required to productively process a significant proportion of the full text biomedical literature to create a well-represented formal knowledge base of molecular biology. We propose a detailed, integrated plan to achieve this long-standing goal. Success in this effort will make possible a transformative new way for the biomedical research community to identify access and integrate existing knowledge, breaking down disciplinary boundaries and other silos that have kept scientists from fully exploiting relevant prior results in their research. Our successes in the prior funding period broadened the applicability of biomedical concept identification systems to a much wider set of tasks, demonstrating the ability to target multiple community-curated ontologies in text mining, and generate scientifically significant insights from the results. The proposed work would take advantage of the resources we produced to transcend several of the limitations of previous efforts. We propose innovative new approaches to formal knowledge representation and to characterizing relationships between textual elements and semantic content. We will design, implement and evaluate computational systems that have the potential to transform enormous text collections into semantically rich, logic-based, standards-compliant, formal representations of biomedical knowledge with clearly identified provenance. The resulting representations will express complex assertions about a very wide range of entities, processes, qualities, and, most importantly, their specific relationships with one another.