ABSTRACT Relations linking various biomedical entities constitute a crucial resource that enables biomedical data science applications and knowledge discovery. Relational information spans the translational science spectrum going from biology (e.g., protein?protein interactions) to translational bioinformatics (e.g., gene?disease associations), and eventually to clinical care (e.g., drug?drug interactions). Scientists report newly discovered relations in nat- ural language through peer-reviewed literature and physicians may communicate them in clinical notes. More recently, patients are also reporting side-effects and adverse events on social media. With exponential growth in textual data, advances in biomedical natural language processing (BioNLP) methods are gaining prominence for biomedical relation extraction (BRE) from text. Most current efforts in BRE follow a pipeline approach containing named entity recognition (NER), entity normalization (EN), and relation classi?cation (RC) as subtasks. They typically suffer from error snowballing ? errors in a component of the pipeline leading to more downstream errors ? resulting in lower performance of the overall BRE system. This situation has lead to evaluation of different BRE substaks conducted in isolation. In this proposal we make a strong case for strictly end-to-end evaluations where relations are to be produced from raw text. We propose novel deep neural network architectures that model BRE in an end-to-end fashion and directly identify relations and corresponding entity spans in a single pass. We also extend our architectures to n-ary and cross-sentence settings where more than two entities may need to be linked even as the relation is expressed across multiple sentences. We also propose to create two new gold standard BRE datasets, one for drug?disease treatment relations and another ?rst of a kind dataset for combination drug therapies. Our main hypothesis is that our end-to-end extraction models will yield supe- rior performance when compared with traditional pipelines. We test this through (1). intrinsic evaluations based on standard performance measures with several gold standard datasets and (2). extrinsic application oriented assessments of relations extracted with use-cases in information retrieval, question answering, and knowledge base completion. All software and data developed as part of this project will be made available for public use and we hope this will foster rigorous end-to-end benchmarking of BRE systems.