Abstract: Drug discovery is challenging and expensive, especially in the lead optimization phase, where the activity of a lead molecule against a target is optimized. Computational approaches using all-atom physics-based molecular dynamics (MD) simulations can add substantial value, for example by computing relative binding free energies (RBFE) between congeneric molecules to greatly expand the explorable chemical space, or by simulating long-timescale motions of protein targets to understand the relationship between structure and biological function through dynamics. Silicon Therapeutics (STX) has proven the capabilities of its physics-based simulation platform by finding small molecules that modulate the biological function of challenging disease targets that have no known small molecule modulators. However, current state-of-the-art GPUs can only run a few hundred nanoseconds of MD per day, which is insufficient for many important protein targets. Here, we propose to address the timescale problem by accelerating MD with an FPGA-based cluster (F-cluster). Our preliminary work has shown that MD on clusters of FPGAs approaches the performance of proprietary (and much more expensive) ASIC-based clusters for several core functions, and that a 64-node FPGA-accelerated cluster can simulate a 50K particle system at a rate of over 10 µs per day, 20x that of a commodity cluster of any size. Our overall goal is to create a commercial-quality F-cluster for efficiently running long-timescale MD simulations in our drug discovery projects to develop small molecule drugs for previously undruggable proteins. The need for long-timescale MD simulations stems from the number of high-value protein targets of biological interest that are considered "undruggable". While the human genome encodes 20,000 proteins, fewer than 15% are considered druggable and only 2% have ever been targeted by a drug.
While many biologically attractive targets exist, the chemical strategy to drug them is generally unclear, highly challenging, or both; examples include inhibiting protein-protein interactions, biasing signaling, and allosterically modulating activity. These cases require an understanding of the relationship between ligand binding and protein conformational changes. Long-timescale MD can address critical challenges associated with predicting key motions of these important protein targets, and the F-cluster proposed in this work is the most cost-effective approach to achieving these ends. FPGAs are reprogrammable silicon chips that are ideal for MD because the hardware adapts to the application (rather than the reverse) and because the interconnections across multiple chips are fast (low latency and high bandwidth). In the proposed work, we will develop optimized MD code so that the performance on a single FPGA chip is on par with GPUs. We will then improve communication between FPGAs so that MD can be scaled to many FPGAs in a cluster. We expect near-linear scaling for up to hundreds of FPGAs, based on our detailed analysis described in the proposal. Once the F-cluster is completed, we will target specific proteins in our drug discovery efforts where simulating long-timescale motions is needed to understand structure-dynamics-function relationships.