Project Summary
Despite their many benefits and common use, protein-based drugs can elicit serious adverse side effects. Many of these side effects are believed to be caused by small protein aggregates. One tool for measuring these particles in a high-throughput fashion is Microflow-Imaging (MFI). MFI is commonly used in both academia and industry to characterize subvisible particles (those ≤25 µm in size) in protein therapeutics. In many formulations, aggregates ≤10 µm in size account for upwards of 90% of the protein aggregates in the product. Subvisible protein aggregates have been shown to correlate with adverse drug responses; however, which specific protein aggregates induce immunogenic responses remains unknown. Pharmaceutical companies are required to record and catalog vast volumes of FIM data on protein therapeutic products, but FDA regulations (i.e., USP <788>) mandate only that they control the number of particles exceeding 10 and 25 µm in delivered products. Hence a vast number of digital images is available to analyze. Recent studies have indicated that some of the factors correlated with adverse drug responses are encoded in MFI image data.
Current state-of-the-art MFI analysis methods rely on a relatively low-dimensional list of "morphological features" to characterize particles, but these methods ignore an enormous amount of information encoded in the existing large digital image repositories. Deep Convolutional Neural Networks (CNNs or "ConvNets") have demonstrated, in a variety of tasks, the ability to extract predictive information from raw macroscopic image data without requiring the selection or specification of "morphological features". However, the heterogeneity and polydispersity of protein therapeutics, together with the optical phenomena associated with subvisible MFI particle measurements, introduce new challenges for applying CNNs to MFI image analysis.
This proposal will build on state-of-the-art deep CNN methods to provide new analysis tools capable of reliably analyzing and classifying heterogeneous MFI data from protein therapeutics. The envisioned software product, capable of processing images from both of the leading manufacturers of MFI instruments (Fluid Imaging Inc. and ProteinSimple), will provide high-throughput, data-driven models that efficiently capture information encoded in the large collection of image data, avoiding the need to define "features" a priori. It is anticipated to provide a paradigm shift in the MFI quantification field. We anticipate that the proposed algorithms and software will help determine which protein aggregates induce adverse side effects and will also serve as a useful process monitoring tool.