PROJECT SUMMARY

Scaling extracellular electrophysiology to higher channel counts is hindered by the burden of data handling, storage, and especially preprocessing, e.g., spike sorting [1]. The burden of spike sorting can in principle be reduced through a combination of high-density multielectrode array (probe) technology and algorithm optimization to yield a spike sorting method that is both highly accurate and fully automated [2–4]. With a known-good spike sorting method in hand, the algorithm can be baked into the data stream as early as possible to allow for automatic data sorting and a massive reduction in data rate to downstream storage and processing. However, it takes an investment of considerable resources to implement this sort of large-scale real-time processing, and great confidence to throw away raw data and keep only processed data.

The accuracy and automation of spike sorting increase with the spatial density of recording sites [5–7]. Neural activity recorded from high-density probes can serve as a data corpus for testing the accuracy of spike sorting algorithms. However, to quantify spike sorting performance for comparison between algorithms, the ground truth spiking activity of neurons captured in the data corpus must be measured, such as by simultaneously recording via patch-clamp pipette or some other recording modality [8–10]. Unfortunately, because ground truth recordings are so challenging to perform, they remain too rare to allow for this sort of analysis in a large-scale, meaningful way [11]. Until this need is met, spike sorting development lacks a compass, and cutting-edge techniques such as supervised machine learning, which requires large amounts of labelled data, remain out of reach [12]. Accordingly, we propose a series of multimodal neural recordings combining multielectrode array and patch pipette techniques to generate a corpus of ground truth data for validation of spike sorting algorithms.