A fundamental question in biomedical data analysis is how to capture biological heterogeneity and characterize the complex spectrum of health states (or disease conditions) in patient cohorts. Indeed, much effort has been invested in developing new technologies that provide groundbreaking collections of genomic information at single-cell resolution, unlocking numerous potential advances in understanding the progression and driving forces of biological states. However, these new biomedical technologies produce large volumes of data, quantified by numerous measurements, and often collected in many batches or samples (e.g., from different patients, locations, or times). Exploring and understanding such data is challenging, but the potential for discoveries at a level not previously possible justifies the considerable effort required to overcome these difficulties. In this project we focus on multi-sample single-cell data, e.g., from a multi-patient cohort, where data points represent cells, data features represent gene expression levels or protein abundances, and samples (e.g., considered as separate batches or datasets) represent patients. We consider a duality or interaction between constructing an intrinsic geometry of cells (e.g., with manifold learning techniques) and processing data features as signals over it (e.g., with graph signal processing techniques). We propose to use this duality for several data exploration tasks, including data denoising, identifying noise-invariant phenomena, cluster characterization, and aligning cellular features over multiple datasets. Furthermore, we expect the dual multiresolution organization of data points and features to allow us to compute aggregated signatures that represent patients, and then provide a novel data embedding that reveals multiscale structure from the cellular level to the patient level.
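To make the duality between cell geometry and feature signals concrete, the following minimal sketch builds a k-nearest-neighbor graph over cells and then smooths each feature as a signal on that graph with a low-pass diffusion filter. It is an illustration under stated assumptions, not the method proposed in this project: the function names `knn_affinity` and `diffusion_denoise`, the Gaussian kernel with a median bandwidth, and the parameters `k` and `t` are all hypothetical choices made for this example.

```python
import numpy as np


def knn_affinity(X, k=5):
    """Symmetric Gaussian-kernel kNN affinity over the rows (cells) of X.

    Assumption: kernel bandwidth is the median squared kNN distance.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped against round-off.
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(D2, np.inf)  # exclude self from the neighbor search
    idx = np.argsort(D2, axis=1)[:, :k]  # k nearest neighbors per cell
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    sigma2 = np.median(D2[rows, cols]) + 1e-12
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-D2[rows, cols] / sigma2)
    return np.maximum(W, W.T)  # symmetrize the neighbor relation


def diffusion_denoise(X, k=5, t=3):
    """Smooth every feature (column of X) with t steps of graph diffusion."""
    n = X.shape[0]
    W = knn_affinity(X, k) + np.eye(n)  # self-loops keep each cell's own value
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic diffusion operator
    out = X.astype(float)
    for _ in range(t):
        out = P @ out  # each step averages a cell with its graph neighbors
    return out
```

Here the graph plays the role of the intrinsic geometry of cells, and each column of `X` (a gene or protein measurement) is treated as a signal over it. Because the operator is row-stochastic, every smoothed value is a weighted average of measured values, so larger `t` removes more high-frequency variation at the cost of blurring fine structure.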
The proposed research combines recent advances in several fields at the forefront of data science, including geometric deep learning, manifold learning, and harmonic analysis. The methods developed in this project will advance each of these fields while also establishing new relations between them. Furthermore, the challenges addressed by these methods are a foundational prerequisite for new advances in genomic research, and more generally in empirical data analysis where data is collected in varying experimental environments. The algorithms and methods developed in this project will be validated in several biomedical settings, including characterizing Zika immunity in Dengue patients, tracking the progression of Lyme disease, and predicting the effectiveness of immunotherapy.
RELEVANCE: In this project we will develop new algorithms for biomedical data analysis that will characterize the complex spectrum of health states across various patient populations. These algorithms will leverage the large volumes of data collected by new biomedical technologies, focusing on single-cell data. Specific analyses will be carried out for characterizing Zika immunity in Dengue patients, tracking the progression of Lyme disease, and predicting the effectiveness of immunotherapy.