Medical and biological data often come in the form of digitized signals and images, for example gene expression microarrays, mass spectrograms, and flow cytometry cell plots. As instrumental data acquisition becomes routine, sequences of such images, signals or paths are collected, often along with other covariate measurements, resulting in datasets where the basic unit of measurement, or response, is a very high-dimensional object. The gene microarray is a leading example of how new technology has led to data acquisition on a massive scale. The project continues to focus on developing techniques for modeling and understanding such data that adapt naturally to the high dimensionality. For studying genomic divergence of bacterial strains using comparative genomic hybridization (CGH), we propose latent variable models that incorporate a statistical method called the "fused lasso" to jointly model the CGH measurements from the bacteria. For flow cytometry analysis of cancer cells, we propose a method for identifying new sub-populations that emerge after stimulation of the cells. We also propose to develop and study techniques for prediction and clustering of high-dimensional data. Much of this work will be carried out in existing and new collaborations with researchers in medicine and biology, working for example on cancer and autoimmune diseases.

Project Narrative: This work can potentially improve the understanding, diagnosis and prognosis of human diseases such as cancer, heart disease and AIDS, and hence can help to improve the overall quality of public health in the U.S.