Medical and biological data often come in the form of digitized signals and images: for example, mass spectrograms, electrocardiogram traces, human gait cycles, and even representations of gene expression arrays. As instrumental data acquisition becomes routine, sequences of such images, signals, or paths are collected, often along with other covariate measurements, resulting in datasets in which the basic unit of measurement, or response, is a very high-dimensional object. The gene microarray is a leading example of how new technology has led to data acquisition on a massive scale; we also expect to work with more direct protein measurements obtained through mass spectrometry. The project continues to focus on developing techniques for modeling and understanding such data that adapt naturally to the high dimensionality. For regression and classification with gene expression arrays, we consider methods that blend univariate and multivariate approaches and that offer both good prediction and gene selection. To study covariance structure, the project continues to develop "sparse" forms of principal components and discriminant analysis that may be either more sensitive to local phenomena that are not necessarily smooth or better adapted to irregularly observed data. Corresponding quadratically regularized methods in appropriate bases form a natural foil for comparison, and inference procedures for some of these are proposed. For estimation of means, the project will examine sparse empirical Bayes and false discovery rate methods for estimating non-smooth local phenomena. Much of this work will be carried out in existing and new collaborations with researchers in oncology, genetics, cardiology, and other specialties, working, for example, on cancer, heart disease, and human locomotion.