Personalized medicine aims to tailor medical care to an individual's needs through recognition of biological diversity. Given the variety of high-throughput molecular technologies that can characterize an individual's DNA, RNA and protein from samples of tissue and blood, the promise of producing a panel of biomarkers that will dictate individualized patient care is fueling tremendous advances in biotechnology. However, these approaches are limited by the need for invasive biopsy and by the fact that biopsies sample only small portions of generally heterogeneous lesions. Biopsies therefore do not completely characterize the molecular profiles of tumors or their anatomical, functional and physiological properties, such as size, location, morphology, vascularity, diffusion and perfusion patterns, oxygenation, and metabolic state. In light of this intrinsic challenge, we propose to change the paradigm of molecularly based personalized medicine from one relying on characterizing tissue samples alone to one inclusive of, or even based on, characterization of image features of entire tumors and their surroundings in non-invasive medical imaging examinations. To this end, our overarching goal is to develop tools and technologies that integrate imaging and genomic data, thereby allowing the relationships between the two to be mapped (an "image-omics" map). To focus and lend immediate significance to our efforts, we will concentrate on a single disease: non-small cell lung carcinoma (NSCLC), the leading cause of cancer death, with an overall 5-year survival rate of 16% that has not changed appreciably over the past 15 years. Accordingly, (1) we will develop, validate and make publicly available controlled vocabularies and software tools to be used in building databases with vectors that quantitatively describe features of human tumors in CT and PET images.
(2) We will create and make publicly available a novel multidimensional database that integrates these features of CT and PET images with clinical and gene expression microarray data of excised tumors from 400 patients with NSCLC. (3) We will demonstrate the utility of the integrated imaging/genomic/clinical database by (a) implementing bioinformatics approaches that create an association map from CT and PET image features and clinical data to gene expression, and (b) discovering prognostic signatures that incorporate imaging, gene expression and other clinical data. While specifically developed and validated for CT and PET images of lung cancer, our tools will be extensible to other modalities and disease scenarios. Specific outcomes, potentially impacting hundreds of thousands of patients diagnosed with lung cancer each year, will include (i) a new multidimensional prognostic signature that combines gene expression, imaging features and other clinical variables, potentially generating new insights into NSCLC biologic diversity and its clinical management, and (ii) the ability to predict a clinically relevant molecular phenotype from imaging data alone, which may eventually assist in molecularly targeted therapeutic decisions without requiring invasive biopsies.

PUBLIC HEALTH RELEVANCE: This project has major relevance for human health. The demonstration project in non-small cell lung cancer promises to provide an improved prognostic signature that integrates well-annotated and reproducible medical feature characterizations of CT and PET images with genomic tissue profiles and other existing clinical data. Over the long term, the tools we develop for integrating medical imaging and genomic data have the potential to improve our knowledge of the biology of the disease, and to improve patient care by requiring fewer biopsies and converging more rapidly on optimal management and treatment.