Recent studies published by our group and others have shown that large-scale DNA copy number variants (CNVs) that are invisible at the cytogenetic level are a ubiquitous characteristic of the human genome. Our findings indicated that, on average, two individuals differ by a dozen CNVs involving 3 Mb, or approximately 0.1% of the genome. This is comparable to the 0.1% of genetic difference that is due to single nucleotide polymorphisms (SNPs). However, in contrast to nucleotide sequence variants such as SNPs, structural variation in the genome has not been well characterized. Much remains to be learned about the genomic locations, frequency, and stability of these structural variants and their importance in human evolution and genetic disease. To enable further research in this area, it is necessary to expand current knowledge of copy number variation by characterizing a large sample of individuals and constructing a database of validated CNVs. A comprehensive catalog of CNVs will facilitate large-scale studies of (1) the association of CNVs with disease risk, (2) the effects of CNVs on response to drug treatment, and (3) the role of structural variation in human evolution. We propose to build a data resource on genome copy number variation in 270 individuals from the International HapMap Project using a powerful high-resolution CNV discovery method, Representational Oligonucleotide Microarray Analysis (ROMA). We will perform ROMA scans using a 380,000-probe array that provides a resolution of 8 kb. In addition, we will integrate our data with CNV information obtained using other CNV discovery methods. We will select a set of 600 common CNVs (minor allele frequency >= 1%) for fine-scale characterization, and we will define the boundaries of these common CNVs at higher resolution using a tiling-path oligonucleotide array with a resolution of one probe every 5 bp.
For a further subset of deletions and duplications, we will characterize the CNV junctions at the sequence level. Lastly, to integrate CNVs into the context of the SNP-based HapMap, we will identify SNP markers that are in linkage disequilibrium with CNVs. All information on copy number variation will be made available through dbSNP, and raw microarray data will be made available from www.hapmap.org.
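
As an illustration of the linkage disequilibrium step, the sketch below computes the standard pairwise r^2 statistic between a CNV and a SNP. It is not the proposal's analysis pipeline: it assumes the CNV can be treated as a biallelic marker (for example, deletion present or absent) and that phased haplotypes are available, and the function name and input layout are hypothetical.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between a biallelic CNV and a biallelic SNP.

    haplotypes: list of (cnv_allele, snp_allele) pairs, one per phased
    chromosome, with each allele coded 0 or 1 (an assumed encoding).
    Returns None when either marker is monomorphic, because r^2 is
    undefined in that case.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    # Allele frequencies of the '1' allele at each marker.
    p_cnv = sum(cnv for cnv, _ in haplotypes) / n
    p_snp = sum(snp for _, snp in haplotypes) / n

    # Frequency of haplotypes carrying the '1' allele at both markers.
    p_both = sum(1 for cnv, snp in haplotypes if cnv == 1 and snp == 1) / n

    denominator = p_cnv * (1 - p_cnv) * p_snp * (1 - p_snp)
    if denominator == 0:
        return None

    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two markers were independent.
    d = p_both - p_cnv * p_snp
    return d * d / denominator


# Toy data: the SNP allele tracks the CNV allele on every chromosome.
tagged = [(1, 1)] * 3 + [(0, 0)] * 7
# Toy data: the two markers are statistically independent.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_r_squared(tagged))
print(ld_r_squared(unlinked))
```

A SNP with r^2 close to 1 for a given CNV could serve as a proxy marker for that CNV in SNP-based studies; any threshold for calling a SNP a usable tag would be a study-design choice that is not specified here.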