Human telomeric DNA regions are rich in segmental duplications, which we refer to as subtelomeric repeats. Segmental duplications (defined as duplicated DNA greater than 1 kb in length with greater than 90% sequence identity) comprise approximately 5% of euchromatic human DNA. Chromosome regions containing human segmental duplications have evolved very recently and are susceptible to homology-driven chromosomal rearrangements associated with human disease. Importantly, it is now clear that segmentally duplicated DNA regions are populated by functional genes and gene families, including newly formed genes derived from chimeric transcripts generated by the juxtaposition of new segmentally duplicated and transposed DNA segments. The 5% of the human genome composed of segmental duplications is therefore functionally very important and is associated with some of the newest and most rapidly evolving chromosome regions in the human genome. Subtelomeric repeat sequences have complicated the initial mapping, sequencing, and assembly of reference sequences for human telomeric DNA. However, by combining physical mapping, half-YAC cloning, and collaborative large-scale sequencing of half-YAC-derived materials (as well as IHGSC sequencing of independently derived overlapping and adjacent BACs and cosmids), it was possible by the spring of 2003 to acquire and validate a finished "reference" sequence from each of the 41 euchromatic telomere regions. The telomeric end of the reference sequence extends into the terminal (TTAGGG)n tract for reference alleles of approximately 14 of the 41 telomeres, and into subtelomeric repeat sequences within a relatively small distance of the (TTAGGG)n tract of one reference allele for approximately 21 of the 41 telomeres. However, differential subtelomeric repeat content and organization at specific telomeres contribute to the remarkable large-scale variations seen in human subtelomeric regions. 
These variations are detectable as chromosome length polymorphisms ranging from a few kb to greater than 300 kb at a given telomere. The global complement of subtelomeric alleles in a given individual determines the composition and dosage of functional genes embedded in the subtelomeric repeats, as well as the positions of each of these genes (and of adjacent single-copy genes) relative to the terminal (TTAGGG)n tracts. Both gene dosage and gene distance from the terminal (TTAGGG)n tracts may have important consequences for expression in gene-rich subtelomeric regions and, depending upon subtelomeric gene functions and the extent of potential telomere position effects in humans, could contribute substantially both to natural human phenotypic variation and to disease phenotypes. Most variant subtelomeric chromosome segments are not yet represented in the public sequence databases and are therefore inaccessible for further analysis of this key chromosome region. To close this gap, we propose to carry out a comprehensive analysis of large-scale variations in human subtelomeric regions, to clone and collaboratively sequence subtelomeric alleles carrying unique subtelomeric size variants at each telomere, and to develop PCR-based marker sets capable of distinguishing individual large-scale subtelomeric variants in the human population.