DESCRIPTION: (Applicant's Description) The single aim of this proposal is to identify 3000 BACs distributed at a density of ~1 per Mbp across the human genome to serve as tools for cytogenetic analyses in situ and in silico. To accomplish this aim, we will first FISH-map >3000 BACs derived from two IRB-approved libraries. The majority will be selected randomly from the set of BACs whose end-sequences will be placed on radiation hybrid (RH) maps through the efforts of our collaborators (Adams, Venter, Hood, and Cox). By FISH-mapping these BACs, we will identify at least one clone for each of 600 cytogenetic bands, or 1 per 5 Mbp. These cytogenetic markers will be selected for their robust and efficient performance in FISH assays. Importantly, this process will tightly integrate the cytogenetic and RH maps. As a consequence, it will be a simple exercise to select BACs situated at 1-Mbp intervals from the 30,000 that will be positioned on the RH maps within the next 3 years. Thus, the desired set of BACs, spaced at 1 Mbp and containing RH-mapped STSs, will be identified without extensive library screens. All the BACs will be part of the STC (BAC end-sequence) resource, the basis of a strategy to sequence the genome efficiently and cost-effectively. Their positions in the final sequence will therefore be known precisely, allowing researchers to readily obtain overlapping clones and the sequence of rearranged regions. Our proposal includes plans to streamline the FISH procedure, to distribute cytogenetic location data via several public databases, and to distribute two sets of cytogenetic markers, a 1-per-band set and a 1-per-Mbp set, through a collaboration with Research Genetics. After assembly, both sets will be quality-controlled by FISH analyses of subpools.
Our strategy has the added benefit of yielding critical information on the randomness and chimerism frequency of the libraries that are the foundation of large-scale efforts to sequence the human genome. In addition, the random FISH survey we propose is of sufficient scale to estimate the frequency and distribution of large low-copy duplications. These features of the genome are likely to be involved in the generation of chromosomal rearrangements, are associated with the evolution of gene families, and pose a challenge to completing the sequence of the human genome.
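
As an illustration only, the step of choosing BACs at 1-Mbp intervals from a larger mapped set could be sketched as follows. The sketch assumes each clone already has an estimated position in Mbp on a single chromosome; RH maps report distances in their own units, so that conversion is an assumption here. The function name `select_one_per_interval`, the clone names, and the positions are hypothetical and are not part of the proposal.

```python
def select_one_per_interval(clones, interval_mbp=1.0):
    """Keep at most one clone per interval, preferring the most central one.

    clones: iterable of (name, position_mbp) pairs for one chromosome.
    Positions in Mbp are an assumption of this sketch.
    """
    best = {}  # interval index -> (distance to interval centre, name, position)
    for name, pos in clones:
        index = int(pos // interval_mbp)
        centre = (index + 0.5) * interval_mbp
        distance = abs(pos - centre)
        if index not in best or distance < best[index][0]:
            best[index] = (distance, name, pos)
    # Return the chosen clones in map order; intervals with no clone are skipped.
    return [(best[i][1], best[i][2]) for i in sorted(best)]


# Hypothetical clones and positions, for illustration only.
example_clones = [("A", 0.1), ("B", 0.45), ("C", 1.9), ("D", 1.6), ("E", 3.2)]
chosen = select_one_per_interval(example_clones)
print(chosen)
```

Intervals that contain no mapped clone stay empty in this sketch; filling such gaps would require additional clones.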