We have established that human genome sequences encoding a novel protein domain, DUF1220, are highly amplified in the human lineage (>200 copies vs. 1 in mouse/rat) and may be important in human-specific cognitive function. The majority of DUF1220 domains are located at 1q21.1, one of the most complex regions of the human genome, which is filled with gaps and segmental duplications. Copy number variations (CNVs) in the 1q21.1 region have now been implicated in numerous diseases associated with cognitive dysfunction, e.g., autism, mental retardation, schizophrenia, microcephaly and macrocephaly. These findings may indicate a novel recurrent rearrangement and reflect a new cognition-related syndrome specific to the 1q21.1 region. To identify the CNV boundaries and causal disease genes in these patients more precisely, a haploid BAC library will be used to generate a finished sequence map of the region. Sequencing of 1q21.1 BACs from a hydatidiform (haploid) mole library will be carried out in collaboration with Dr. Rick Wilson at the Washington University in St. Louis Genome Center to generate a single haplotype path across the region. The finished 1q21.1 sequence will be used for fine mapping of already identified disease-associated CNVs for autism, mental retardation, microcephaly and macrocephaly, through collaborations with the laboratories of Drs. Evan Eichler and James Lupski. High-density custom tiling arrays will be generated for the finished 1q21.1 region and used for array CGH to fine-map CNV breakpoints in these patients and identify candidate genes. In addition, the role of DUF1220 domain copy number in autism will be investigated by qPCR analysis of individuals with autism using DUF1220-specific primers.
To investigate the function of DUF1220 domains in a living mammal, the DUF1220-minus mice that we have generated (the first animal model for DUF1220 function) will be subjected to behavioral testing to assess the effect of DUF1220 domain loss on learning and memory. PUBLIC HEALTH RELEVANCE: In addition to providing the scientific community with the most complete genome map of this complex genomic region, these studies should lead to a better understanding of which genes in the 1q21.1 region, including those encoding DUF1220 domains, underlie the specific diseases of cognition that have been shown to be associated with CNVs in this region. The behavioral studies using DUF1220-minus mice should also generate the first insights into the possible cognitive function of these domains in a living mammal.