Long-term objectives of this research include identification of the gene(s) or genomic rearrangements on chromosome Xq27 that are involved in predisposition to prostate cancer, and insight into the functions of SPANX in normal and malignant cells. A potentially far-reaching objective is to define the structural requirements for de novo kinetochore formation in human cells. Success in achieving this goal would be an important step towards developing human artificial chromosome (HAC)-based gene therapy and/or gene delivery systems for human disease.
1) Genetic linkage studies implicate a gene or genes at Xq27 in prostate cancer susceptibility. The corresponding region spans 750 kb and includes five SPANX genes (SPANX-A1, -A2, -B, -C, and -D), which encode proteins that are expressed in sperm nuclei and in a variety of cancer cells. The SPANX genes at Xq27 are highly conserved (i.e., >95% homology) and reside within large segmental duplications (SDs), which complicates mutational analysis of these genes by PCR-based methods. However, we recently succeeded in performing mutational analysis of Xq27-linked SPANX genes from prostate cancer patients, using transformation-associated recombination (TAR) to clone and sequence these genes. This analysis revealed frequent gene deletion/duplication and homology-based sequence transfers involving SPANX genes at Xq27, suggesting that SD-mediated homologous recombination involving the SPANX genes might lead to increased genetic instability and possibly to a higher level of genetic diversity in SPANX genes. The mutational analysis showed that no DNA sequence variation or genetic haplotype in the SPANX gene cluster was associated with susceptibility to prostate cancer. However, it remains possible that Xq27-linked prostate cancer susceptibility is related to variation in the architecture of the SPANX-A/D gene cluster. We recently identified a second family of SPANX genes, SPANX-N, in the human genome.
The SPANX-N gene cluster includes SPANX-N1, SPANX-N2, SPANX-N3, SPANX-N4 and SPANX-N5. SPANX-N proteins share 50-80% identity with each other and 40-50% identity with the SPANX-A/D proteins. Given the proposed role of SPANX genes in spermatogenesis and cancer, we have extended our studies to SPANX-N gene evolution, variation, regulation of expression, and intra-sperm localization. We developed a set of SPANX-specific antibodies that recognize SPANX-A/D or SPANX-N proteins. Immunofluorescence studies showed that SPANX-N proteins, like SPANX-A/D proteins, localize exclusively to post-meiotic spermatids. However, they localize to the acrosome rather than the nuclear envelope, and they are expressed at a low level in several non-gonadal adult tissues as well as in many cancer cells. Thus, our findings are consistent with the possibility that the recent duplication of the SPANX genes was accompanied by diversification of the function of the SPANX proteins in hominoids, leading to differential localization of SPANX-N and SPANX-A/D proteins in post-meiotic sperm and differential expression of SPANX-N proteins in non-gonadal adult tissues. Additional study of SPANX genes and gene products is likely to yield valuable insight into spermatogenesis and carcinogenesis as well as genomic stability and diversifying selection.
2) The structural requirements for de novo kinetochore formation are being analyzed using HAC constructs carrying synthetic alphoid DNA arrays. Our novel RCA-TAR method (7) for constructing these arrays exploits in vivo recombination in yeast to generate synthetic alphoid DNA arrays up to 120 kb in length from defined oligomer substrates.
Using these alphoid DNA-containing constructs, we are pursuing answers to the following questions: a) what is the minimal length of alphoid DNA required for the seeding of CENP-A chromatin and stable maintenance of a functional kinetochore; b) what is the structural and/or functional role of the CENP-B binding sites in alphoid DNA; and c) what is the role of vector sequences in HAC formation. To determine how much CENP-A chromatin is required to form and maintain a functional human centromere, alpha-satellite DNA arrays of different lengths were constructed and tested for their ability to nucleate CENP-A chromatin and form a HAC with a functional kinetochore. Our results suggest that a minimum core maintained on 30-70 kb alphoid DNA arrays represents an epigenetic memory of centromeric chromatin. We also investigated the possible contribution to HAC formation of the vector sequence, which is necessarily present in all alphoid DNA constructs. Our recent chromatin immunoprecipitation (ChIP) analysis showed that while alphoid DNA promotes assembly of CENP-A chromatin and chromatin containing di-methylated histone H3, the vector-derived Neo transcriptional cassette promotes assembly of H3K4me2 and H3K4me3 chromatin (histone modifications characteristic of euchromatin) as well as H3K9me3 chromatin (characteristic of heterochromatin). Thus, the vector-derived transcriptional cassette may have a significant impact on chromatin structure and kinetochore function in HACs. Using the RCA-TAR method, we have constructed a novel human artificial chromosome (HAC) that allows the epigenetic state of chromatin within an active kinetochore to be manipulated. The HAC has a dimeric alpha-satellite repeat containing one natural monomer with a CENP-B binding site and one fully synthetic monomer in which the CENP-B box is replaced by a tetracycline operator (tetO). Targeting of several tetracycline repressor (tetR) fusions to the centromere had no effect on kinetochore function.
However, altering the chromatin state to a more open configuration with the tTA transcriptional activator or to a more closed state with the tTS transcriptional silencer caused mis-segregation and loss of the HAC. tTS binding caused the loss of CENP-A, CENP-B, CENP-C and H3K4me2 from the centromere, accompanied by an accumulation of histone H3K9me3. The HAC opens a new spectrum of opportunities for the systematic manipulation of the histone code within the kinetochore and for definition of the full epigenetic signature of centromeric chromatin. The new HAC with a conditional centromere also has great potential as a system for gene delivery and regulated gene expression in mammalian cells. Centromeric and pericentromeric regions of mammalian chromosomes contain repetitive DNA sequences that exhibit a high rate of evolutionary change. The exact role of these sequences with respect to kinetochore/heterochromatin structure and function is not yet understood. We recently investigated the effect of several repetitive elements on expression of a transgene cassette stably integrated into a mouse chromosome in mouse erythroleukemia cells. Our results show that human gamma-satellite DNA derived from the pericentromeric region of human chromosome 8 prevents the epigenetic silencing of an eGFP reporter gene that is induced by vector DNA. Electrophoretic mobility shift assays showed the presence of CTCF binding sites in the gamma-satellite arrays from human chromosomes 8, X and Y. We also showed that these sites are protein-bound independently of their methylation status. Chromatin immunoprecipitation experiments confirmed that CTCF binds these sites in vivo. Given the discovery of gamma-satellite DNA in most human chromosomes, these data suggest that pericentromeric gamma-satellite DNA plays a role in maintaining a mosaic structure in human centromere-associated chromatin and may prevent heterochromatin spreading into and/or beyond the pericentromeric region.
The strong anti-silencing activity of gamma-satellite DNA also suggests that these sequences may be useful for promoting stable expression of ectopic transgenes inserted into different chromosomal locations as well as into a HAC.