We report on several specific projects concerning the sequence-dependent structural characteristics of DNA that are important for interactions with proteins (including the bacterial gal repressor and human p53), higher-order self-organization of genomic DNA, and gene regulation.

1. DNA looping in prokaryotes

DNA looping facilitates the stabilization of multi-subunit protein-DNA complexes. One of the best-characterized examples is the gal loop in E. coli, which is involved in regulation of the gal operon. To determine the optimal trajectory of the DNA loop in such a complex structure, we used a knowledge-based elastic model that we developed earlier for large-scale DNA simulations. We found that the "antiparallel" gal loop is energetically more favorable than the "parallel" one. The same trend was found for the DNA loop formed upon binding of the lac repressor to DNA. Based on these computations, we designed detailed experiments to visualize the 3D organization of DNA loops in bacteria. Atomic force microscopy supports "antiparallel" DNA looping with both the gal and lac repressors. These results imply that "antiparallel" DNA looping may be a general feature of the condensed bacterial nucleoid, as opposed to the parallel "wrapping" of DNA around histones in eukaryotic chromatin. Importantly, regular DNA folding in prokaryotes is consistent with the periodic distribution of curved A-tracts in bacterial genomes, described below.

2. Distribution of A- and G-tracts and DNA packaging in pro- and eukaryotes

Periodic positioning of A- and G-tracts causes DNA curvature in solution and facilitates DNA bending in complexes with proteins. Here, we analyzed the distribution of these sequences in pro- and eukaryotic genomes. We found that the distribution of the strongly bent A-tracts (4-7 bp) in prokaryotic genomes shows a remarkable periodicity of 10-11 bp.
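A periodicity of this kind can be checked by building a histogram of the distances between A-tract start positions. The following is a minimal sketch, not the procedure used in this work. It assumes that an A-tract is a maximal run of 4-7 A's (or T's, i.e. an A-run on the complementary strand), and the function names `a_tract_starts` and `distance_histogram`, the 60 bp distance cutoff, and the toy sequence are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def a_tract_starts(seq, min_len=4, max_len=7):
    """Return start positions of maximal A-runs or T-runs of min_len..max_len bp.

    Assumption: an "A-tract" is approximated by a homopolymeric run of A
    (or of T, which is an A-run on the opposite strand).
    """
    starts = []
    for m in re.finditer(r"A+|T+", seq.upper()):
        if min_len <= len(m.group()) <= max_len:
            starts.append(m.start())
    return starts


def distance_histogram(positions, max_dist=60):
    """Count distances between all pairs of tract starts up to max_dist bp.

    `positions` must be sorted in ascending order (re.finditer guarantees this).
    """
    counts = Counter()
    for i, p in enumerate(positions):
        for q in positions[i + 1:]:
            d = q - p
            if d > max_dist:
                break  # later positions are even farther away
            counts[d] += 1
    return counts


# Toy sequence (hypothetical): one 5 bp A-tract every 10 bp.
toy = "AAAAACGCGC" * 6
hist = distance_histogram(a_tract_starts(toy))
for d in sorted(hist):
    print(d, hist[d])
```

On the toy sequence all counts fall at multiples of 10 bp. On a real genome, a 10-11 bp periodicity would appear as regularly spaced maxima in the same histogram above a roughly flat background.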
This 10-11 bp periodicity of the A-tracts may reflect the intrinsic propensity of prokaryotic DNA to form loop-shaped structures. Based on these data, and by analogy with the "gene repression" gal and lac loops in E. coli, we hypothesize that loop folds with a structural period of 100 bp may be the elementary units of prokaryotic nucleoid packaging. We tested this hypothesis by micrococcal nuclease digestion of bacterial nucleoids (in collaboration with S. Adhya, NCI). The results show that 100 bp DNA fragments are highly overrepresented among the digestion products, implying a highly specific nucleoid packaging with a DNA structural period of 100 bp.

G-tracts, by comparison, are underrepresented in prokaryotic genomes. In eukaryotic genomes, both A- and G-tracts of all lengths are highly overrepresented. However, the "optimal" A-tracts (4-7 bp) show no 10-11 bp periodicity in eukaryotes. Apparently, the intrinsic DNA curvature caused by A-tracts is not a necessary prerequisite for the formation of nucleosomes. Rather, the overabundant long purine (A and G) runs observed in eukaryotic genomes may serve as "chromatin organizers" that decrease the propensity of DNA to form nucleosomes, especially in promoter regions. We are currently studying G-tract clustering in CpG islands, in particular in the vicinity of the promoters of cancer-related genes.

3. Genome-wide distribution of p53 sites in human DNA

Binding of the p53 tetramer to DNA plays a key role in tumor suppression. In response to DNA damage and other types of cellular stress, the p53 protein becomes activated and binds DNA sequence-specifically, functioning as a transcription factor or cell-cycle regulator. p53 is unique in regulating a wide spectrum of genes: thousands of human genes are either activated or repressed by p53. Normally, the p53 tetramer binds to DNA response elements consisting of two decamers RRRCWWGYYY (half-sites) separated by a spacer.
(The length of the spacer, S, varies from 0 to 14 bp in the known functional binding sites, but in most cases S=0 or 1.) How many putative p53 binding sites consistent with this scheme are there in the human genome? What is the distribution of the spacer lengths? With the human genome sequence available, we can answer these questions directly. The distribution of spacer lengths proves to be extremely nonuniform in all human chromosomes, with strong peaks that exceed the average background 3- to 4-fold. The peaks at S=0 and S=10 bp and the gap at S=4-5 bp are consistent with our earlier computer modeling and electrophoresis measurements, which indicate lateral positioning of the p53 core domains on the outer side of the DNA loop. In general, these data agree with the idea that the p53 tetramer can bind DNA specifically without unwrapping nucleosomes during transcriptional activation of chromatin-assembled genes.

Currently, we are exploring the localization of putative p53 sites with respect to transcription start sites. Our data indicate a strong difference between up- and down-regulated genes in the distribution of p53 sites in their vicinity. The up-regulated genes have a twofold higher occurrence of p53 sites within 1 kbp of the transcription start than the down-regulated genes. In addition, the down-regulated genes show a higher fraction of p53 sites with the "unusual" spacer S=3, shown earlier to repress transcription of the corresponding genes. These observations are important because they can be used to predict p53-activated and p53-repressed genes. In summary, the distribution of p53 sites in the human genome reflects the versatility of p53 binding and its tumor suppressor functions.
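The site definition used above (two RRRCWWGYYY half-sites separated by a spacer of 0 to 14 bp) translates directly into a sequence scan. The following is a minimal sketch of such a scan, not the pipeline used in this work. It assumes exact matching to the consensus with no mismatches allowed, and the names `iupac_to_regex` and `spacer_distribution` and the toy sequence are hypothetical. Because RRRCWWGYYY is its own reverse complement, scanning one strand is sufficient.

```python
import re
from collections import Counter

# IUPAC degenerate codes used in the half-site consensus.
IUPAC = {"R": "[AG]", "W": "[AT]", "Y": "[CT]"}
HALF_SITE_LEN = 10


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as RRRCWWGYYY into a regular expression."""
    return "".join(IUPAC.get(ch, ch) for ch in motif)


HALF_SITE = iupac_to_regex("RRRCWWGYYY")


def spacer_distribution(seq, max_spacer=14):
    """Count putative full sites (two half-sites) by spacer length S.

    Assumption: both half-sites must match the consensus exactly.
    """
    # A zero-width lookahead reports overlapping matches as well.
    starts = [m.start() for m in re.finditer(f"(?={HALF_SITE})", seq.upper())]
    start_set = set(starts)
    counts = Counter()
    for p in starts:
        for s in range(max_spacer + 1):
            if p + HALF_SITE_LEN + s in start_set:
                counts[s] += 1
    return counts


# Toy sequence (hypothetical): one site with S=0 and one with S=3,
# separated by a stretch of N that matches no half-site position.
half = "AGACATGTCT"
toy = half + half + "N" * 20 + half + "NNN" + half
print(dict(spacer_distribution(toy)))
```

Run chromosome by chromosome, the resulting counts per spacer length give the kind of profile discussed above. Comparing them with counts from a shuffled or otherwise randomized sequence is one way to estimate the background level.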