Dissecting transcriptional regulatory networks is essential for understanding development and the molecular basis of many diseases. Great progress has been made, and the emerging view is that the presence of individual regulatory elements is rarely sufficient to explain spatially and temporally specific gene expression; instead, regulatory elements are usually organized into functional units called modules. A module controls gene expression in a particular context independent of its position and orientation. Experimental identification of modules is often laborious and expensive. Computational approaches can be fast and inexpensive, but the development of computational methods to identify modules is still in its infancy. The goal of the proposed research is to use C. elegans as a model system to develop and validate computational strategies for identifying regulatory modules in genomic sequences. First, the regulatory regions of a set of genes that are preferentially expressed in C. elegans muscle tissue, together with those of the orthologous genes in related species, will be used to identify muscle-specific regulatory motifs. Several different computational approaches and various existing computational tools will be employed. Next, statistical analysis will be used to identify combinations of motifs that are enriched in muscle-specific genes relative to the genome at large and to analyze the interactions among motifs. This will provide insight into the rules that govern how cis-regulatory elements are organized into biologically active modules. The information will be used to develop computational tools to identify modules that control muscle-specific transcription. To validate computational predictions in vivo, GFP reporter gene constructs will be used to determine whether a gene is expressed in muscle tissue and to test putative regulatory modules. The results of the validation experiments will be used to refine and improve our algorithms.
Although I will focus my efforts on muscle-specific gene expression, I believe the approaches and tools I develop will be generally useful for identifying regulatory modules in many other contexts. The computational tools I develop will be made freely available to the scientific community.
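
As an illustration of the enrichment analysis described above, the following is a minimal sketch of one possible test, not the method the proposal commits to. It assumes a one-sided hypergeometric test asks whether a pair of motifs co-occurs in the muscle gene set more often than expected from the genome at large. The function names, motif labels, and gene identifiers are all hypothetical, and the toy data are made up purely to show the call pattern.

```python
from math import comb


def count_with_pair(gene_motifs, motif_a, motif_b):
    """Count genes whose regulatory region contains both motifs.

    gene_motifs maps a gene identifier to the set of motif names
    found in its regulatory region (hypothetical data layout).
    """
    return sum(
        1
        for motifs in gene_motifs.values()
        if motif_a in motifs and motif_b in motifs
    )


def hypergeom_enrichment_pvalue(genome_total, genome_with_pair,
                                muscle_total, muscle_with_pair):
    """One-sided hypergeometric p-value, P(X >= muscle_with_pair).

    Models the muscle gene set as muscle_total genes drawn without
    replacement from genome_total genes, of which genome_with_pair
    carry the motif pair.
    """
    max_k = min(genome_with_pair, muscle_total)
    denominator = comb(genome_total, muscle_total)
    tail = sum(
        comb(genome_with_pair, k)
        * comb(genome_total - genome_with_pair, muscle_total - k)
        for k in range(muscle_with_pair, max_k + 1)
    )
    return tail / denominator


# Toy, made-up data: gene identifiers and motif names are placeholders.
genome = {
    "gene1": {"M1", "M2"}, "gene2": {"M1"}, "gene3": {"M1", "M2"},
    "gene4": {"M2"}, "gene5": set(), "gene6": {"M1", "M2"},
    "gene7": {"M3"}, "gene8": {"M1"}, "gene9": set(), "gene10": {"M2"},
}
muscle_ids = ["gene1", "gene3", "gene4", "gene5"]
muscle = {gene_id: genome[gene_id] for gene_id in muscle_ids}

p_value = hypergeom_enrichment_pvalue(
    genome_total=len(genome),
    genome_with_pair=count_with_pair(genome, "M1", "M2"),
    muscle_total=len(muscle),
    muscle_with_pair=count_with_pair(muscle, "M1", "M2"),
)
print(f"enrichment p-value for the M1 + M2 pair: {p_value:.4f}")
```

In a genome-scale analysis, many motif combinations would be tested at once, so the resulting p-values would need a multiple-testing correction before any pair is treated as a candidate module.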