Recent years have seen the completion of multiple vertebrate genome sequences and have revealed a surprising degree of conservation of non-coding sequence; at least twice as much non-coding as coding sequence appears to be conserved between human and mouse. Conservation is a commonly used indicator of functional constraint. We hypothesize that many conserved non-coding sequences are involved in the regulation of gene expression, although we currently lack the tools to predict their function from primary sequence. A full understanding of these sequences depends on functional testing in vivo, but comprehensive analysis through mouse transgenesis is cost-prohibitive and cannot readily reveal dynamic changes in gene regulation. We propose to apply transposon-based transgenesis in zebrafish to functionally analyze a large number of conserved non-coding elements in vivo. We will initially focus on genes important in the morphogenesis of the skeleton, encoding a variety of protein products that have been implicated in development and human disease, and for which the zebrafish orthologues have been characterized. In Specific Aim 1, we will identify >200 conserved sequences associated with selected human genes and test them for their ability, in conjunction with a minimal promoter, to drive reporter gene expression in zebrafish embryos. These experiments will create a single data set of unprecedented size, correlating primary sequence with regulatory function in a vertebrate organism. We also propose to test the degree to which reliance on evolutionary sequence conservation enriches for regulatory function. Preliminary experiments suggest that some fraction of regulatory elements is missed by standard computational approaches. Therefore, we will construct a "tiling path" across a single locus and test all sequences in the interval for regulatory function, including those showing no overt conservation beyond primates.
In Aim 2, we will carry out parallel analyses on zebrafish orthologues of the human genes analyzed in Aim 1. To date, we have accumulated numerous examples of human enhancer elements that function appropriately in zebrafish, despite lack of overt sequence similarity to orthologous regions. We aim to identify, for some of these elements, corresponding zebrafish enhancers with similar function; comparison of these sequences can be used to refine computational predictions of regulatory elements. In the last Aim, we will evaluate use of the phage φC31 site-specific recombinase in zebrafish to facilitate re-engineering of existing transgenes in situ. This technology will greatly enhance the utility of the transgenic lines generated in Aims 1 and 2, and will be broadly applicable in zebrafish for other purposes. An important goal of our proposal is to dissect the regulatory elements controlling expression of key genes during development of a single organ system, and to test their possible relevance to human disease. However, we also aim to establish a paradigm, and generate reagents, that will have wide applicability to understanding the wealth of comparative information arising out of genome sequencing efforts. PUBLIC HEALTH RELEVANCE: The comparison of genome sequences from multiple organisms has highlighted the surprising degree of sequence conservation outside of gene coding regions, and we believe that many of the conserved sequences are involved in the regulation of gene expression. We have developed an approach in zebrafish to test the ability of DNA sequences to regulate gene expression, and the optical clarity, rapid development, and abundance of the zebrafish embryo allow us to perform these experiments on a scale much larger than is practical in other model organisms such as the mouse.
Through application of this approach, we aim to establish a paradigm and generate reagents that will aid our understanding of the wealth of information arising out of genome sequencing efforts.