This research program focuses on the use of phylogenetic and comparative genomic techniques to study developmental proteins that play a fundamental role in the specification of body plan, pattern formation, and cell fate determination during metazoan development. A variety of bioinformatic approaches are used to understand the evolution and function of these proteins and their ultimate role in human disease. Building upon our prior work on the origin and early evolution of the Hox genes, we have turned our focus to analyzing the genomes of early branching metazoan phyla to better understand the relationship between genomic complexity and morphological complexity, as well as the molecular basis for the evolution of novel cell types. Thematically, our research program aligns well with a number of themes articulated in NHGRI's most recent document outlining a vision for the future of genomic research: most importantly, the need to probe the interface between genomics and developmental biology, and to conduct comparative, genomics-based research with an evolutionary point of view.
Until recently, only three of the four non-bilaterian metazoan lineages (Porifera, Placozoa, and Cnidaria) had at least one species whose genome had been sequenced. Ctenophora (the comb jellies) remained the last non-bilaterian animal phylum without a sequenced genome, and its phylogenetic position remained uncertain. With the goal of understanding the molecular innovations that drove the outbreak of diversity and increasing complexity in the early evolution of animals, we sequenced, assembled, annotated, and analyzed the 150-megabase genome of the ctenophore Mnemiopsis leidyi (Ryan et al., submitted). Our sequence assembly is available through GenBank, with additional comprehensive genomic information available through our Mnemiopsis Genome Portal (http://research.nhgri.nih.gov/mnemiopsis).
Our phylogenomic analyses of the Mnemiopsis genome strongly suggest that the ctenophores are sister to all other animals. Based on analyses of gene content, our results suggest that neural and mesodermal cell types were either lost in Porifera and Placozoa, or that they evolved independently at least twice. These findings challenge long-held ideas regarding not only the phylogenetic position of the ctenophores, but also the evolution of the aforementioned cell types. The availability of these sequence data has already begun to benefit multiple scientific communities (i.e., marine, evolutionary, and developmental biologists) and has enabled us to answer some important questions regarding phylogenetic diversity and the evolution of proteins that play a fundamental role in metazoan development. One such example focuses on the vital role that microRNAs play in the regulation of gene expression. Using short RNA sequencing data and the assembled Mnemiopsis genome, we were able to show that this species appears to lack any recognizable microRNAs, as well as the nuclear proteins Drosha and Pasha, which are critical to canonical microRNA biogenesis (Maxwell et al., 2012). This finding represents the first reported case of a metazoan lacking a Drosha protein. Since our recent phylogenomic analyses suggest that Mnemiopsis may belong to the earliest branching metazoan lineage, these findings provide support for the origins of canonical microRNA biogenesis and microRNA-mediated gene regulation post-dating the last common metazoan ancestor. Alternatively, canonical microRNA functionality may have been lost independently in early lineages, suggesting that microRNA functionality was not critical until much later in metazoan evolution. In either case, these data shed light on a point in evolutionary time that may have predated the need for additional plasticity in developmental signaling networks.
Our recent completion of the sequencing of Mnemiopsis has also provided us with the opportunity to examine the genome of an organism that uses calcium-activated photoproteins for bioluminescence. We found two genomic clusters containing a total of at least 10 full-length photoprotein genes that likely arose through multiple gene duplication events, providing the basis for the first metazoan phylogeny for the photoprotein gene family; the phylogeny indicates that this gene family arose at the base of the Metazoa (Schnitzler et al., 2012). We were also able to demonstrate co-localized expression of photoprotein genes and two putative opsin genes in developing Mnemiopsis photocytes, showing for the first time that these cells have the capacity to both sense and respond to stimuli. This is the first reported instance of photoreception and light production co-occurring and being functionally linked in the same cell of a single organism. These findings may shed new light on the evolution of the eye, especially during the Cambrian explosion.
Most recently, we have given significant consideration to how these early branching animals could be used in the context of human disease research. While the standardization of methods for studying human diseases in traditional animal models has yielded many clinically actionable results, it has effectively narrowed the breadth of species in which we choose to look for insights. The recent expansion of whole-genome sequence data available from a diverse array of animal lineages provides an opportunity to investigate the feasibility of using non-traditional model organisms to advance human disease research. Cases in which traditional animal models have led to conclusions that are not applicable to humans are becoming more commonplace, and the concern that this may be a growing problem calls for a re-evaluation of how appropriate models are selected for different disease classes.
To that end, we have used a comparative genomics approach that encompasses a wide range of animals across the metazoan tree to determine which organisms could serve as viable models for studying various classes of human diseases (manuscript in preparation). We show that some emerging non-bilaterian model organisms have surprisingly high proportions of human disease gene homologs despite their great evolutionary distance from humans; these organisms may confer advantages as animal models in terms of their ease of use, short generation times, and cost-effectiveness. Conversely, while it has been previously shown that the genes implicated in the causation of most human diseases are of ancient origin, our results indicate that some disease classes involve a significantly large proportion of genes that appear to have emerged relatively recently within the Metazoa. These disease classes, having a more recent evolutionary history, may be difficult to replicate phenotypically outside of our closest animal relatives. Taken together, these findings demonstrate why model organism selection should be done on a disease-by-disease basis, with evolutionary profiles in mind.
Finally, as an outgrowth of our studies on the homeodomain class of proteins, we have developed and continue to maintain the Homeodomain Resource, a curated collection of sequence, structure, interaction, genomic, and functional information on the homeodomain family (Moreland et al., 2009). The Resource is organized in a compact form and provides user-friendly interfaces for both querying and assembling customized datasets. The current release contains 1,623 full-length homeodomain-containing sequences from 32 distinct organisms, 107 experimentally derived three-dimensional structures, 101 homeodomain protein-protein interactions, 122 homeodomain binding sites, 53 homeodomain proteins with documented allelic variants, and 186 homeodomain proteins implicated in human genetic disorders. The Homeodomain Resource is freely available at http://research.nhgri.nih.gov/homeodomain/.
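The disease-by-disease comparison of human disease gene homologs described above could be tabulated along the following lines. This is only a minimal sketch under stated assumptions, not the method used in the manuscript in preparation: the function name `homolog_proportions`, the disease class labels, the species labels, and the gene identifiers are all hypothetical placeholders, and the step that decides whether a species has a homolog of a given human gene is assumed to have been carried out separately.

```python
from collections import defaultdict


def homolog_proportions(disease_classes, homologs_by_species):
    """Return {disease_class: {species: fraction of class genes with a homolog}}.

    disease_classes: maps a disease class to the set of human genes in it.
    homologs_by_species: maps a species to the set of human genes for which
        a homolog was detected in that species (detection itself is assumed
        to have been done upstream by whatever homology search is preferred).
    """
    result = defaultdict(dict)
    for disease_class, genes in disease_classes.items():
        for species, found in homologs_by_species.items():
            # Fraction of this class's genes that have a homolog in the species.
            fraction = len(genes & found) / len(genes) if genes else 0.0
            result[disease_class][species] = fraction
    return dict(result)


# Toy input with placeholder names (hypothetical, for illustration only).
disease_classes = {
    "class_A": {"GENE1", "GENE2", "GENE3", "GENE4"},
    "class_B": {"GENE5", "GENE6"},
}
homologs_by_species = {
    "species_X": {"GENE1", "GENE2", "GENE3", "GENE5"},
    "species_Y": {"GENE1"},
}

proportions = homolog_proportions(disease_classes, homologs_by_species)
for disease_class, per_species in sorted(proportions.items()):
    for species, fraction in sorted(per_species.items()):
        print(f"{disease_class}\t{species}\t{fraction:.2f}")
```

A table of this shape, with one row per disease class and one column per species, makes it straightforward to see which candidate model organisms retain most of the genes for a given disease class and which classes are poorly represented outside our closest relatives.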