This research program focuses on the use of phylogenetic and comparative genomic techniques to better understand the evolution of homeodomain transcription factors and their role in pattern formation and cell fate determination during metazoan development. A variety of bioinformatic approaches are used to understand the evolution and function of these proteins and their ultimate role in human disease. Homeobox (or Hox) genes are organized in conserved genomic clusters across a range of phylogenetic taxa. Over evolutionary time, the functional diversification of these Hox genes has contributed to the diversification of animal body plans. Building upon our prior work on the origin and early evolution of these Hox genes, our focus has turned to analyzing the genomes of early-branching metazoan phyla to better understand the relationship between genomic complexity and morphological complexity, as well as the molecular basis for the evolution of novel cell types. Until recently, only three of the four non-bilaterian metazoan lineages (Porifera, Placozoa, and Cnidaria) had at least one species whose genome had been sequenced. Ctenophora (the comb jellies) remained the last non-bilaterian animal phylum without a sequenced genome, and its phylogenetic position remained uncertain. With the goal of understanding the molecular innovations that drove the burst of diversity and increasing complexity in the early evolution of animals, we sequenced, assembled, annotated, and performed a preliminary analysis of the 150-megabase genome of the ctenophore Mnemiopsis leidyi. Our sequence assembly was deposited into GenBank earlier this year, and the paper describing this genome will be submitted for publication shortly (Ryan et al., in preparation).
Thematically, our sequencing project (and subsequent analyses) aligns well with several themes articulated in NHGRI's most recent document outlining a vision for the future of genomic research, most importantly the need to probe the interface between genomics and developmental biology and to conduct comparative, genomics-based research from an evolutionary point of view. The availability of these sequence data has already begun to benefit multiple scientific communities (i.e., marine, evolutionary, and developmental biologists) and has enabled us to answer some important questions regarding phylogenetic diversity and the evolution of proteins that play a fundamental role in metazoan development. For example, during this reporting period, we were able to identify and characterize all known LIM domain-containing proteins in six metazoans and three non-metazoans (Koch et al., 2012). Phylogenetic analyses of this data set revealed a number of novel non-LIM domains and motifs in each of these proteins. This allowed us to formalize a classification system for the LIM proteins, determine the relative timing of class- and family-origin events, and identify lineage-specific loss events. This study determined that six of the 14 LIM classes originated in the metazoan stem lineage, and that the observed expansion of the LIM superclass at the base of the Metazoa allowed for the increase in complexity required for the transition from a unicellular to a multicellular lifestyle. Put another way, these evolutionary events represent a critical step in the emergence of multicellularity in animal species. We have also focused on the vital role that microRNAs play in the regulation of gene expression.
Using short RNA sequencing data and the assembled Mnemiopsis genome, we were able to show that this species appears to lack any recognizable microRNAs, as well as the nuclear proteins Drosha and Pasha, which are critical to canonical microRNA biogenesis (Maxwell et al., submitted). This finding represents the first reported case of a metazoan lacking a Drosha protein. Since our recent phylogenomic analyses suggest that Mnemiopsis may be the earliest-branching metazoan lineage, these findings support the view that canonical microRNA biogenesis and microRNA-mediated gene regulation originated after the last common metazoan ancestor. Alternatively, canonical microRNA functionality may have been lost independently in early lineages, suggesting that microRNA functionality was not critical until much later in metazoan evolution. In either case, these data shed light on a point in evolutionary time that may have predated the need for additional plasticity in developmental signaling networks. Our recent completion of the sequencing of Mnemiopsis has also provided us with the opportunity to examine the genome of an organism that uses calcium-activated photoproteins for bioluminescence. We found two genomic clusters containing a total of at least 10 full-length photoprotein genes that likely arose through multiple gene duplication events, providing the basis for the first metazoan phylogeny for the photoprotein gene family; the phylogeny indicates that this gene family arose at the base of the Metazoa (Schnitzler et al., submitted). We were also able to demonstrate co-localized expression of photoprotein genes and two putative opsin genes in developing Mnemiopsis photocytes, showing for the first time that these cells have the capacity to both sense and respond to stimuli. This is the first reported instance of photoreception and light production co-occurring and being functionally linked in the same cell of a single organism.
These findings may shed new light on the evolution of the eye, especially during the Cambrian explosion. With our collaborators at Iowa State University, we analyzed the mitochondrial (mt-) genome of Mnemiopsis (Pett et al., 2011). At just over 10 kb, the mtDNA of Mnemiopsis is the smallest animal mtDNA reported to date, and it is also among the most derived. It has lost at least 25 genes, including atp6 and all tRNA genes. It appears that atp6 has been relocated to the nuclear genome and has acquired a mitochondrial targeting presequence. The encoded rRNA molecules possess little similarity to their homologs in other organisms and have highly reduced secondary structures. At the same time, nuclear-encoded mt-ribosomal proteins have undergone expansions, which may compensate for the reductions in mt-rRNA. With our collaborators at the University of Hawaii, we were able to identify a near-complete TGF-beta signaling pathway composed of nine ligands, four receptors, and five Smads, revealing that the core components are present in all metazoans studied to date (Pang et al., 2011). Notably absent are extracellular diffusible antagonists, including Chordin, Follistatin, Noggin, and CAN family members. We examined the expression of these genes during ctenophore development and found the ligands to be differentially expressed along all three body axes (i.e., oral-aboral, tentacular, and sagittal). While we do not believe this pathway is necessarily specifying these axes, since these genes are expressed after the axes are already specified, we do believe they are involved in transducing earlier signals. These findings indicate that the TGF-beta signaling pathway was present and most likely active early in metazoan evolution. With few components present in extant non-metazoans, it is highly probable that the emergence of this pathway was a key innovation in the transition to multicellularity in the metazoan ancestor.
As an outgrowth of our studies on the homeodomain class of proteins, we have developed and continue to maintain the Homeodomain Resource, a curated collection of sequence, structure, interaction, genomic, and functional information on the homeodomain family (Moreland et al., 2009). The Resource is organized in a compact form and provides user-fr