This research program focuses on the use of phylogenetic and comparative genomic techniques to study developmental proteins that play a fundamental role in the specification of body plan, pattern formation, and cell fate determination during metazoan development. Our group uses a variety of bioinformatic approaches to understand the evolution and function of these proteins and their ultimate role in human disease. Our focus has turned to analyzing the genomes of early branching metazoan phyla in an effort to better understand the relationship between genomic and morphological complexity, as well as the molecular basis for the evolution of novel cell types. Thematically, our current research interests are centered on probing the interface between genomics and developmental biology and on conducting comparative, genomics-based research with an evolutionary point of view, themes articulated in NHGRI's most recent document outlining a vision for the future of genomic research.
Until recently, only three of the four non-bilaterian metazoan lineages (Porifera, Placozoa, and Cnidaria) had at least one species whose genome had been sequenced. Ctenophora (the comb jellies) remained the last non-bilaterian animal phylum without a sequenced genome, and its phylogenetic position remained uncertain. With the goal of understanding the molecular innovations that drove the outbreak of diversity and increasing complexity in the early evolution of animals, we sequenced, assembled, annotated, and analyzed the 150-megabase genome of the ctenophore Mnemiopsis leidyi (Ryan et al., 2013). By addressing the void in the availability of high-quality, genome-scale sequence data in a critical part of the evolutionary tree, we were able to bring resolution to the question of the phylogenetic position of the ctenophores, with the results of our phylogenomic analyses strongly suggesting that ctenophores are the sister group to all other animals.
Based on analyses of gene content, our results also suggest either that neural and mesodermal cell types were lost in Porifera and Placozoa or that (to some extent) these cell types evolved independently in the ctenophore lineage. These findings challenge long-held ideas regarding not only the phylogenetic position of the ctenophores but also the evolution of the aforementioned cell types. The sequence data generated in the course of this project are available through GenBank, with additional comprehensive genomic information available through our Mnemiopsis Genome Project Portal, located at http://research.nhgri.nih.gov/mnemiopsis (Moreland et al., 2014).
The availability of these sequence data has already begun to benefit multiple scientific communities (i.e., marine, evolutionary, and developmental biologists) and has enabled us to answer some important questions regarding phylogenetic diversity and the evolution of proteins that play a fundamental role in metazoan development. One such example involves the Sox genes, a family of transcription factors that play a key role in developmental regulation in animals. True Sox genes had previously been identified in all animal lineages for which genomic sequence data were available, but these genes have not been found outside the Metazoa, indicating that this gene family arose at the origin of the animals. With the whole-genome sequence of Mnemiopsis leidyi in hand, we were able to examine the full complement and expression of the Sox gene family in the earliest branching animal lineage (Schnitzler et al., 2014). Our phylogenetic analyses of the Sox gene family were generally in agreement with previous studies and placed five of the six Mnemiopsis Sox genes into major Sox groups: SoxB (MleSox1), SoxC (MleSox2), SoxE (MleSox3, MleSox4), and SoxF (MleSox5), with one unclassified gene (MleSox6). We also investigated the expression of five of the six Mnemiopsis Sox genes during early development.
In situ hybridization generally revealed spatially restricted Sox expression in somatic cells within zones of cell proliferation, as determined by EdU staining. Our results are consistent with the established role of multiple Sox genes in the maintenance of stem cell pools. In light of our recent phylogenetic evidence that Ctenophora is the earliest branching animal lineage (Ryan et al., 2013), our results are also consistent with the hypothesis that the ancient primary function of Sox family genes was to regulate the maintenance of stem cells and to function in cell fate determination.
Our work has also focused on how these early branching animals could be used in the context of human disease research, an enticing proposition since non-bilaterians contain a surprisingly high number of human disease gene homologs within their genomes, despite their evolutionarily distant position with respect to humans. We used a comparative genomics approach encompassing a broad phylogenetic range of animals with sequenced genomes to determine the evolutionary patterns exhibited by human genes associated with different disease classes (Maxwell et al., under review). Our results support previous claims that most human disease genes are of ancient origin but, more importantly, we also demonstrate that several specific disease classes have a significantly large proportion of genes that emerged relatively recently within the metazoans and/or the vertebrates. An independent assessment of the synonymous to non-synonymous substitution rates of human disease genes found in mammals reveals that disease classes that arose more recently also display unexpected rates of purifying selection between their mammalian and human counterparts. Together, these results reveal the heterogeneity underlying the evolutionary origins of (and selective pressures on) different classes of human disease genes.
For example, some disease gene classes appear to be of uncommonly recent origin (specifically, vertebrate-specific genes) and, as a whole, have been evolving at a faster rate within mammals than the majority of disease classes having more ancient origins. The novel patterns that we have identified may provide new insight into cases where studies using traditional animal models were unable to produce results that translated to humans. Conversely, we note that the larger set of disease classes does have ancient origins, supporting the proposition that non-bilaterian animals have the potential to serve as viable models for studying various important classes of human diseases. Taken together, these findings emphasize why model organism selection should be done on a disease-by-disease basis, with evolutionary profiles in mind.
Finally, as an outgrowth of our studies on the homeodomain class of proteins, we continue to maintain the Homeodomain Resource, a curated collection of sequence, structure, interaction, genomic, and functional information on the homeodomain family (Moreland et al., 2009). The Resource is organized in a compact form and provides user-friendly interfaces for both querying and assembling customized datasets. The current release contains 1,623 full-length homeodomain-containing sequences from 32 distinct organisms, 107 experimentally derived three-dimensional structures, 101 homeodomain protein-protein interactions, 122 homeodomain binding sites, 53 homeodomain proteins with documented allelic variants, and 186 homeodomain proteins implicated in human genetic disorders. The Homeodomain Resource is freely available at http://research.nhgri.nih.gov/homeodomain/.