Recent advances in DNA sequencing techniques have yielded the entire genome sequence for ~15 bacteria and for S. cerevisiae, and have brought the C. elegans genome near completion. New insights into the biological functions and evolution of these organisms have been gained from this information. A new, qualitatively different kind of analysis is possible with complete genome sequence data: the evaluation of apparently missing genes and the potential consequences of their loss for the biology of the organism. To systematically identify potentially missing genes, one must first classify genes from a number of organisms into groups of orthologs. Orthologs are genes from different organisms derived from the same gene in the closest common ancestor of these organisms. They are thus the genes most likely to perform biologically similar functions, and they often share the greatest sequence similarity. Once these classifications are made, one examines the phylogenetic pattern of each ortholog group, that is, which organisms contribute a member to it, to identify genes potentially lost in the studied organism as compared to the reference organisms.
Keywords: molecular evolution, protein families
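The pattern-examination step described above can be illustrated with a minimal sketch. It assumes the ortholog groups have already been built and are available as a mapping from a group identifier to the set of organisms that contribute a member. The function name `find_missing_orthologs`, the `min_refs` threshold, and all organism and group labels are hypothetical and chosen for illustration only; they are not taken from the original work.

```python
from typing import Dict, List, Optional, Set


def find_missing_orthologs(
    groups: Dict[str, Set[str]],
    target: str,
    references: List[str],
    min_refs: Optional[int] = None,
) -> List[str]:
    """Return IDs of ortholog groups that lack the target organism
    but are represented in at least `min_refs` reference organisms.

    `min_refs` is an assumed tuning knob; by default every reference
    organism must be present in the group.
    """
    if min_refs is None:
        min_refs = len(references)

    missing = []
    for group_id, organisms in sorted(groups.items()):
        if target in organisms:
            continue  # the target has an ortholog here, so nothing is missing
        present = sum(1 for ref in references if ref in organisms)
        if present >= min_refs:
            missing.append(group_id)
    return missing


# Hypothetical toy data: group ID -> organisms with a member in that group.
example_groups = {
    "OG1": {"ref_1", "ref_2", "ref_3"},
    "OG2": {"ref_1", "ref_3"},
    "OG3": {"ref_2", "ref_3", "target_org"},
    "OG4": {"ref_1", "ref_2", "ref_3", "target_org"},
}
refs = ["ref_1", "ref_2", "ref_3"]

strict = find_missing_orthologs(example_groups, "target_org", refs)
relaxed = find_missing_orthologs(example_groups, "target_org", refs, min_refs=2)
print(strict)   # ['OG1']
print(relaxed)  # ['OG1', 'OG2']
```

Groups flagged this way are only candidates for gene loss: absence from a group may also reflect a highly diverged sequence or a functionally equivalent but unrelated gene, so each hit would still need to be examined individually.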