Recent advances in DNA sequencing techniques have led to the determination of many complete genome sequences. New insights into the biological functions and evolution of the sequenced organisms have been gained from this information. Complete genome sequence data also make possible a new, qualitatively different kind of analysis: the evaluation of apparently missing genes and the potential consequences of their loss for the biology of the organism. To systematically identify potentially missing genes, one must first classify genes from a number of organisms into groups of orthologs. Orthologs are genes from different organisms derived from the same gene in the closest common ancestor of these organisms. They are thus the genes most likely to perform biologically similar functions, and they often share the greatest sequence similarity. Once these classifications are made, one examines the phylogenetic pattern of each ortholog group, that is, which organisms are represented in the group and which are not, to identify genes potentially lost in the studied organism relative to the reference organisms. In addition, global properties of proteins may be studied from a genomic perspective, for example, the relationship between sequence length and conservation.
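
The pattern-examination step can be illustrated with a minimal sketch. It assumes that ortholog groups have already been built and that each group is stored as the set of organisms represented in it. The function name, the `min_refs` threshold, and the toy data are hypothetical and serve only as illustration; they are not the procedure used in this work.

```python
from typing import Dict, List, Optional, Set


def potentially_lost_genes(
    groups: Dict[str, Set[str]],
    target: str,
    references: List[str],
    min_refs: Optional[int] = None,
) -> List[str]:
    """Return IDs of ortholog groups absent from `target` but present in
    at least `min_refs` of the reference organisms.

    `groups` maps a group ID to the set of organisms that have a member
    in that group (its phylogenetic pattern). By default a group must be
    present in every reference organism (an illustrative assumption).
    """
    if min_refs is None:
        min_refs = len(references)
    lost = []
    for group_id, organisms in sorted(groups.items()):
        n_refs = sum(1 for ref in references if ref in organisms)
        if target not in organisms and n_refs >= min_refs:
            lost.append(group_id)
    return lost


# Hypothetical toy data: four ortholog groups, one target, two references.
groups = {
    "OG1": {"target", "refA", "refB"},  # present everywhere
    "OG2": {"refA", "refB"},            # missing only from the target
    "OG3": {"refA"},                    # present in a single reference
    "OG4": {"target"},                  # specific to the target
}

candidates = potentially_lost_genes(groups, "target", ["refA", "refB"])
print(candidates)
```

Requiring presence in every reference organism is the strictest choice. Lowering `min_refs` admits groups found in only some references, at the cost of including genes that may have been gained in one lineage rather than lost in the target.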