Sequence alignments of homologous proteins from evolutionarily distant organisms are used to pinpoint regions of structural and functional importance. Over long periods, only the most constrained segments retain detectable similarity to each other. This concept was extended to the whole database by cross-comparing comprehensive sets of sequences from various kingdoms and phyla, with evolutionary distances ranging from 2 billion years for the eukaryote/eubacteria divergence to 550 million years for the coelomate radiation. Significant similarities between these sets thus correspond to strongly conserved ancestral features. Using a series of matching/orthogonalization procedures, 500 independent ancestral types were detected within contemporary sequences. This fossil set represents only 4% of the original database but significantly matches 40% of the whole. Thus, it achieves a 10-fold enrichment in sequences of the greatest structural/functional significance and is an optimal source for the definition of motifs. Approximately 200 of these highly conserved sequences correspond to proteins whose role is not obviously central, and they warrant further analysis. Theoretical computations suggest that the 500 ancestral types defined so far constitute most of the fossil sequences detectable in modern sequences. This is consistent with an independent study comparing three large new datasets: partial cDNAs from human and nematode, and ORFs from chromosome III of yeast. Thus, known proteins might already include representatives of most ancestral features antedating the coelomate radiation. A database of 550 ancient sequence prototypes has been constituted and made available to the community by computer network.
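
As an illustration only, the 10-fold enrichment figure and the idea of reducing a set of matches to mutually independent types can be sketched in a few lines of Python. This is not the procedure used in the study: the function names, the greedy selection strategy, and the toy similarity test (a shared three-residue prefix) are all assumptions made for the example, and a real analysis would rely on statistically scored alignments.

```python
from typing import Callable, List, Sequence


def fold_enrichment(matched_percent: float, subset_percent: float) -> float:
    """Ratio between the share of the database matched by a subset
    and the share of the database that the subset itself occupies."""
    if subset_percent <= 0:
        raise ValueError("subset_percent must be positive")
    return matched_percent / subset_percent


def greedy_independent_set(
    sequences: Sequence[str],
    is_similar: Callable[[str, str], bool],
) -> List[str]:
    """Keep a sequence only if it is not similar to any representative
    already kept (a simple greedy stand-in for orthogonalization)."""
    representatives: List[str] = []
    for seq in sequences:
        if not any(is_similar(seq, rep) for rep in representatives):
            representatives.append(seq)
    return representatives


def shares_prefix(a: str, b: str) -> bool:
    """Hypothetical similarity test: identical first three residues."""
    return a[:3] == b[:3]


# Figures quoted in the abstract: 4% of the database matches 40% of it.
enrichment = fold_enrichment(matched_percent=40, subset_percent=4)

# Made-up toy sequences, not real protein data.
toy = ["MKTAYI", "MKTLLV", "GSHMAA", "GSHQQQ", "PPLVEE"]
independent = greedy_independent_set(toy, shares_prefix)
```

With the percentages from the abstract, `enrichment` evaluates to 10.0, and the toy input reduces to three mutually dissimilar representatives. The greedy pass depends on input order, which is one reason a production pipeline would likely use a more careful clustering step.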