Most symmetric proteins consist of a relatively small core unit that is repeated, which makes them structurally simple compared to non-symmetric proteins. Yet they appear to be capable of carrying out all types of functions: some are enzymes, others are carriers of proteins, and still others are receptors. Therefore, if one is interested in designing proteins de novo to perform a specific function, symmetric proteins are probably a good starting point. Because of their relative simplicity, they should also be good molecules with which to study sequence-structure-function relations. The evolutionary history of these proteins is also interesting. They probably arose by gene duplication and fusion. Although mutation rates will vary depending on how strongly function requires symmetry, proteins whose repeats are highly similar in sequence presumably arose later than those in which the similarity is beginning to disappear. After sufficient time, the sequence similarity will disappear and the structural symmetry will also be degraded. Thus, symmetry should generally give an additional handle for following the evolution of these proteins. Interest in symmetric structures seems to be rising; there were only a few reports on symmetry detection prior to 2008, but at least four different groups, including ours, have reported separate symmetry detection methods in the past three years. Our symmetry detection program, SymD (Kim et al., BMC Bioinformatics 11:303, 2010), is based on two algorithms that we developed earlier, SE (Seed Extension; Tai et al., BMC Bioinformatics 10 Suppl 1:S4, 2009) and RSE (Refinement with SE; Kim et al., BMC Bioinformatics 10:210, 2009). Given a structure superposition, SE finds the optimal structure-based sequence alignment without using dynamic programming or a gap penalty. 
RSE uses SE and the Kabsch algorithm to find the optimal structure superposition and structure-based sequence alignment given an initial structure superposition or sequence alignment. SymD itself uses RSE to optimally align a protein structure to itself after circularly permuting the second copy by k residues, for all values of k from 1 to N-3, where N is the total number of residues in the protein. The SymD procedure is superior to other symmetric protein detection methods in several respects: (1) The procedure allows detection of symmetry even when the structure contains symmetry-breaking insertions or deletions, either within or between the repeating units. (2) The procedure depends on and uses the symmetry of the molecule; it is a symmetry detector, not just a repeat detector. (3) The procedure is sensitive because it amplifies the symmetry signal. (4) The procedure yields the sequence alignment between repeating units and the position and orientation of the symmetry axis. (5) The procedure is capable of detecting more than one symmetry for a molecule. Using this program, we determined that approximately 20% of all distinct protein domains (SCOP 1.75 ASTRAL 40% domain dataset) may be considered globally symmetric. These include most of the well-known symmetric folds, including TIM barrels, alpha-alpha superhelices and toroids, beta-trefoils, beta-propellers, leucine-rich repeats, and ferredoxins. The symmetries observed are broadly of three types: slip, closed, and open. Slip symmetric proteins look invariant after a translation by a few residues in one direction. As far as we know, we are the first to recognize this invariance and to consider it a type of symmetry (manuscript in preparation). These are mostly helix bundles. In symmetric closed structures, the N- and C-termini of the molecule come close together and the two ends are stitched together, often by a set of hydrogen bonds (the Velcro joining). 
Most of these have 2- to 8-fold rotational symmetries, but the transmembrane beta-barrels can have higher symmetries as well as screw symmetries. In symmetric open structures, the N- and C-termini are at opposite ends of the molecule. All have a helical or a pure 2-fold rotational symmetry. A protein with a pure 2-fold rotational symmetry can have either a closed (intertwined) or an open structure. Current research effort is directed toward (1) characterizing the small number of protein domains that have two or more symmetry elements, (2) perfecting the algorithm for automatic classification of observed symmetries, and (3) developing an algorithm for detecting locally symmetric substructures that are embedded in larger, globally non-symmetric structures. Future efforts will be directed toward collecting repeating units and studying their structures and interactions.
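
The scan over circular permutations that underlies SymD (comparing a structure to itself after shifting the second copy by k residues, for k from 1 to N-3) can be illustrated with a small sketch. This is not the SymD implementation: in place of RSE, which refines a superposition and an alignment and so tolerates insertions and deletions, it uses a much cruder, superposition-free score, namely the root-mean-square difference between the inter-residue distances d(i, j) and d(i+k, j+k). The toy structure, the function names (toy_ring, shift_mismatch, scan_shifts), the offsets and the 1e-6 threshold are all hypothetical and chosen only for illustration.

```python
import math


def toy_ring(n_units=4,
             radius_offsets=(0.0, 1.5, 3.0, 0.5, 2.0, 4.0),
             z_offsets=(0.0, 1.0, -1.0, 2.0, 0.5, -2.0),
             base_radius=10.0):
    """Hypothetical closed structure: n_units identical units placed
    around the z axis, one point per residue (not real protein data)."""
    unit_len = len(radius_offsets)
    n = n_units * unit_len
    coords = []
    for i in range(n):
        j = i % unit_len                  # position inside the repeating unit
        theta = 2.0 * math.pi * i / n     # residues are evenly spaced in angle
        r = base_radius + radius_offsets[j]
        coords.append((r * math.cos(theta), r * math.sin(theta), z_offsets[j]))
    return coords


def shift_mismatch(coords, k):
    """RMS difference between d(i, j) and d(i+k, j+k), indices modulo N.

    Assumption: this distance-based score stands in for the RSE
    self-alignment; it needs no superposition but cannot handle
    insertions or deletions.
    """
    n = len(coords)
    total = 0.0
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            d_orig = math.dist(coords[i], coords[j])
            d_perm = math.dist(coords[(i + k) % n], coords[(j + k) % n])
            total += (d_orig - d_perm) ** 2
            count += 1
    return math.sqrt(total / count)


def scan_shifts(coords):
    """Score every circular shift k = 1 .. N-3, as in the scan described above."""
    n = len(coords)
    return {k: shift_mismatch(coords, k) for k in range(1, n - 2)}


coords = toy_ring()
scores = scan_shifts(coords)
# Shifts whose mismatch is numerically zero map the structure onto itself.
symmetric_shifts = [k for k, s in sorted(scores.items()) if s < 1e-6]
print(symmetric_shifts)
```

For this 24-residue toy ring built from four six-residue units, the shifts with vanishing mismatch are the multiples of the unit length, which corresponds to a 4-fold rotational symmetry. A real structure would need a tolerant score and an alignment step, which is what RSE provides in SymD.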