The genomes of most eukaryotes include vast numbers of interspersed repeats (IRs), which are the remnants of mostly selfishly amplified transposable elements. Transposable elements have an exceptionally wide-ranging mutagenic effect on genomes, while recognition of IRs provides unparalleled information on genome evolution and is crucial in many aspects of bioinformatics. This grant would support maintenance and further development of RepeatMasker, a computational tool that has become the de facto standard for identification and characterization of IRs, and the GESTALT Workbench, a graphical user interface for detailed visualization of RepeatMasker results in their genomic context. The source code of these tools is freely available to the academic community. Development will emphasize the following: a) RepeatMasker will be rewritten so that it can keep pace with the increasing amount and variety of genomic sequence data. b) "Phylogenetic interpolation" will be used to construct repeat libraries for species for which only limited sequence is available (e.g., macaque, dog). This will be done by selecting the relevant subset of IR families from related species and applying appropriate, lineage-specific alignment parameters. c) The RepeatMasker functions for contamination detection and for masking only lineage-specific IRs will be substantially expanded to facilitate the generation of interspecies genomic alignments. d) A public web server for RepeatMasker and GESTALT will be built at the Institute for Systems Biology. This server will enable real-time analysis of private sequences and offer pre-computed RepeatMasker results for all publicly available genomic sequences. It will also include novel repeat-based analysis services, such as genome sequence comparison, contamination detection, and transcript prediction.