This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

Repeat elements are prolific and are estimated to account for 42% of the human genome. Specifically, LINE-1 repeat elements (L1s) make up approximately 20% of the DNA of some mammals and at least 30% of human DNA, and they have been implicated in disease onset. Research has shown that the distribution of L1s on the human X chromosome differs significantly from their distribution on the autosomes. Their evolutionary impact remains unclear, however, because no comparative survey of their distribution across several genomes has been done. Several recent studies have pointed to the value of comparing the human genome with the genomes of non-mammalian vertebrates (e.g., fish) for detecting regulatory elements. It is possible that repeat elements, often referred to as junk DNA, play an important role in genetic change. Because these regions of duplicated material may hold answers about complex disease traits, their locations, content, and distribution must be studied.

The research question is to explain the differences in repeat element distribution between mammalian and non-mammalian orders. This has health implications because transposition in humans has caused disease; if non-mammals have a transposition control mechanism, it could indicate a way to prevent this kind of disease.

Students working in the applicant's lab will primarily be computer science majors who are interested in learning bioinformatics tools and techniques for conducting research. Thus, they will have an interest not only in research, but also in developing analysis and automation tools for research.
The list below includes potential research projects, along with other projects that further the work of the lab and make resources available to future students.

1. developing scripts to automate the access and maintenance of genome sequence databases locally
2. acquiring, installing, and learning to use software applications required to complete the analyses (e.g., RepeatMasker, Censor, R, Perl, gridMathematica, etc.)
3. comparing the outputs of the different repeat detection programs
4. comparing the distribution of repeat elements (such as L1s, ALUs, SINEs, etc.) on sex chromosomes versus autosomes across species
5. comparing the distribution of repeat elements (such as L1s, ALUs, SINEs, etc.) on sex chromosomes versus autosomes between classes
6. comparing the distribution of repeat elements (such as L1s, ALUs, SINEs, etc.) of a single chromosome across species
7. comparing the distribution of repeat elements (such as L1s, ALUs, SINEs, etc.) across chromosomes of varying arm length
8. exploring, understanding, and attempting to improve underlying analysis algorithms
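
As a rough illustration of the sex chromosome versus autosome comparisons listed above, the following Python sketch computes the fraction of each chromosome annotated as a given repeat family. It is not the lab's actual tooling. It assumes input in the whitespace-delimited layout of a RepeatMasker `.out` file (query sequence in column 5, begin and end in columns 6 and 7, repeat class/family in column 11). The function name `repeat_coverage`, the chromosome lengths, and the annotation rows are hypothetical and made up for the example. Overlapping annotations are not merged, so the result is only an approximation.

```python
from collections import defaultdict


def repeat_coverage(out_lines, chrom_lengths, family="LINE/L1"):
    """Return {chromosome: fraction of its length annotated as `family`}.

    Assumes RepeatMasker-style rows; `chrom_lengths` maps names to lengths in bp.
    """
    covered = defaultdict(int)
    for line in out_lines:
        fields = line.split()
        # Skip blank lines and header lines (their first field is not a score).
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        chrom, begin, end = fields[4], int(fields[5]), int(fields[6])
        if fields[10] == family:
            # Coordinates are treated as 1-based and inclusive.
            covered[chrom] += end - begin + 1
    return {c: covered[c] / n for c, n in chrom_lengths.items()}


# Made-up rows for illustration only; real input would come from a .out file.
toy_rows = [
    "1000 10.0 0.0 0.0 chrX 1 100 (900) + L1PA2 LINE/L1 1 100 (0) 1",
    "500 5.0 0.0 0.0 chrX 201 300 (700) C AluY SINE/Alu (0) 100 1 2",
    "800 8.0 0.0 0.0 chr1 11 60 (1940) + L1M1 LINE/L1 1 50 (0) 3",
]
toy_lengths = {"chrX": 1000, "chr1": 2000}  # hypothetical lengths in bp

l1_fraction = repeat_coverage(toy_rows, toy_lengths)
for chrom, fraction in sorted(l1_fraction.items()):
    print(f"{chrom}\tL1 fraction = {fraction:.3f}")
```

Running the same function over the output for several species, or with `family="SINE/Alu"`, would give per-chromosome fractions that can be compared across sex chromosomes and autosomes, for example with a statistical test in R.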