DESCRIPTION: High-throughput analysis has become an essential tool in genomic research. One area of this research that has not been heavily examined is how errors introduced in the early steps of these analyses propagate through downstream analyses. For many complex, high-throughput genomic analyses, sequence alignment is among the very first steps. This proposal is for a pilot study to assess the feasibility of using computational sequence simulation to examine the effects and propagation of alignment error in high-throughput comparative and functional genomic sequence analysis. The study has three objectives: (1) to profile the accuracy of DNA sequence alignment, including both pairwise and multiple alignments, in order to establish the breadth of simulation and study necessary for answering questions about downstream genomic sequence analysis; (2) to define the factors that must be included in a simulation of sequences to capture a realistic level of biological complexity without overparameterizing or adding unnecessary complications, and to create a computer program that can perform these simulations; and (3) to conduct a case study of the effects of alignment error on the estimation of evolutionary distances among sequences. The results of this project will be used to define and plan a large-scale study of the downstream effects of alignment fidelity in high-throughput sequence analysis, including approaches to downstream analysis that take into account the presumed error of the hypothesized alignment.