Abstract: Technology and Research Development Project #3: Bioinformatics is responsible for developing tools and methods to analyze the data from the other technology projects that are part of this proposal. The project takes advantage of the data analytics and informatics strengths of UTSW and UCSC to help produce a broad-scale molecular and chemical annotation of large libraries of natural products and botanicals. Our deliverable will be a data and data analytics pipeline that will help assign cytological signatures to chemical entities, map chemical entities to the signaling pathways they interrupt, help identify the principal nodes in normal and diseased biological networks that natural products and botanicals target, and help indicate potential therapeutic and dietary applications for these compounds. This will be accomplished with a series of data analysis methods, including: 1) clustering analysis of natural product fractions and botanical activity with the affinity propagation clustering (APC) algorithm, which can be applied to data features generated by cytological profiling, unbiased metabolomics, and FUSION; 2) use of the Euclidean distance distribution to generate a similarity matrix for all of the chemical and genetic perturbagens in FUSION to obtain "guilt by association" correlations; and 3) development of tools for intuitive data visualization based on phylogenetic tree algorithms. These tools will be developed to visualize not only the data generated by TRD#1 and TRD#2, but also data from other large-scale methods that examine natural product mechanisms of action. The ultimate goal of the bioinformatics component of this grant is to drive the generation of hypotheses on the biological activity of the chemicals being evaluated and to provide this information to the scientific community.
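As an illustration of the Euclidean distance step named in the abstract, the following minimal Python sketch builds a pairwise distance matrix over perturbagen profiles and reports each perturbagen's closest neighbor as a "guilt by association" candidate. It is a sketch under stated assumptions, not the project's pipeline: the profile names, the feature values, and the functions `euclidean_distance_matrix` and `nearest_neighbors` are hypothetical, and ranking by smallest raw distance is a simplification of whatever use of the distance distribution the project adopts.

```python
import math


def euclidean_distance_matrix(profiles):
    """Return (names, matrix) of pairwise Euclidean distances.

    `profiles` maps a perturbagen name to a feature vector; all
    vectors are assumed to have the same length.
    """
    names = sorted(profiles)
    matrix = [
        [math.dist(profiles[a], profiles[b]) for b in names]
        for a in names
    ]
    return names, matrix


def nearest_neighbors(profiles, k=1):
    """Map each perturbagen to its k closest other perturbagens."""
    names, matrix = euclidean_distance_matrix(profiles)
    result = {}
    for i, name in enumerate(names):
        # Pair every other perturbagen with its distance, skipping self.
        others = [
            (matrix[i][j], names[j])
            for j in range(len(names))
            if j != i
        ]
        others.sort()
        result[name] = [other for _, other in others[:k]]
    return result


# Hypothetical toy data: two chemical fractions and two genetic
# perturbagens, each described by three made-up feature values.
profiles = {
    "fraction_A": [0.90, 0.10, 0.80],
    "fraction_B": [0.10, 0.90, 0.20],
    "siRNA_X": [0.85, 0.15, 0.75],
    "siRNA_Y": [0.15, 0.85, 0.25],
}

neighbors = nearest_neighbors(profiles)
for name, closest in neighbors.items():
    print(name, "->", closest)
```

In this toy data each chemical fraction pairs with the genetic perturbagen whose profile it most resembles, which is the kind of association that would suggest a shared target or pathway. A real analysis would likely normalize features first and use the distribution of all pairwise distances to decide which associations are close enough to report.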