We are submitting this proposal pursuant to NOT-OD-09-058, NIH Announces the Availability of Recovery Act Funds for Competitive Revision Applications. Our group focuses on understanding how amino acid substitutions disrupt molecular function and thereby cause human disease. In our currently funded R01, we are developing methods we call in silico functional profiling. These methods learn residue-specific protein function and then estimate when that function is disrupted. The R01 funds our efforts to characterize the underlying molecular disruption that a protein mutation causes and thereby to improve the accuracy of these approaches. In this competitive revision application, we propose to extend our efforts to genetic disease mutations and polymorphisms that affect gene expression regulation or transcript splicing. Additionally, we have formed collaborations with genetic data managers and will apply all of our methods to aid their research and identify new testable hypotheses. We will do this in three supplemental aims. First, we will evaluate genomic features for predicting regulatory nucleotide substitutions and construct new methods to aid in their classification. Second, we will work collaboratively to develop machine learning methods for classifying nucleotide substitutions that disrupt transcript splicing. Finally, we will work collaboratively to annotate genetic data from inherited disease, pharmacogenetics, and somatic mutations in cancer. Together, these aims will fund two bioinformatics groups under this Recovery Act proposal and will support trainees, technical staff, and two faculty members.