I am a postdoctoral scholar associated with Steven Brenner's lab at Berkeley, working on structural biology and computational genomics. My long-term vision is to develop new algorithms for inferring protein evolution and function from sequence and structure. Currently I am working on algorithms that can automatically classify a protein into its proper superfamily. The long-term goal of this project is to improve the accuracy of protein structure classification and function prediction.

The superfamily defines ancient protein homology. Protein superfamily classification remains a challenging task, even when the 3D structure is available, and it still requires manual work by experts. We believe that the classification of protein superfamilies relies on the integration of sequence information and structure information. We will employ recent breakthroughs in kernel-based machine learning approaches for combining different sources of information. We will also develop structure-based discriminative profile models for protein superfamilies. We expect these algorithmic developments not only to yield a practical tool for superfamily classification, but also to improve our understanding of the interplay of sequence and structure in defining very remote homology.

We will extend our structure-based discriminative profile models for protein classification to function prediction. We will develop new methods for the identification of structure-sequence signatures of protein function. In addition, we will extend the graph-theoretical models for multiple sequence alignment that I developed during my Ph.D. studies to meet the challenge of domain annotation for large new sequence sets.

The advancement of medical research is partly based on our detailed understanding of the functions of genes and proteins.
My research will improve our understanding of protein evolution and function at the molecular level. Our computational approach will speed up the discovery of biological knowledge from large data sets generated by high-throughput methods.