Influenza and influenza-related complications lead to more than 200,000 hospitalizations and approximately 36,000 deaths in the United States each year, and vaccination is the primary option for reducing the impact of influenza. A substantial global effort is required each year to identify antigenic variants and decide whether new vaccine strains are needed. Current laboratory-based antigenic characterization processes are labor intensive and time consuming, and they have been a bottleneck in generating an effective influenza vaccination program. A robust method that does not require such laboratory characterization is needed for rapid identification of influenza antigenic variants. This project proposes to develop a novel sparse multitask learning method for predicting influenza antigenic variants based solely on protein sequences, and further to apply this method to mapping the antigenic drift pathway of A/H3N2 influenza viruses and studying the antigenic drift patterns that lead to influenza outbreaks. The method is based on the assumption that influenza antigenicity is determined by certain features of the hemagglutinin (HA) protein sequence and tertiary structure. This assumption is well supported by evidence that viruses with conserved HAs cross-react in serological assays and provide cross-protection in both laboratory experiments and field practice. The proposed method is novel because it combines multitask learning and sparse learning. Therefore, this project will develop not only significant technology for antigenic variant screening but also new machine learning methods. This project will facilitate vaccine strain selection because the proposed method can potentially reduce, and even eliminate, serological assays, one of the most labor-intensive procedures in influenza surveillance.
In addition, the antigenicity-specific features and the drift patterns causing influenza outbreaks that are identified in this study will improve our understanding of antigen-antibody interaction and thus our knowledge of influenza immunology and serology. Furthermore, the proposed method is potentially applicable to characterizing the antigenic properties of other pathogens with significant antigenic variation, for example, rotavirus. The specific aims are the following: (1) development of a novel sparse multitask learning method for generating an antigenic distance matrix from hemagglutination inhibition (HI) data; (2) development of a quantitative method for predicting antigenic variants in silico; (3) application of this method to studying the seasonal influenza antigenic drift pathway and the antigenic drift patterns leading to influenza outbreaks. The purpose of this study is to develop a novel predictive method for measuring antigenic divergence between influenza viruses, which is critical in influenza vaccine strain selection. Thus, we are submitting this project to the broad challenge area (06) Enabling Technologies, where it fits Specific Challenge 06-GM-103: development of predictive methods for molecular structure, recognition, and ligand interaction. PUBLIC HEALTH RELEVANCE: This study will develop a novel computational method for influenza antigenic variant prediction, which will support influenza vaccine strain selection. The method will also be applied to studying the antigenic drift patterns that lead to influenza outbreaks and epidemics.