Influenza A virus causes both pandemic and seasonal outbreaks, leading to loss of from thousands to millions of human lives within a short time period. Vaccination is the best option to prevent and minimize the effects of influenza outbreaks. Rapid selection of a well-matched influenza vaccine strain is the key to developing an effective vaccination program. However, this is a non-trivial task due to three major challenges in influenza vaccine strain selection: labor an time intensive virus isolation and serology-based antigenic characterization, poor growth of selected strains in chicken embryonic eggs during production, and biased sampling in influenza surveillance. Each year, many scientists worldwide, including thousands from the United States, are working altogether to select an optimal vaccine strain. However, incorrect vaccine strains have still been frequently chosen in the past decades. Recent advances in genomic sequencing allow us to rapidly and economically sequence influenza genomes from the isolates and from the clinical samples. Sequencing influenza genomes has become a routine and important component in influenza surveillance. The objectives of this project are to develop a sequence-based strategy for influenza antigenic variant identification and to optimize vaccine strain selection using genomic data. To achieve these aims, we will develop machine learning based computational methods to estimate antigenic distances among influenza viruses by directly using their genome sequences. We will then identify the key residues and mutations in influenza genomes affecting influenza antigenic drift events. Such information will allow us to select most promising virus strains as candidates for vaccine production. Since economical virus production requires the selected virus strains to grow easily in chicken embryonic eggs, we also propose the development of a machine learning based method that can predict the growth ability of a virus strain based on its sequence information. This integrated genome based influenza vaccine strain selection system will be developed for detecting antigenic variants for influenza A viruses. This project will help us provide fundamental technology that employs genomic signatures determining influenza antigenicity and growth ability in chicken embryonic eggs, which are the two key issues for efficient and effective influenza vaccine strain development. The resulting genome based vaccine strain selection strategy will significantly reduce the human labor needed for serological characterization, decrease the time required to select an effective strain that will grow well in eggs, and increase the likelihood of correct influenza vaccine candidate selection. Thus, this project will lead to significant technological advances in influenza prevention and control.