The discovery of novel T cell epitopes will greatly facilitate the design and development of improved vaccines by providing critical information for selecting complexes between major histocompatibility complex (MHC) molecules and antigen peptides that can induce T cell activation. A key step in epitope identification is the prediction of MHC-peptide binding. Two major classes of MHC molecules are involved in generating two types of T cell epitopes. Methods for predicting MHC class I epitopes have achieved relatively high accuracy because the binding motifs of these epitopes are relatively conserved. However, the performance of prediction methods for MHC class II epitopes is hindered by the variable lengths of the epitopes, the undetermined core region of each individual epitope, and the unknown amino acids that serve as primary anchors. Most existing methods attempt to identify the binding cores for a set of epitopes through various alignment techniques. Binding motifs or position-specific scoring matrices for prediction can then be assembled from the resulting alignment. Motivated by a text mining technique, we have developed a prototype of a supervised learning model for MHC class II epitope prediction. The idea is to discriminate, through an iterative process, the core binding nonamers from the non-core nonamers derived from a training set consisting of epitopes and non-epitopes. The model is characterized by its simplicity and its capacity to use information from both epitopes and non-epitopes. A preliminary study demonstrated promising performance of this model for HLA-DR4 (B1*0401) epitopes. In this study, we plan to conduct a thorough evaluation and optimization of this model. In Aim 1, we will develop the principle for optimizing the model and select the best variant of the method.
In Aim 2, we will conduct a thorough evaluation against major existing predictors on various allele-specific data. Finally, in Aim 3, we will establish a web server for the prediction of epitopes specific to various MHC class II alleles. The system will be freely available to the research community. Our long-term goal is the development of computational methods for the prediction of T cell epitopes. Computational prediction can provide a rapid method for the identification of pathogen molecules containing immunostimulatory sequences that can serve as targets for immune intervention or diagnostics.
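
As a rough illustration of the nonamer derivation mentioned above, the sketch below enumerates every contiguous nine-residue window of a variable-length peptide. These windows are the candidates from which a core binding nonamer would be distinguished from non-core nonamers. The function name `candidate_nonamers`, the constant `CORE_LENGTH`, and the example peptide are assumptions made for illustration only. The sliding-window enumeration is a common convention and is not confirmed here as the exact procedure of the proposed model, and the iterative discrimination step itself is not shown.

```python
from typing import List

# Assumption: binding cores are nonamers (9 residues), as described in the text.
CORE_LENGTH = 9


def candidate_nonamers(peptide: str, core_length: int = CORE_LENGTH) -> List[str]:
    """Return every contiguous window of `core_length` residues in `peptide`.

    Peptides shorter than `core_length` yield an empty list.
    """
    n_windows = len(peptide) - core_length + 1
    if n_windows <= 0:
        return []
    return [peptide[i:i + core_length] for i in range(n_windows)]


if __name__ == "__main__":
    # Illustrative 13-residue peptide; any amino acid string works.
    for window in candidate_nonamers("PKYVKQNTLKLAT"):
        print(window)
```

A peptide of length n yields n - 8 candidate nonamers, so the 13-residue example produces five windows. In a training set of epitopes and non-epitopes, each epitope is assumed to contain one core among its candidates, while every window from a non-epitope can be treated as a non-core example.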