The population genetic structure of group A Streptococcus pyogenes will be investigated by multilocus sequence typing (MLST) of a globally representative sample of 600 strains using 7 distinct gene loci. Data from an additional 200 strains will have been obtained with other funding before this project begins. 276 bacteria will be selected for MLST to represent the diversity of emm sequence typing (emmst), of which 146 types are currently known, and from as diverse sources and time periods as possible. Additional emmst will be performed. The data will be analyzed by cluster analysis of allelic differences, indices of association, and phylogenetic trees to determine the importance of recombination. The speA and speC alleles of the individual strains will also be tested. The population distributions of the M protein and of the housekeeping genes tested by MLST will be compared to determine how well they correlate. These analyses will test the hypothesis that the M protein has led to more discrete strains in the emm pattern A-C subpopulation than in the emm pattern D strains. A second hypothesis is that in emm pattern E strains, two sets of polymorphic antigens (emm plus sof) elicit immune responses, and that immunity has selected strains that do not overlap in these antigens (Gupta's two-epitope model). To test this hypothesis, polyclonal antisera will be generated against overlapping recombinant sof polypeptides, and the type-specific binding sites will be mapped by determining which overlapping region reacts among the different recombinant peptides. That type-specific region will be sequenced from all pattern E isolates. Two-way contingency tables will be used to determine whether the combinations of emmst and sof alleles are as predicted by the hypothesis or random. Emm-based binding sites for human IgG subclasses, IgA, plasminogen and fibrinogen will be screened by Southern hybridization, and the data will be analyzed in the same way.
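The two-way contingency analysis of emmst and sof alleles described above can be illustrated with a minimal sketch. It assumes a Pearson chi-square statistic as the measure of departure from random association; the abstract does not name a specific test, and the function name and isolate labels below are hypothetical, used only for illustration.

```python
from collections import Counter


def chi_square_independence(pairs):
    """Return (chi-square statistic, degrees of freedom) for a two-way
    table built from (emm_type, sof_allele) pairs, one pair per isolate.

    Assumption: Pearson chi-square is used here for illustration; the
    source text does not specify which test will be applied.
    """
    pairs = list(pairs)
    n = len(pairs)
    observed = Counter(pairs)                    # cell counts
    row_totals = Counter(e for e, _ in pairs)    # isolates per emm type
    col_totals = Counter(s for _, s in pairs)    # isolates per sof allele

    chi2 = 0.0
    for emm, row_total in row_totals.items():
        for sof, col_total in col_totals.items():
            # Expected count if emm type and sof allele combine at random
            expected = row_total * col_total / n
            chi2 += (observed.get((emm, sof), 0) - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical isolates: fully non-overlapping combinations,
# the pattern the two-epitope model would predict.
non_overlapping = [("emm_a", "sof_1")] * 10 + [("emm_b", "sof_2")] * 10

# Hypothetical isolates: every combination equally common (random association).
random_mix = [(e, s) for e in ("emm_a", "emm_b") for s in ("sof_1", "sof_2")] * 5

print(chi_square_independence(non_overlapping))
print(chi_square_independence(random_mix))
```

A large statistic relative to its degrees of freedom indicates non-random combinations. With many emm types and sof alleles the table will be sparse, so an exact or permutation-based test may be more appropriate than the chi-square approximation.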
An additional 325 strains will then be chosen, based on the results of the above analyses, to analyze short-term epidemiology in defined situations. The same analyses will be performed as with the first set of strains. Finally, the antisera to emm and sof products will be analyzed by competitive inhibition, by opsonization, and for inhibition of ligand binding to the bacteria. Cross-reactivity between different M proteins will be correlated with the phylogenetic analyses and with models for selection by immunity. Additional antisera will be obtained and tested after immunization with type-specific epitopes of emmst and sof gene products. These sera will be tested to determine whether strain structure is determined by the immunogenicity of the emm and sof gene products.