Schizophrenia is a genetically complex disease, with a heritability of 70-80% and a prevalence of 0.5-1.0%. There are robust associations for rare copy number variants and for common SNPs. The strongest common-SNP signal spans the Major Histocompatibility Complex (MHC, chromosome 6p21), which includes the human leukocyte antigen (HLA) loci. The association has grown stronger each time the international sample size has increased, with the current lowest p = 2.2 × 10^-12 (Psychiatric GWAS Consortium). SNP odds ratios are modest, and localization is difficult due to extensive linkage disequilibrium (LD). Most of the complex disease associations in the MHC map at least in part to the effects of the classical HLA class I and II loci. We hypothesize that HLA effects are etiologically important in a subset of schizophrenia cases, and are related to autoimmune processes, infection, and/or the role of MHC class I loci in neuronal plasticity. It is therefore critical to dissect the effects of HLA variants on schizophrenia risk. Direct demonstration of HLA effects would open a major new line of research into mechanisms, treatment and prevention of schizophrenia. New treatment and prevention strategies would represent a major breakthrough for this chronic and disabling disease, even if they proved to be relevant only to an HLA-linked subset of cases. HLA alleles have not been directly typed in any large schizophrenia sample. Dissection of HLA effects typically requires testing of a large discovery sample and of samples from additional ethnic populations, to detect true signals (alleles and haplotypes) against diverse patterns of LD. The highly repetitive sequences in this region make accurate sequencing and haplotyping difficult by most methods. We propose to sequence the 11 genes in the 8 most polymorphic HLA loci in large schizophrenia case-control NIMH repository-based samples of European, African-American and Chinese ancestry using a new cost-effective method.
The targeted block, comprising almost all exons and introns of 11 HLA genes at 8 loci (A, B, C, DRB1/3/4/5, DQA1, DQB1, DPA1, DPB1), will be amplified by long-range PCR, followed by very deep second-generation sequencing (Illumina HiSeq2000 platform, 100bp paired-end reads) to detect known and novel alleles. This method allows full phasing of polymorphisms within each gene, avoiding ambiguity in allele calls, and yields accurate determination of functional differences among proteins and of intronic variants that can alter gene expression. An ongoing quality-control (QC) strategy is proposed. We propose to study 21,900 individuals: 7,000 European-ancestry (EA) cases and 7,000 controls from the Molecular Genetics of Schizophrenia (MGS) and Genomic Psychiatric Cohort (GPC) samples; 2,000 African-American cases and 2,000 ancestry-matched controls from the MGS and PAARTNERS samples; and 1,300 Chinese case-parent trios from the NIMH Taiwanese sample. Association with schizophrenia risk will be analyzed first for haplotyped alleles at the individual HLA loci, and then for sequence features. These include single amino acid changes as well as more complex features (which can more strongly predict disease risk), such as sets of variants that predict the same structural or functional changes in the expressed protein. Conditional analyses will be performed to determine the primary HLA association as well as secondary effects. The most strongly predisposing and protective effects at the primary locus will be further dissected to identify the crucial sequence, structural and functional variations. Data will be shared with the NIMH repository and with the Immunogenomics Data Analysis Working Group (IDAWG, www.igdawg.org), which is developing standards to incorporate new sequence data. Thus, this will be the first disease association study to sequence and accurately haplotype the exons and introns of polymorphic HLA loci in very large samples.
The proposed analyses will allow us to dissect HLA associations with schizophrenia and will make major contributions to the understanding of HLA sequence variation.
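
As an illustration of the simplest allele-level case-control comparison mentioned above, the following minimal sketch computes an odds ratio with a Woolf (log-based) confidence interval and a 1-df Pearson chi-square test from a 2x2 table of chromosome counts. It is not the study's analysis pipeline: the function names and all counts are hypothetical, the example assumes each individual contributes two chromosomes (so 7,000 cases give 14,000 case chromosomes), and a real analysis would additionally adjust for ancestry and condition on other HLA alleles.

```python
import math


def allele_odds_ratio(case_with, case_without, ctrl_with, ctrl_without, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    Arguments are chromosome counts carrying / not carrying the allele.
    z=1.96 gives an approximate 95% interval.
    """
    cells = [case_with, case_without, ctrl_with, ctrl_without]
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def chi_square_1df(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value.

    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for one HLA allele (illustration only, not study data):
# 14,000 case chromosomes and 14,000 control chromosomes.
CASE_WITH, CASE_WITHOUT = 1500, 12500
CTRL_WITH, CTRL_WITHOUT = 1750, 12250

or_, lo, hi = allele_odds_ratio(CASE_WITH, CASE_WITHOUT, CTRL_WITH, CTRL_WITHOUT)
stat, p = chi_square_1df(CASE_WITH, CASE_WITHOUT, CTRL_WITH, CTRL_WITHOUT)
print(f"OR = {or_:.3f} (95% CI {lo:.3f}-{hi:.3f}), chi2 = {stat:.2f}, p = {p:.2e}")
```

An odds ratio below 1 in such a table would indicate an allele that is less frequent in cases than in controls, i.e. a candidate protective effect; the conditional analyses described above would then test whether the signal persists after accounting for the primary association.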