During the current reporting period we have focused on 2 major projects:
1) Genetics of Systemic Juvenile Idiopathic Arthritis
As a first step in the analysis of our sJIA GWAS, we focused on the MHC, since the contribution of HLA to sJIA susceptibility has been unclear. Our study included 982 children with sJIA and 8100 healthy control subjects from 9 countries, by far the largest collection of sJIA patients assembled for genetic study to date. Using meta-analysis of directly observed and imputed SNP genotypes and imputed classic HLA types, we identified the MHC as a susceptibility locus with effects on sJIA risk that transcended geographically defined strata. The strongest sJIA-associated SNP, rs151043342 (P = 2.8 x 10^-17, OR = 2.6), is 3' of HLA-DRA and part of a cluster of 482 sJIA-associated SNPs spanning a 400-kb region that includes HLA Class II. Conditional analysis controlling for the effect of rs151043342 found that rs12722051 (the p.Tyr25Phe missense variant of HLA-DQA1) independently influenced sJIA risk (P = 1.0 x 10^-5, OR = 0.7). Meta-analysis of imputed classic HLA-type associations in 6 study populations of Western European ancestry revealed that HLA-DRB1*11 and its defining amino acid, glutamate 58, were strongly associated with sJIA (P = 2.7 x 10^-16, OR = 2.3), as was the HLA-DRB1*11-HLA-DQA1*05-HLA-DQB1*03 haplotype (P = 6.4 x 10^-17, OR = 2.3). This study solidifies the relationship between Class II HLA and sJIA, suggesting a role for adaptive immunity in its pathogenesis. These data therefore challenge the view that sJIA is an autoinflammatory disease driven largely by innate immune mechanisms and instead suggest a more complex model. During the current reporting period this work was published in PNAS.
2) Genetics of Scleroderma in the African-American Population
Scleroderma is a chronic multisystem disease that is clinically characterized by progressive fibrosis of the skin and internal organs, vasculopathy, and autoimmunity.
It causes significant morbidity and mortality, and the treatment options are not nearly as effective as those available for other autoimmune diseases. African Americans have a higher incidence and prevalence of scleroderma than white Americans. Scleroderma tends to occur at an earlier age in African Americans than in whites, and is more likely to manifest diffuse skin involvement, interstitial lung disease, and pulmonary hypertension. The sibling recurrence risk for scleroderma in African Americans is estimated at 26, while the sibling recurrence risk in whites is estimated to be 15. Several GWASs have been reported in whites, but none in the African-American population. Because scleroderma is both more common and more severe in the African-American population, this study has the potential both to identify previously unrecognized multi-ethnic scleroderma susceptibility loci and to identify population-specific genes that could have a major impact in this underserved population with more severe disease. We hypothesize that African-derived variants may explain the frequency and severity of scleroderma in African Americans, and admixture mapping is an effective tool for delineating such ancestry-related effects. Through a collaborative group called GRASP (Genome Research in African American Scleroderma Patients), we have secured the largest collection of African-American scleroderma patients ever assembled. GRASP consists of 19 centers outside of the NIH: Johns Hopkins, Georgetown, George Washington, University of Pennsylvania, University of Pittsburgh, Rutgers, New York University, Hospital for Special Surgery, Medical University of South Carolina, Emory, Miami, University of Alabama at Birmingham, Tulane, Northwestern, University of Chicago, Michigan, University of Texas at Houston, University of California San Francisco, and Stanford.
With help from the Scleroderma Research Foundation, in the first phase of the project we have collected DNA samples from 1052 African-American scleroderma patients, and we have identified 1039 antinuclear antibody (ANA)-negative African-American controls provided by Charles Rotimi. In the second phase of the project, we performed whole-exome sequencing on 400 African-American scleroderma cases and 400 controls using a Nimblegen capture kit that targets 64 Mb of coding exons and miRNA regions, plus 32 Mb of untranslated regions. We are currently analyzing rare variant data with burden tests, similar to the approach taken for Behçet's disease (BD), and we will employ the remaining 600 cases and 600 controls as a replication set. We have also identified 20,000 exonic variants prioritized for deleteriousness for inclusion as custom content on an Illumina OmniExpress Exome array. These arrays have now been fabricated. In addition, we are genotyping all 1052 cases and 1039 controls on the Illumina MEGA array, which contains 1.6 million SNPs enriched for African ancestry content. During the next year, we plan to analyze the whole-exome sequence data for rare variants and attempt to replicate promising associations in the remaining cases and controls. Genotyping on the MEGA array should be complete by the end of calendar year 2016, and genotyping on the OmniExpress Exome array should be complete by mid-2017. As with BD, we will attempt to identify both common and rare variants associated with disease, and we will additionally use admixture mapping to identify African-derived scleroderma-associated common variants. Current data are primarily related to whole-exome sequence quality control. Principal component analysis for ancestry demonstrates that cases and controls are well matched, and that their distribution is very similar to that of the African-American population in the 1000 Genomes Project.
Single-variant analysis of common autosomal variants showed HLA-DQB1 as the strongest association (P approximately 10^-6). Although this association is not genome-wide significant, it is encouraging because this locus is the strongest hit in GWASs conducted in Caucasian populations and can therefore serve as a positive control. We identified 2,325,380 filtered variants, 388,093 coding variants, 56,465 novel coding variants, and 176,922 deleterious or damaging variants.
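To illustrate the rare-variant burden testing mentioned above, the following is a minimal sketch of one common collapsing approach, not the project's actual pipeline. It assumes that each individual is scored as a carrier if they hold at least one qualifying rare allele in a gene, and that carrier counts in cases and controls are compared with a two-sided Fisher's exact test. The function names and the toy genotypes are hypothetical and chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def collapse_carriers(genotypes):
    """Return 1 for each individual with at least one rare allele in the gene, else 0."""
    return [1 if any(g > 0 for g in person) else 0 for person in genotypes]


def burden_test(case_genotypes, control_genotypes):
    """Collapse rare variants per individual, then compare carrier counts."""
    case_carriers = sum(collapse_carriers(case_genotypes))
    control_carriers = sum(collapse_carriers(control_genotypes))
    table = (
        case_carriers,
        len(case_genotypes) - case_carriers,
        control_carriers,
        len(control_genotypes) - control_carriers,
    )
    return table, fisher_exact_two_sided(*table)


# Hypothetical toy data: rows are individuals, columns are rare variants in
# one gene, and values are rare-allele counts (0, 1, or 2).
toy_cases = [[0, 1], [0, 0], [1, 0], [2, 0]]
toy_controls = [[0, 0], [0, 0], [0, 1], [0, 0]]
toy_table, toy_p = burden_test(toy_cases, toy_controls)
```

In practice, burden analyses of exome data typically also adjust for covariates such as ancestry principal components and apply a multiple-testing threshold across genes; this sketch omits both for brevity.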