This report summarizes the activities of the Complex Disease Genetics Unit (CDGU) of the Genetics and Genomics Branch, which was established to identify genes conferring susceptibility to genetically complex rheumatic and inflammatory diseases. Such disorders are thought to be caused by the interaction of multiple genetic loci with environmental factors and, as such, are usually not amenable to study by the conventional methods of Mendelian genetics or positional cloning. Family studies of these disorders entail the nonparametric analysis of large numbers of sibling pairs concordant for the disease in question, rather than model-dependent analyses of smaller numbers of extended families. When chromosomal regions potentially harboring susceptibility genes are identified through sib-pairs, association studies on large collections of independent patients and controls may be necessary to narrow the search to intervals tractable for mutational screening. This linkage-disequilibrium (LD)-based approach can also be used to screen candidate genes and, with improving technology, may be applied across the genome. The common requirement in all cases is the capacity to genotype large numbers of samples at many genetic loci in a cost-effective and efficient manner.
Since its inception two years ago, the CDGU has focused much of its energy on the study of rheumatoid arthritis (RA), a disorder that affects as much as 1% of the population worldwide and is associated with considerable morbidity and even mortality. This work builds on our long-standing participation in the North American Rheumatoid Arthritis Consortium (NARAC), a large collaborative group whose initial goal was to collect 1000 sibling pairs concordant for RA and then to perform genome-wide linkage studies. More recently, NARAC has also begun to accumulate large collections of singleton cases of RA and controls matched for age, gender, and ethnicity.
Genome-wide linkage data on 667 families analyzed by NARAC in the previous reporting period confirmed a major genetic effect in the HLA region, and also showed evidence of linkage (p < 0.005) for chromosomes 1p13, 1q43, 6q21, 10q21, 12q12, 17p13, and 18q21. Because linkage to the 18q21 region has been replicated in an independent French RA cohort, our initial emphasis has been on this segment of the genome. During the previous reporting period the CDGU established the Sequenom MassARRAY platform for high-throughput genotyping of single-nucleotide polymorphisms (SNPs), with a capacity of 1 million genotypes per year at a cost of $0.20 per genotype. This technology relies upon multiple-base extension reactions that are assayed by mass spectrometry. During the present reporting period, the CDGU has employed new reaction chemistries to double its rate of multiplexing (now up to 12 reactions at a time) while reducing the cost of reagents, which has increased capacity to 2 million genotypes per year at a cost of less than $0.10 per genotype.
Results of the Last Year
Analysis of candidate genes in the 18q21 region by dense SNP analysis and LD-based haplotype analysis: In studies begun in the previous reporting period, we identified several genes that appeared to be functional candidates for RA susceptibility. Dense SNP maps were constructed around these genes, and genotypes from 384 probands and 384 matched controls were used to determine the LD structure of each region and to identify common haplotypes in each LD block. SNP genotype and haplotype frequencies in probands were then compared with those in controls. Three candidate genes showed no significant association with RA: TCF4 (transcription factor 4, also known as immunoglobulin transcription factor 2, or ITF-2), PMAIP1 (PMA-induced protein 1), and MALT1 (mucosa-associated lymphoid tissue lymphoma translocation protein 1).
For MALT1, one SNP showed a p value of 0.03, but there was no common haplotype association. Initial results from a fourth gene, TNFRSF11A (also known as receptor activator of NF-kappa B, or RANK), were promising. This gene covers 60 kb of genomic DNA, from which 15 SNPs defining 4 haplotype blocks were assayed; one haplotype from block 2 was more frequent in cases (42%) than in controls (36%), with a p value of 0.02. When an additional 380 RA cases and 370 controls were studied, the difference diminished and was no longer statistically significant.
Positional analysis of the 18q region: Early in the present reporting period, a new genotyping technology combining ultra-high levels of multiplexing with bead-based optical detection (Illumina) became available to NARAC on a contract basis. Using this approach, 460 cases and 460 controls were assayed for 3072 SNPs spanning a 10 Mb region centered on the peak marker in the 18q region. Within two months, a total of 2719 SNPs were successfully typed (2.5 million genotypes), and, of these, 2297 had a minor allele frequency greater than 5% (2.1 million genotypes), for an average density of one marker per 4.3 kb. Four clusters of LD were identified. The CDGU has now focused on two of these clusters, using an additional 613 cases and 518 controls and signature SNPs for the various haplotype blocks deduced from the initial dataset. Combined data from the first region identified two signature SNPs with p values of 0.0085 and 0.0076 and odds ratios of 1.19 and 1.20, respectively. In the second region, one signature SNP gave a p value of 3.61 × 10^-6, with an odds ratio of 1.46. Genotype-based association results were even more significant for this marker, with a p value of 1.7 × 10^-6 and an odds ratio of 1.57. The haplotype block containing this SNP covers approximately 60 kb of genomic sequence and contains one novel gene of unknown function.
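The case-control haplotype comparison described above for TNFRSF11A can be illustrated with a simple allelic test. The following is a minimal sketch, not the analysis actually performed: it assumes a Pearson chi-square test (1 degree of freedom, no continuity correction) on chromosome counts, assumes two chromosomes per subject in the initial 384 probands and 384 controls, and reconstructs the carrier counts from the rounded percentages (42% and 36%). The function name and variables are hypothetical. Because the inputs are rounded and the reported haplotype analysis may have used a different method, the result is not expected to match the reported p value of 0.02 exactly.

```python
import math

def allelic_chi_square(case_hap, case_total, control_hap, control_total):
    """Pearson chi-square (1 df, no continuity correction) on a 2x2 table
    of chromosome counts: haplotype present/absent by case/control."""
    a, b = case_hap, case_total - case_hap            # cases: with / without haplotype
    c, d = control_hap, control_total - control_hap   # controls: with / without haplotype
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, the upper-tail chi-square probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Assumption: 384 subjects per group, two chromosomes each.
chromosomes = 2 * 384
# Assumption: counts reconstructed from the rounded frequencies in the text.
case_hap = round(0.42 * chromosomes)
control_hap = round(0.36 * chromosomes)

chi2, p_value = allelic_chi_square(case_hap, chromosomes, control_hap, chromosomes)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

A test of this kind treats each chromosome as an independent observation, which is a common simplification for allelic association but ignores haplotype phase uncertainty.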
Analysis of candidate genes on other chromosomes: We also analyzed a number of RA candidate genes located outside chromosome 18. CARD15/NOD2 is located at 16q12.1, and mutations in this gene have been associated with Crohn's disease, Blau syndrome, and psoriatic arthritis. Among 376 RA cases and matched controls, we found no association of the Crohn's disease mutations, or of any of the common CARD15 haplotypes, with RA. We also studied RUNX1 (21q22.12) and two genes with RUNX1 binding sites, SLC9A3R1 (17q25.1) and SLC22A4 (5q23.3), all of which have been associated with arthritis in various populations. In 376 cases and controls, we found only a modest association of a RUNX1 haplotype (p = 0.008) with RA. Our most significant results to date among the non-chromosome 18 candidate genes are for PADI4 (1p36.13), which has been associated with RA in the Japanese population. Among 573 cases and 751 controls, we observed a haplotype association with a p value of 0.0001.
Collaborative studies of candidate genes: NARAC collaborators at Celera Diagnostics have been engaged in studies of additional candidate genes. This collaboration recently resulted in the identification of a functional RA-associated SNP in PTPN22, which encodes a hematopoietic-specific protein tyrosine phosphatase. The risk allele is present in 28% of RA patients and 17% of healthy controls, and is also associated with type 1 diabetes.
Conclusions and Significance
Data from the past year support the notion that multiple non-HLA genes confer modest levels of risk for RA. Although we were not able to confirm an association of RANK on chromosome 18q, our new data suggest the strong possibility that another gene in this region is involved in RA susceptibility. During the next year, we plan to focus considerable attention on the chromosome 18q candidate gene noted above. As a part of the NARAC collaboration, we will also continue to conduct in-depth analysis of regions identified by more global screens.
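To give a sense of the effect size implied by the PTPN22 frequencies reported above (risk allele present in 28% of RA patients and 17% of healthy controls), the sketch below converts the two frequencies into an odds ratio. It assumes the percentages are carrier frequencies among subjects. The report does not give the sample sizes for this comparison, so no confidence interval or p value is attempted. The function name is hypothetical.

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying the risk allele, computed from the fraction
    of carriers among cases and among controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

# Frequencies taken from the text; treating them as carrier frequencies
# is an assumption.
ptpn22_or = odds_ratio(0.28, 0.17)
print(f"PTPN22 carrier odds ratio: {ptpn22_or:.2f}")
```

An odds ratio computed this way describes carriers versus non-carriers; an allele-based odds ratio would require the underlying genotype counts.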