This renewal application aims to identify all of the major common genetic variants that underlie susceptibility to rheumatoid arthritis (RA), and to begin to identify rare susceptibility alleles, if they exist. In preliminary data we have identified a number of candidate genes and regions on the basis of linkage analysis in multiplex RA families, as well as by whole-genome association studies using approximately 550,000 SNPs on a panel of over 900 RA patients and matched controls. We now wish to identify the specific causal variants and understand their mode of action. In specific aim 1, we will identify the causal genetic variants within the common genes that confer risk for RA. We have already identified several genes and regions of interest, including STAT4 on chromosome 2q. In specific aim 1a, we will replicate these initial associations in case-control datasets totaling up to 5,000 patients. Various methods of genomic control for population stratification will be used in these replication studies. In specific aim 1b, we will carry out fine mapping of candidate regions. This will generally involve haplotype analysis using custom sets of SNP markers. In specific aim 1c, we will use several approaches to identify the likely causative genetic variants in each gene under study; examples of these approaches are given for STAT4. In specific aim 2, we will apply a staged approach to identify gene-gene and gene-environment interactions that contribute to RA susceptibility. The top-performing markers in the univariate analyses of specific aims 1a and 1b will be examined for interactions using Classification and Regression Tree (CRT) methods as well as traditional logistic regression. Top-performing models will be tested in replication datasets of cases and controls. In specific aim 3, we will identify rare genetic variants that contribute to RA susceptibility. This aim is based on preliminary analysis indicating that "slightly deleterious" SNPs (sdSNPs) are a significant component of the genetic burden underlying complex disease. These sdSNPs are enriched in the low-frequency (MAF < 5%) component of the SNP population. We will initially investigate a limited number of candidate genes by high-throughput sequencing on the Solexa platform, with follow-up analysis in large case-control datasets. Larger-scale and more comprehensive approaches to this issue may be employed in later years, depending on technical advances in the field.
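
As an illustration of the low-frequency criterion mentioned for specific aim 3, the sketch below computes a minor allele frequency (MAF) from diploid genotype counts and flags variants with MAF < 5%. It is a minimal sketch, not the application's analysis pipeline: the function names, the variant labels, and the genotype counts are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Cutoff quoted in the text: low-frequency variants have MAF < 5%.
LOW_FREQ_THRESHOLD = 0.05

# Genotype counts as (homozygous A, heterozygous, homozygous B).
GenotypeCounts = Tuple[int, int, int]


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Return the minor allele frequency from diploid genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq_a, 1.0 - freq_a)


def low_frequency_variants(
    counts: Dict[str, GenotypeCounts],
    threshold: float = LOW_FREQ_THRESHOLD,
) -> Dict[str, float]:
    """Return {variant: MAF} for variants whose MAF is below the threshold."""
    flagged = {}
    for name, (n_aa, n_ab, n_bb) in counts.items():
        maf = minor_allele_frequency(n_aa, n_ab, n_bb)
        if maf < threshold:
            flagged[name] = maf
    return flagged


# Hypothetical genotype counts for 1,000 individuals per variant.
example_counts: Dict[str, GenotypeCounts] = {
    "var_1": (950, 48, 2),
    "var_2": (600, 350, 50),
    "var_3": (1, 38, 961),
}

if __name__ == "__main__":
    for name, maf in low_frequency_variants(example_counts).items():
        print(f"{name}: MAF = {maf:.3f}")
```

In practice, allele frequencies would be estimated from the sequenced cases and controls, and the 5% threshold would be applied before any test of association with RA.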