The majority of cancers present as complex phenotypes that arise through gene-gene and/or gene-environment interactions. An ideal paradigm for the investigation of complex cancer phenotypes in humans is primary hepatocellular carcinoma (HCC). Primary liver cancer is the third most common cause of cancer-related deaths, with a rising incidence in western countries. The development of HCC is associated with several major risk factors, including chronic hepatitis B and C infection, exposure to aflatoxin, and liver cirrhosis (LC). The variability in outcome following the same environmental exposure and the familial clustering of HCC suggest genetic susceptibility. Our previous study identified EPHX and GSTM1 as HCC susceptibility loci in a Chinese population. Both genes are involved in detoxification of aflatoxin in hepatocytes. The goal of this project is to use genetic analysis to assess the role of genes in well-described pathways in determining the risk of HCC. This approach merges gene mapping and candidate locus studies by including as candidates all the members of a pathway. Each gene of interest is "tagged" with multiple polymorphic sites, in or near it, to identify genetic factors modulating the risk of developing HCC among populations exposed to AFB1. The individual members of each family (GSTA1, GSTM1, GSTM3, GSTP, GSTT1, GST12, EPHX1, EPHX2, GSTA4, GSTT2, GSTZ1, STP, COMT, ESD, DTD, CYP, MGST1) have been tagged with new or published polymorphisms, and their role in HCC risk has been examined in a nested case-control population. The loci GSTM1, GSTP, GSTT1, and EPHX1 showed significant association with HCC risk, while the EPHX2 locus was associated with age of onset. When results were stratified by the HBV status of the case, GSTM1 and GSTT1 were associated only in the HBV(+) cases, while GSTP was associated in the HBV(-) cases. These results indicate that these genes are candidates for more detailed functional and genetic analysis.
Variation at the 15 candidate cancer susceptibility loci is currently being examined in a large case-control study (n=550 cases and 550 controls). Genetic information in complex trait analysis may be accessible from the joint study of heritable variation and somatic (tumor) variation in cancer. HCC tumor/normal pairs were examined using a collection of genome-wide simple tandem repeat polymorphism (STRP) markers, candidate loci, and the 1,300 single nucleotide polymorphisms (SNPs) present on the Affymetrix HuSNP chip. These data were analyzed to identify regions of loss of heterozygosity (LOH) and were correlated with gene expression data collected from the same samples using Affymetrix HG-U95A chips containing 12,000 characterized genes. More than 16 LOH signatures of HCC were generated across 22 chromosomes. We found that the number of cancer genes (tumor genes and tumor suppressor genes) was significantly higher in regions of LOH than in regions without LOH. In addition, through phylogeny reconstruction studies, we demonstrated that these LOH signatures correlate significantly with gene expression results, and we identified two LOH signatures, 4q13.3 and 17q11.2, that may be important in generating the HCC LOH signature. This study has now been expanded to include expression data from Affymetrix HG-U133 chips (45,000 probe sets) and SNP data from the Affymetrix Mapping 10K Array (10,000 SNPs) for refining the regions of LOH. Data have also been generated to investigate the relationship between chromosome copy number and loss of heterozygosity using in-house algorithms and the Affymetrix CCNT tool. Additional experiments are being carried out using the latest Affymetrix SNP6.0 arrays. Each SNP Array 6.0 carries over 1.8 million markers of genetic variation (including >900,000 SNPs and >940,000 copy number probes).
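The basic idea behind calling LOH from tumor/normal pairs can be illustrated with a minimal sketch. It is not the in-house algorithm mentioned above. It assumes genotypes are encoded as the strings "AA", "AB", "BB", or "NC" (no call), and that a marker is informative only when the normal tissue is heterozygous. The function names call_loh and loh_fraction are hypothetical.

```python
def call_loh(normal, tumor):
    """Classify each marker of a tumor/normal pair.

    A marker is informative only if the normal genotype is heterozygous ("AB").
    An informative marker shows LOH if the tumor genotype is homozygous.
    The genotype encoding ("AA", "AB", "BB", "NC") is an assumption of this sketch.
    """
    if len(normal) != len(tumor):
        raise ValueError("normal and tumor genotype lists must be the same length")
    calls = []
    for n, t in zip(normal, tumor):
        if n != "AB" or t not in ("AA", "AB", "BB"):
            # Homozygous or missing normal genotype, or missing tumor call.
            calls.append("uninformative")
        elif t == "AB":
            calls.append("retained")
        else:
            calls.append("LOH")
    return calls


def loh_fraction(calls):
    """Fraction of informative markers showing LOH, or None if none are informative."""
    loh = calls.count("LOH")
    informative = loh + calls.count("retained")
    return loh / informative if informative else None


if __name__ == "__main__":
    normal = ["AB", "AA", "AB", "AB", "NC"]
    tumor = ["AA", "AA", "AB", "BB", "AB"]
    calls = call_loh(normal, tumor)
    print(calls, loh_fraction(calls))
```

In practice a region would be called LOH only when a run of adjacent informative markers shows loss, which guards against isolated genotyping errors; that smoothing step is omitted here.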
Using Affymetrix SNP6.0 arrays, we generated genome-wide genotyping data from 550 cases and 550 controls. In addition, 20 pairs of tumor/normal liver tissues were analyzed on the same platform. The estimated total number of genotypes is 1.1 billion. Genotype calls for the HCC samples were generated with the Affymetrix Power Tools. SNPs not in Hardy-Weinberg equilibrium in controls (p < 0.001) were excluded from further analysis. SNP association analysis was performed with PLINK using a logistic model; p-values were adjusted by Bonferroni correction. For pathway analysis, the 1,000 most significant SNPs from the single-marker association analysis were selected. Regions of significance were defined by identifying SNPs in linkage disequilibrium with these markers. Genes in regions of significance were evaluated for enrichment in NCI-curated pathways using a hypergeometric (Fisher's exact) test. For copy number analysis, samples were analyzed using the Affymetrix Genotyping Console CNAT program with default parameters and the HapMap270 reference mode, as well as the circular binary segmentation (CBS) algorithm. A total of 422,062 non-overlapping genomic segments were identified in the Stage 1 samples. Copy number variation (CNV) segments associated with HCC were identified using a 2x3 Fisher's exact test. Segments with p-values below 1x10^-4 were tested in the Stage 2 samples; p-values were adjusted by Bonferroni correction. TaqMan real-time PCR assays were used to confirm the SNP6.0 CNV results for the genes of interest. Copy number was determined using the standard curve method of absolute quantitation, with normalization to albumin as an internal reference. For the tumor/normal paired liver tissues, we identified genetic abnormalities including loss of heterozygosity and copy number variation.
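Three of the statistical steps above (the Hardy-Weinberg filter, the Bonferroni adjustment, and the hypergeometric pathway enrichment test) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the pipeline actually used: the Hardy-Weinberg test here is the 1-degree-of-freedom chi-square approximation (PLINK's own filter uses an exact test), and the function names hwe_chi2_pvalue, hypergeom_upper_tail, and bonferroni are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts.

    Assumption: chi-square approximation, chosen for brevity in this sketch.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def hypergeom_upper_tail(k, pathway_size, selected, total):
    """P(X >= k): probability of seeing at least k pathway genes among
    `selected` genes drawn from `total` genes, `pathway_size` of which
    belong to the pathway."""
    denom = math.comb(total, selected)
    upper = min(pathway_size, selected)
    return sum(
        math.comb(pathway_size, i) * math.comb(total - pathway_size, selected - i)
        for i in range(k, upper + 1)
    ) / denom


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Control genotype counts for one hypothetical SNP; drop it if p < 0.001.
    p_hwe = hwe_chi2_pvalue(260, 230, 60)
    print("HWE p-value:", p_hwe, "keep SNP:", p_hwe >= 0.001)
    # Hypothetical enrichment: 8 of 40 pathway genes among 200 selected from 20,000.
    p_enrich = hypergeom_upper_tail(8, 40, 200, 20000)
    print("Adjusted enrichment p-value:", bonferroni(p_enrich, 100))
```

The counts in the usage section are invented for illustration only. With many pathways tested, the Bonferroni multiplier would be the number of pathways evaluated.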
Data collected on gene expression, candidate loci, and somatic allele loss were integrated via hierarchical clustering of the expression data and correlation of the resulting clusters with variation at candidate susceptibility loci. Using the analyses described above, we identified a strong association between HCC and copy number variation at two genes involved in the immune response (p < 1x10^-15) when cases were contrasted with controls. This variation appears to be somatic in origin, reflecting differences in these immune response genes between cancer patients and healthy individuals. SNP analysis identified other susceptibility loci, including several immune-related genes, that are associated with HCC (p = 1x10^-11). Our copy number variation analysis, single-marker SNP analysis, and multi-SNP pathway analysis indicate that somatic events and germline factors affecting the immune response are important in HCC susceptibility.