Single nucleotide polymorphisms (SNPs) are an abundant and useful source of genetic markers for understanding complex diseases. SNPs in the coding regions (cSNPs) of biologically important genes are likely to alter the function of the protein product. We have identified variants in the estrogen receptor gene (ESR1) and genotyped these variants in a cohort of breast cancer patients. The results show that a haplotype defined by several polymorphisms in this gene is associated with a reduced risk of breast cancer. In addition, we are characterizing variants in the ESR2 and progesterone receptor genes. While variants in these genes were not found to be associated with breast cancer, they may be relevant to other hormone-responsive cancers. Using high-performance computing on a Cray supercomputer, we developed new bioinformatics tools to detect large repetitive sequence blocks, such as segmental duplications. We are developing tools that use these data to determine the copy number of specific sequences, allowing insertions and deletions of large blocks of DNA to be detected. This research has applications to contiguous gene syndromes as well as to the genetic instability of cancer cells.