Breast cancer is the most commonly diagnosed malignancy in the United States. The age-adjusted mortality rate of this cancer is more than 40% higher in African Americans (AAs) than in whites, for reasons that remain poorly understood. Since 2007, genome-wide association studies (GWAS) conducted in Asian and European descendants have identified nearly 100 susceptibility loci for this cancer. However, only a few of the initially identified risk variants can be directly replicated in AAs, owing to small sample sizes in previous studies and to racial differences in genetic architectures and genetic/environmental modifiers. GWAS are often not equipped to study structural variants and are inefficient for capturing low-frequency variants. These variants, although virtually uninvestigated to date, are believed to contribute substantially to the heritability of breast cancer and other complex traits, particularly in African-ancestry populations. Furthermore, compared with Asian- and European-ancestry populations, the African-ancestry genome is much more heterogeneous and thus more informative, particularly as we expand the scope of genetic studies from common to less-common variants using next-generation sequencing technology. Herein, we propose a large consortium study in AAs to systematically search the whole genome to discover novel genetic susceptibility factors for breast cancer and further evaluate the influence of germline risk variants on breast cancer biology. Nearly 20,000 AA breast cancer patients and an equal number of controls will be included in this study. In Stage 1, we propose to sequence the whole genome for 1,200 breast cancer cases and 600 controls for association analyses. We will then use these sequencing data, along with data from other sources, to build a novel, comprehensive reference panel for imputation and meta-analysis of approximately 6,300 cases and 6,300 controls genotyped in four previous GWAS conducted in African-ancestry populations.
We will utilize publicly available genetic data, including functional genomic data, to enhance the ability of the two aforementioned analyses to identify promising breast cancer susceptibility genes and variants for replication. In Stage 2, we will replicate approximately 60,000 promising variants in 5,500 cases and 5,500 controls. Genes and variants that show a promising association in Stage 2 will be evaluated further in Stage 3, which comprises two sub-stages (3A and 3B), in approximately 7,500 cases and 7,500 controls. Finally, we will use gene expression signatures to evaluate how germline risk variants identified in this study and previous studies affect the major signaling pathways of breast cancer. This proposed study will generate critically needed data in AAs to improve the understanding of the genetics, biology, and etiology of breast cancer.