Advancements in the field of human genetics have created many new opportunities and challenges for statisticians. Cancer genetic studies are now routinely being conducted on a genome-wide basis. Characteristics of the data include missing information and the need to analyze hundreds of thousands or even millions of markers in a single study, which puts a premium on the computational speed of the methods. A fundamental problem of interest is to identify genetic risk factors that predispose some people to develop a particular type of cancer. With a National Cancer Institute Mentored Research Scientist Development Award to Promote Diversity, the applicant plans to focus on developing statistical methods that identify genetic loci contributing to cancer susceptibility. The emphasis will be on methods that are computationally feasible for the analysis of genetic studies with millions of markers. The applicant's research will focus on developing new approaches for association testing with related individuals in structured populations. Case-control association testing has proven to be a valuable tool for the mapping of complex traits. Genetic association studies essentially seek genomic regions where the cases (individuals affected with the trait) and the controls (unaffected individuals) differ significantly. Case-control association methods, however, are not robust to population stratification, that is, the presence of subgroups in the population with different ancestry. A number of approaches have been proposed to control the false positive rate in samples with cryptic population structure, provided that the individuals in the sample are unrelated. Many cancer genetic association studies, however, contain related individuals, and there has been little focus on methods that correct for unknown population structure in samples with related individuals.
Cryptic population and pedigree structure can lead to spurious associations, and the applicant proposes using genome-screen data to infer both pedigree and population structure in the sample. Statistical methods that incorporate this structure will be developed to (1) better control the false positive rate and (2) improve the power to detect susceptibility variants in structured samples with related individuals. Statistical methods will also be developed to accommodate quantitative traits and the analysis of X-chromosome markers in cancer genetic studies for samples with related individuals. The methods will be applied to ongoing prostate cancer and haematological cancer genetic studies with collaborators in Australia, as well as a number of cancer genetic studies from the Gene Environment Association Studies (GENEVA) program with collaborators at the University of Washington. Furthermore, the applicant plans to implement the methods in freely available software, which will allow for ready use by statisticians and biologists alike and ensure broad dissemination of the methods to the scientific community. One of the applicant's future research goals is to develop statistical methods that improve the power to detect causal cancer genes by incorporating relevant environmental covariates in the model. Most types of cancer are complex disorders that are influenced by intricate interactions between genes and environmental factors, and an epidemiological perspective is an important part of understanding the etiology of such disorders. The applicant also hopes to contribute to the area of optimal study design in cancer genetic studies, as well as to methods that differentiate causal markers from merely associated markers. Ultimately, the aim of the proposed research is to facilitate a better understanding of the complicated biological processes of cancer genetics using statistical methodology.
PUBLIC HEALTH RELEVANCE: The biological processes underlying many types of cancer are not well understood, and for this reason, identifying genes that cause or influence the disease is of great importance. Cancer genetic studies are now routinely being conducted on a genome-wide basis to identify regions of the genome that are involved in the disease. We focus on developing novel statistical methodology for cancer genetic studies that have samples with related individuals, where the ancestry may not be completely known.