Project 2: Genomics and Bioinformatics
Abstract
The main aim of this project is to assess, enhance and develop bioinformatics capacity and tools for genome analysis in African populations, and to generate insights from the data from AWI-Gen Phase 1 and Phase 2. In AWI-Gen Phase 1 we have characterized ethno-linguistic diversity in our four participating countries and six collection sites (Table 1). Sixty whole genomes were sequenced at high coverage from the two West African populations (26 Kaseena and 34 Mossi, from Ghana and Burkina Faso respectively). We are in the process of generating whole-genome sequencing (WGS) data (a mixture of high and low coverage) for 275 more individuals (Table 2), so that all the other major ethno-linguistic groups in our study will be represented by WGS data.
In AWI-Gen Phase 2, we plan to use these data to estimate the level of genetic diversity that accompanies the ethno-linguistic diversity across all our centers. The WGS data will provide insights into genetic diversity in geographic regions that are highly under-represented in existing genomic datasets. They will also give us an opportunity to assess the quality of genotyping achieved by the H3Africa SNP array, which is enriched for common African population variation, and will provide an immensely valuable resource for imputation. Moreover, the WGS data, together with the genome-wide SNP data from the H3Africa array on our ~12,000 participants, will enable us to investigate the extent to which existing genetic sub-structure might affect association signals and predicted risk levels in various groups. We also plan to perform an in-depth analysis of the two phases of phenotype data along with the genotype data to understand gene-gene and gene-environment interactions, and to infer causal factors using Mendelian randomization-based approaches.
Using this dataset, we further intend to combine existing and novel approaches to generate disease risk models in the ~12,000 participants and to validate some of the predictions using the longitudinal data that will be generated in the next phase of the AWI-Gen study. Finally, through an in-depth comparison of the distribution of genetic variants of disease and pharmacogenomic relevance in African populations, we aim to provide insights that could inform a precision public health approach at the level of individual participants as well as at the level of the population.