The central role of the infant nasopharyngeal (NP) microbiome in the development of pneumonia or wheezing illness has been demonstrated by analysis of specific components of the microbiome in children in developed countries. However, no comprehensive, longitudinal study of the NP microbiome has yet been undertaken, and none has related the NP microbiome to these outcomes. We will apply sequence-, culture- and specific PCR-based approaches to define the composition and dynamics of the infant NP microbiome from birth to 2 years of age and to determine the association between the NP microbiome and pneumonia or wheezing in African infants. We will further investigate the association between risk factors for pneumonia or wheezing and the NP microbiome. Studies will be nested within an existing, funded birth cohort study. All 500 children enrolled in the cohort will have NP sampling at birth, at 6 weeks and then every 6 months for the first 2 years. In addition, NP sampling will be done at two-week intervals in a sub-group of 300 children over the first year of life. Samples will be archived for later retrieval and analysis in nested case-control studies. All cases of pneumonia, first-episode wheezing or recurrent wheezing will be prospectively identified and investigated for etiology. Case-control studies of children with pneumonia or wheezing will then be done, with controls matched by age, clinic site and HIV status selected from the same study population. Archived NP samples from cases and controls will be retrieved for detailed assessment of the composition and dynamics of the NP microbiome. For pneumonia cases we will focus on the samples collected in the 3 months preceding pneumonia to identify changes associated with near-term progression. For recurrent wheezing we will focus on the association with the composition of the NP microbiome at birth, 6 weeks, 6 months and 12 months of age.
Techniques used to define the microbiome will include culture for common respiratory pathogens, specific multiplex molecular detection of 33 different viral, fungal and bacterial pathogens, and sequence-based microbiome analysis for bacterial microorganisms. Analysis will use a time-series approach, which will allow adjustment for seasonal variation and assessment of the evolution of diversity. To identify determinants of pneumonia or wheezing at each time point we will use weighted generalized ridge regression methods, which are able to select variables in a high-dimensional setting. We will further gauge the strength of the predictors thus identified as a function of time. To integrate viral and bacterial data, both components of the microbiome will be included in the seasonal ARIMA (SARIMA) time-series model. Capacity building is a major focus. In particular, we propose to perform the complete microbiome pipeline within South Africa, including high-throughput sequencing. Substantial emphasis is placed on training, including training visits both to and from the J. Craig Venter Institute (JCVI), a training workshop and training of postgraduate students. By focusing on training and ensuring that the complete pipeline (from sequence generation to final association analysis) is performed in South Africa, we aim to build independent and sustainable capacity in this field.