Background: Trachoma, caused by ocular infection with Chlamydia trachomatis, is the leading cause of infectious blindness worldwide and has been targeted for global elimination as a public health problem by 2020. This goal will be achieved in many countries, but some regions in Ethiopia maintain persistently high levels of infection despite more than 10 years of intensive control activities. A small proportion of the population likely harbors the majority of trachoma infections, with foci of infection ("hotspots") at or below the village scale; the operational challenge is accurately predicting where these hotspots are using existing data. Advances in machine learning and spatial data science have markedly improved the spatial resolution of predictions for diseases such as malaria. Among available biomarkers of trachoma, IgG antibody responses in children could enable more accurate predictions because they integrate exposure over time and reflect recent transmission. Aims: The principal aims of this study are to evaluate whether antibody measurements can identify stable hotspots of trachoma infection, and whether a novel machine learning approach can accurately predict village-level trachoma infection forward in time (up to 3 years). We hypothesize that infection will be concentrated in the population and that hotspots of infection will be at the village level. We further hypothesize that antibody measurements in young children will provide a stable source of information about trachoma transmission that will enable us to accurately predict villages with high levels of future C. trachomatis infection. Methods: To test our hypotheses, we will draw on measurements from a well-characterized population across 40 villages enrolled in an NIH-funded cluster randomized trial in Ethiopia's Amhara Region (U10-EY023939).
The three-year trial is designed to measure the effect of improved water, sanitation, and handwashing (WASH) on trachoma infection in the absence of azithromycin treatment. The trial has collected clinical and biomarker measurements from approximately 2,400 children aged 0-9 years at enrollment and at annual visits over 3 years. We will characterize the spatial scale of transmission using the τ-statistic, which equals the relative risk of infection within different distances of cases. We will use a permutation-based spatial scan statistic to identify hotspots using IgG antibody and PCR measures in each year, and will determine whether they are stable over time. Using geospatial ensemble machine learning, we will predict trachoma seroprevalence as a function of remotely sensed geospatial information and limited enrollment characteristics. We will rank villages by predicted seroprevalence and will assess the proportion of PCR-detected C. trachomatis infections found in top-ranked villages 1, 2, and 3 years later. We will repeat the analysis using predicted clinical symptoms as a comparator. The development of methods to make accurate, fine-scale predictions of future C. trachomatis infection will lay the groundwork for a future adaptive randomized trial that preferentially allocates more intensive intervention to villages predicted at enrollment to have high future levels of infection.
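
Illustrative sketch: Two of the quantities named in the Methods can be made concrete with a short example. The first is the distance-based relative-risk statistic (commonly written as the τ-statistic). The second is the share of later infections found in the villages ranked highest by prediction. The Python below is a minimal sketch under stated assumptions, not the study's analysis code. It assumes a simple prevalence-ratio form of the statistic and coordinates in a projected system. The function names, village labels, and all numbers are hypothetical and chosen only for illustration.

```python
from math import hypot


def tau_statistic(coords, infected, d_min, d_max):
    """Prevalence-ratio form of the tau-statistic (an assumed, simplified form).

    coords:   list of (x, y) positions in a projected system (e.g. metres).
    infected: list of 0/1 infection indicators, in the same order as coords.
    Returns the prevalence of infection among people located d_min <= d < d_max
    from a case, divided by the prevalence among people at any distance from a case.
    """
    near_pairs = near_infected = all_pairs = all_infected = 0
    for i, (xi, yi) in enumerate(coords):
        if not infected[i]:
            continue  # only pairs anchored on a case contribute
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                continue
            all_pairs += 1
            all_infected += infected[j]
            if d_min <= hypot(xi - xj, yi - yj) < d_max:
                near_pairs += 1
                near_infected += infected[j]
    if near_pairs == 0 or all_infected == 0:
        return float("nan")  # ratio undefined without pairs or without other cases
    return (near_infected / near_pairs) / (all_infected / all_pairs)


def share_in_top_ranked(predicted, future_infections, top_k):
    """Fraction of all future infections located in the top_k villages,
    where villages are ranked from highest to lowest predicted value."""
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    total = sum(future_infections.values())
    if total == 0:
        return float("nan")
    return sum(future_infections[v] for v in ranked[:top_k]) / total


# Hypothetical toy data: two cases close together, two non-cases far away.
toy_coords = [(0, 0), (1, 0), (100, 0), (101, 0)]
toy_infected = [1, 1, 0, 0]
print(tau_statistic(toy_coords, toy_infected, 0, 10))

# Hypothetical predicted seroprevalence and later PCR-positive counts by village.
toy_predicted = {"A": 0.40, "B": 0.10, "C": 0.30, "D": 0.05}
toy_future = {"A": 12, "B": 2, "C": 5, "D": 1}
print(share_in_top_ranked(toy_predicted, toy_future, top_k=2))
```

A value of the first statistic above 1 within a short distance band indicates clustering of infection near cases at that scale. A value of the second quantity close to 1 for a small number of top-ranked villages would indicate that the ranking captures most future infections. In practice, uncertainty for both would be assessed by resampling, for example a bootstrap or permutation procedure.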