PROJECT SUMMARY/ABSTRACT
The goal of this P01 project is to identify biomarkers that will enable us to predict the likely duration of the lag phase or "remission" period prior to HIV rebound following discontinuation of antiretroviral therapy (ART) in HIV-infected individuals. In our study, a large number of virologic and immunologic parameters will be measured in a cohort of ~125 well-characterized HIV-infected individuals undergoing analytical treatment interruption (ATI), to determine whether any of these measurements allow us to reliably predict the kinetics of viral rebound after ART cessation. Our proposed experiments will generate a large amount of high-dimensional data, including next-generation sequencing data (transcriptomes and microRNA profiles) and CyTOF data. The Bioinformatics and Biostatistics Core will play a leading role in compiling, curating, analyzing, and disseminating data generated in all three projects associated with this P01 application. To maximize our chances of identifying meaningful signatures predicting time until viral rebound, we will implement several statistical approaches and ensemble learning methods (e.g., gradient boosting, random forests) to develop predictive models, and we will rely on established classifier performance evaluation procedures (e.g., cross-validation, recursive feature elimination, and feature importance measures) to rigorously determine the predictive potential of the biomarkers under consideration. In Aim 1 of our Bioinformatics and Biostatistics Core project, we will evaluate the capacity of individual putative blood cell-associated biomarkers studied in Project 2 to predict time until viral rebound following ATI. Measurements include the frequency of replication-competent proviral genomes in CD4+ T cells and global characterization of the host cell transcriptome.
In Aim 2, we will evaluate the capacity of individual putative cell-free plasma- and CSF-derived biomarkers studied in Project 3 to predict time until viral rebound following ATI. Measurements include circulating microRNA profiles, extracellular vesicle phenotype, and multiplex cytokine and antibody characterization. Lastly, in Aim 3, we will perform a combined analysis of biomarkers across all three projects (including CyTOF immunophenotypic data generated in Project 1) to assess their relative performance and to identify potential synergies between predictors. Ensemble learning methods are well suited to discovering complex combinations of predictive features. They also provide a framework for evaluating the predictive importance of candidate biomarkers both individually and in combination with other biomarkers. The Bioinformatics and Biostatistics Core will play a central role in achieving our P01 objectives and in advancing the HIV cure agenda by transforming copious, diverse, high-dimensional data into robust predictors of HIV rebound following ART interruption.