BACKGROUND: The single-genome sequencing assay (SGS, sometimes called SGA) that we developed and published 11 years ago (Palmer et al., J. Clin. Microbiol. 43:406-413, 2005) remains the gold standard for genetic analysis of viral populations. Although next-generation sequencing (NGS) offers the potential for studying virus populations in much greater depth, PCR error, bias, and recombination during library construction have limited its use to population sequencing and measurements of unlinked allele frequencies. In this project, we are developing a new method for NGS library construction that reduces PCR bias and error, eliminates PCR recombinants from the final datasets, and generates thousands of single-genome sequences of the same quality as SGS but with 100-fold more variants. This new, ultrasensitive SGS (uSGS) assay will be used not only for detecting linkage of rare alleles, including drug-resistance mutations, but also for in-depth phylogenetic analyses of viral populations. Standard methods for NGS fall short of reproducing the most important properties of SGS, which virtually eliminates PCR artifacts but is constrained by the limited number of sequences that can be obtained (Palmer et al., J. Clin. Microbiol. 43:406-413, 2005). To address some of these issues, primer IDs (molecular tags comprising 4-10 random nucleotides) are incorporated into cDNA synthesis primers so that each cDNA molecule generated by reverse transcription is uniquely labeled (Jabara et al., PNAS 108:20166-20171, 2011). Primer ID-tagged cDNAs are then amplified by PCR and daughter amplicons are sequenced by NGS. Next, sequence reads are binned by their common primer ID, revealing PCR template resampling. Alignment of binned sequences facilitates identification of PCR errors and PCR recombination such that one consensus sequence can be generated from the alignments in each bin. 
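The binning and consensus steps described above can be illustrated with a minimal sketch. It assumes that each read has already been split into its primer ID and an insert sequence, and that the inserts within a bin are pre-aligned and of equal length. The function names, the `min_reads` cutoff, and the example reads are hypothetical and are not taken from the published pipelines.

```python
from collections import Counter, defaultdict


def consensus(reads):
    """Majority-rule consensus of pre-aligned, equal-length reads.

    Ties are broken by whichever base was seen first in the column,
    which is a simplification chosen for this sketch.
    """
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


def consensus_by_primer_id(tagged_reads, min_reads=3):
    """Bin (primer_id, sequence) pairs and return one consensus per bin.

    min_reads is an assumed cutoff: bins with fewer reads are dropped
    because a consensus cannot outvote a PCR error in so few reads.
    """
    bins = defaultdict(list)
    for primer_id, sequence in tagged_reads:
        bins[primer_id].append(sequence)
    return {
        primer_id: consensus(sequences)
        for primer_id, sequences in bins.items()
        if len(sequences) >= min_reads
    }


# Hypothetical reads: two templates resampled three times, one singleton.
example_reads = [
    ("ACGTACGT", "ATGCA"),
    ("ACGTACGT", "ATGCA"),
    ("ACGTACGT", "ATGTA"),  # one read carries a PCR error at position 4
    ("TTGACCAG", "ATGGA"),
    ("TTGACCAG", "ATGGA"),
    ("TTGACCAG", "ATGGA"),
    ("GGGGCCCC", "ATGCA"),  # singleton bin, dropped by min_reads
]
result = consensus_by_primer_id(example_reads)
print(result)  # {'ACGTACGT': 'ATGCA', 'TTGACCAG': 'ATGGA'}
```

In this toy input the single discordant read in the first bin is outvoted by its two siblings, so the bin yields one consensus sequence for the original cDNA template.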
Although PCR errors can also arise within the primer ID itself, filtering techniques can be used to detect and exclude primer IDs with PCR errors (Zhou et al., J. Virol. 89:8540-8555, 2015). As such, primer IDs are extremely effective in identifying errors introduced during NGS library generation and provide the only means by which rare allele frequencies in HIV RNA populations may be accurately detected and measured using PCR and NGS. For library generation, NGS requires the attachment of adaptor sequences for library capture, amplification, and sequencing in the Illumina flow cell. One current method to attach these adaptors employs PCR primers containing lengthy 5'-terminal extensions. We and others showed that this long primer PCR method (LP-PCR) produces high levels of PCR recombination as well as inefficient and non-uniform amplification (Shao et al., Retrovirology 10:18, 2013; Jabara et al., PNAS 108:20166-20171, 2011). Recombinant sequences constitute a large fraction of the data, making the final results unreliable for identifying rare haplotypes or for performing accurate phylogenetic analysis. Accordingly, we are developing a new method for NGS library construction that amplifies a higher fraction of cDNA molecules with significantly reduced PCR bias and recombination. PCR errors and in vitro recombinants are detected and removed through a novel analysis pipeline, resulting in sequence data that rival the accuracy and reliability of the SGS assay but with 100-fold greater sequencing depth. Our method combines limited-cycle PCR with a highly efficient method of adaptor ligation. We call the new approach ultrasensitive SGS (uSGS).

ACCOMPLISHMENTS: When we compared this new approach to conventional LP-PCR for targeted NGS library generation, we found that LP-PCR cannot detect linkage among rare variants because of its very high in vitro recombination rates, and that the uSGS method can correct this deficiency.
The experiments completed thus far show that uSGS results in more complete sampling of cDNA libraries and, together with more stringent filtering of the sequences, provides datasets that are nearly free of PCR error and PCR recombination. Consequently, our novel uSGS methodology is the most effective means developed to date for studying HIV-1 population structure and evolution as well as for detecting linkage among rare alleles. We plan to apply our new uSGS method to complete our studies on the emergence of HIV drug-resistance mutations, in which we are investigating the linkage of low-frequency mutations in women with and without prior exposure to a single dose of nevirapine (sdNVP). A consequence of the high genetic diversity of HIV-1 infection in vivo is the existence of drug-resistance mutations in the replicating population before the initiation of antiretroviral therapy (ART). Our past studies included investigating the frequency with which these preexisting drug-resistance mutations are present in the plasma of treatment-naive patients and RT-SHIV-infected macaques, determining their effect on subsequent therapy, and determining the overall impact of HIV-1 diversity on treatment outcome (Boltz et al., PNAS 108:9202-9207, 2011; Boltz et al., J. Virol. 86:12525-12530, 2012; Boltz et al., J. Infect. Dis. 209:703-710, 2014). For example, we investigated the effect of low-frequency NVP-resistance mutations on subsequent combination ART in women with and without previous exposure to sdNVP (A5208 Trial 1 and Trial 2). We found that women who had prior exposure to sdNVP and carried NVP-resistance mutations at a frequency above 1.0% before combination ART were at significantly higher risk of virologic failure when given NVP-containing ART than women without detectable NVP resistance.
However, women who did not receive sdNVP and had low-frequency preexisting NVP-resistance mutations were not at increased risk for virologic failure, even though these mutations were sometimes present in the plasma at the same frequencies as in women who had received sdNVP and failed ART. We hypothesized that the association of virologic failure with preexisting, low-frequency NVP-resistance mutations in women who had prior exposure to sdNVP was due to linkage of NVP-resistance mutations to mutations that confer resistance to other classes of inhibitors used in the combination ART. Linkage of drug-resistance mutations likely occurred in these women shortly after receiving sdNVP, when the frequencies of NVP-resistance mutations were at much higher levels (up to 80% of the total virus population). The high frequency of NVP-resistant variants likely allowed these mutations to become linked to other preexisting drug-resistance mutations in the population that arose stochastically, ultimately resulting in failure of subsequent combination ART. By contrast, NVP-resistance mutations were most likely never present at high frequencies in women who had not been exposed to sdNVP and, therefore, likely did not become linked to other preexisting drug-resistance mutations. As a result, ART was able to achieve sustained suppression in women with NVP-resistance mutations that arose stochastically rather than under the selective pressure of sdNVP. To test this hypothesis, we are developing an assay that will allow us to investigate the linkage of low-frequency drug-resistance mutations in the plasma, and we plan to use this assay on pre-ART samples from women in A5208 to determine their linkage profiles. We will apply our new uSGS method to complete these final studies of the linkage of low-frequency mutations in women with and without prior exposure to sdNVP.
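The linkage question posed above amounts to asking how often two mutations sit on the same genome rather than merely in the same sample. A minimal sketch of that tally follows, assuming each single-genome sequence has already been reduced to the set of drug-resistance mutations it carries. The function names and the example data are hypothetical, and the mutation pair K103N and M184V is used for illustration only.

```python
def linkage_counts(genomes, mut_a, mut_b):
    """Tally genomes carrying both, exactly one, or neither mutation.

    genomes is a list of sets, one set of mutation names per
    single-genome sequence (an assumed input format for this sketch).
    """
    counts = {"both": 0, "only_a": 0, "only_b": 0, "neither": 0}
    for mutations in genomes:
        has_a = mut_a in mutations
        has_b = mut_b in mutations
        if has_a and has_b:
            counts["both"] += 1
        elif has_a:
            counts["only_a"] += 1
        elif has_b:
            counts["only_b"] += 1
        else:
            counts["neither"] += 1
    return counts


def linked_fraction(genomes, mut_a, mut_b):
    """Fraction of mut_a-carrying genomes that also carry mut_b.

    Returns None when no genome carries mut_a.
    """
    counts = linkage_counts(genomes, mut_a, mut_b)
    carriers = counts["both"] + counts["only_a"]
    return counts["both"] / carriers if carriers else None


# Hypothetical data: six single-genome sequences from one sample.
example_genomes = [
    {"K103N", "M184V"},
    {"K103N"},
    {"M184V"},
    set(),
    set(),
    {"K103N", "M184V"},
]
example_counts = linkage_counts(example_genomes, "K103N", "M184V")
example_fraction = linked_fraction(example_genomes, "K103N", "M184V")
print(example_counts)
print(example_fraction)
```

Population sequencing of the same sample would report only the two allele frequencies. The per-genome tally is what separates linked from unlinked mutations, which is why PCR recombinants must be removed before counting.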
After completing these studies, we will use uSGS to address questions related to the transmission, evolution, and reservoir of HIV.