Studying the variability in the HIV genome can provide many insights into how the virus adapts to pressures exerted by the host and by therapeutic agents. However, the high level of variability in HIV obscures the relationship between genotypes (i.e., the nucleotide or amino-acid sequences) and the phenotypes acquired by the virus along its evolutionary path. For instance, the patterns of substitutions that confer specific levels of resistance to protease inhibitors have not been fully identified, nor have the mutations across the whole envelope gene that are associated with a change in cell tropism. Similarly, understanding the outcome of cross-neutralization experiments involving isolates from different continents is important for vaccine design, but here again the elevated substitution rate in the envelope region gives rise to polymorphisms that can erroneously be regarded as key neutralization targets. The aims of the proposed research are to apply statistical methodology to investigate these issues and to develop appropriate methods where existing ones are lacking. In particular, statistical procedures to detect correlated mutations are outlined; they take into account the underlying phylogenetic relationships among the sequences considered. Further, methodological extensions to classification and regression trees are proposed to link phenotypes of interest to genomic information, allowing several sequences from a single individual to be included in the analysis.