Direct evidence of extensive diversity of HIV-1 in Kinshasa by 1960

Michael Worobey, Marlea Gemmel, Dirk E Teuwen, Tamara Haselkorn, Kevin Kunstman, Michael Bunce, Jean-Jacques Muyembe, Jean-Marie M Kabongo, Raphaël M Kalengayi, Eric Van Marck, M Thomas P Gilbert, Steven M Wolinsky

Abstract

Human immunodeficiency virus type 1 (HIV-1) sequences that pre-date the recognition of AIDS are critical to defining the time of origin and the timescale of virus evolution. A viral sequence from 1959 (ZR59) is the oldest known HIV-1 infection. Other historically documented sequences, important calibration points to convert evolutionary distance into time, are lacking, however; ZR59 is the only one sampled before 1976. Here we report the amplification and characterization of viral sequences from a Bouin's-fixed paraffin-embedded lymph node biopsy specimen obtained in 1960 from an adult female in Léopoldville, Belgian Congo (now Kinshasa, Democratic Republic of the Congo (DRC)), and we use them to conduct the first comparative evolutionary genetic study of early pre-AIDS epidemic HIV-1 group M viruses. Phylogenetic analyses position this viral sequence (DRC60) closest to the ancestral node of subtype A (excluding A2). Relaxed molecular clock analyses incorporating DRC60 and ZR59 date the most recent common ancestor of the M group to near the beginning of the twentieth century. The sizeable genetic distance between DRC60 and ZR59 directly demonstrates that diversification of HIV-1 in west-central Africa occurred long before the recognized AIDS pandemic. The recovery of viral gene sequences from decades-old paraffin-embedded tissues opens the door to a detailed palaeovirological investigation of the evolutionary history of HIV-1 that is not accessible by other methods.

Figures

Fig. 1
(A) The HIV-1 genome fragments that were successfully amplified from DRC60 (red) and available for ZR59 (black). The numbering for the HIV-1 sequences corresponds to the HXB2 reference sequence (Table S1). (B) The A/A1 subtree from the unconstrained (no molecular clock enforced) BMCMC phylogenetic analysis. Figure S1 depicts the complete phylogenetic tree (50% majority rule consensus tree of the posterior sample, with branch lengths averaged across the sample). Posterior probabilities are shown on nodes with support > 0.95. 1960.DRC60A is the University of Arizona consensus sequence, and 1960.DRC60N is the Northwestern University consensus sequence (i.e. the sequences independently recovered in each of the two laboratories). (C) Smoothed histograms of within- (A2, A/A1, B, C, D, F1, F2, H, J, K) and between-subtype distances.
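The within- and between-subtype comparison in panel (C) can be illustrated with a minimal sketch. It assumes pre-aligned sequences and uses the uncorrected p-distance (the proportion of differing sites) as a simple stand-in; the caption does not state which distance measure was used, and the function names and the toy input layout below are hypothetical.

```python
from itertools import combinations


def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous 'N' are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in (gap, "N") or y in (gap, "N"):
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def split_distances(seqs):
    """Split all pairwise distances into within- and between-subtype lists.

    `seqs` maps a sequence name to a (subtype, aligned_sequence) tuple;
    this layout is an assumption made for the example.
    """
    within, between = [], []
    for (_, (s1, q1)), (_, (s2, q2)) in combinations(seqs.items(), 2):
        d = p_distance(q1, q2)
        (within if s1 == s2 else between).append(d)
    return within, between
```

The two returned lists could then be passed to any histogram or density routine to produce curves of the kind the caption describes.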
Fig. 2
Maximum clade credibility topology inferred using BEAST v1.4.7 under a Bayesian skyline plot tree prior. Branch lengths are depicted in unit time (years) and represent the median of those nodes that were present in at least 50% of the sampled trees. DRC60 (red), ZR59 (black), and the three control sequences from paraffin-embedded specimens from known AIDS patients (gray) are depicted in bold. The 95% HPD of the TMRCA is indicated at the root of the tree. Nodes (sub-subtype and deeper) with posterior probability of 1.0 are marked with a large circle. Unclassifiable strains are labeled ‘U’. Sequences sampled in the DRC are highlighted with a bullet at the tip. DRC60 and the two control sequences from the DRC each form monophyletic clades with previously published sequences from the DRC, whereas the Canadian control sequence clusters, as expected, with subtype B sequences. The dashed circle and shaded area show the extensive HIV-1 diversity in Kinshasa in the 1950s. Figure S2 shows the tree in rectangular form with taxon labels.
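For readers unfamiliar with the 95% HPD (highest posterior density) interval reported for the TMRCA, the following is a minimal sketch of how such an interval can be computed from a list of posterior samples. It assumes a unimodal posterior and is not BEAST's own implementation; the function name `hpd_interval` is hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples.

    Assumes the posterior is unimodal, so the HPD region is one interval.
    """
    if not samples:
        raise ValueError("need at least one sample")
    xs = sorted(samples)
    n = len(xs)
    # Number of consecutive sorted samples the interval must cover.
    k = max(1, math.ceil(mass * n))
    best_lo, best_hi = xs[0], xs[k - 1]
    # Slide a window of k samples and keep the narrowest one.
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi
```

Applied to sampled root ages (for example, one TMRCA value per retained MCMC state), the function returns the lower and upper bounds of the interval.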
Fig. 3
The origin and growth of the major settlements near the epicenter of the HIV-1 group M epidemic. In the countries surrounding the putative zone of cross-species transmission (current-day Cameroon, Central African Republic, DRC, Republic of Congo, Gabon, and Equatorial Guinea) there was not a single site with a population exceeding 10,000 until after 1910. The founding date of each major city in the region is listed beside its name. Most were founded only shortly before the estimated TMRCA of group M. The demographic data come from reference .

Source: PubMed
