Cloning the human betaretrovirus proviral genome from patients with primary biliary cirrhosis


  • Data deposition: AF513913, AF513914, AF513915, AF513916, AF513917, AF513918, AF513919, AF513920, AF513921, AF513922, AF513923, AY326252, and AY326253.


Patients with primary biliary cirrhosis (PBC) have both serologic and tissue evidence of infection. A recently identified human betaretrovirus was originally cloned from the biliary epithelium cDNA library of a patient with PBC. By conducting a BLASTN search, the initial partial pol gene fragment was found to have 95% to 97% nucleotide homology with mouse mammary tumor virus (MMTV) and with retrovirus sequences derived from human breast cancer samples. Using an anti-p27CA MMTV antibody, viral proteins were detected in the perihepatic lymph nodes but not in liver tissue samples from patients with PBC, suggesting a higher viral burden in lymphoid tissue. Therefore, in the current study, we used lymph node DNA to clone the proviral genome of the human betaretrovirus from two patients with PBC using a polymerase chain reaction (PCR) walking methodology with conserved primers complementary to MMTV. The human betaretrovirus genome contains five potential open reading frames (ORF) for Gag, protease (Pro), polymerase (Pol), envelope (Env), and superantigen (Sag) proteins that are collinear with their counterparts in MMTV. Alignment studies performed with characterized MMTV and human breast cancer betaretrovirus amino acid sequences revealed a 93% to 99% identity with the p27 capsid proteins, a 93% to 97% identity with the betaretrovirus envelope proteins, and a 76% to 85% identity with the more variable superantigen proteins. Phylogenetic analysis of known betaretrovirus superantigen proteins showed that the human and murine sequences did not cluster as two distinct species. In conclusion, human betaretrovirus nucleic acid sequences have been cloned from patients with PBC. They share marked homology with MMTV and human breast cancer-derived retrovirus sequences. (HEPATOLOGY 2004;39:151–156.)

Primary biliary cirrhosis (PBC) is characterized by the histologic appearance of granulomatous destruction of small intrahepatic bile ducts and the presence of serum anti-mitochondrial antibodies (AMA) reactive with the dihydrolipoamide acetyltransferase component of pyruvate dehydrogenase complex (PDC-E2).1 The mechanisms that lead to loss of immune tolerance to the mitochondrial autoantigens are poorly understood. It is possible that patients with PBC make AMA because proteins resembling PDC-E2 are aberrantly expressed on the plasma membrane of interlobular duct cells and on perihepatic lymph node macrophages,2 even though the transcriptional activity of PDC-E2 in biliary epithelial cells (BEC) is not increased.3

Although the etiology of autoimmune disease remains obscure, it is believed that these disorders develop as a result of an infectious agent or environmental trigger in genetically susceptible individuals. We have explored an infectious etiology of PBC by taking two separate investigative approaches to discover a putative agent. In the first set of experiments, we found that the majority of patients with PBC had antibody reactivity to a retrovirus isolated from patients with Sjögren's syndrome; we observed viruslike particles by electron microscopy in BEC from patients with PBC; and we cloned a retrovirus pol gene sequence from a PBC BEC cDNA library.4, 5 Tissue samples showed evidence of viral infection in the majority of PBC patients and immunohistochemistry studies showed that viral proteins colocalized to cells with aberrant mitochondrial autoantigen expression.4 In the second set of experiments, we devised an in vitro model of PBC by coculturing homogenized lymph nodes from patients with PBC with normal BEC. After 7 days in culture, the BEC developed aberrant expression of the PDC-E2-like protein, a phenotypic manifestation of PBC.6 The transmissible factor promoting the PBC phenotype in serial passage was gamma radiation sensitive. In addition, it had the hydrodynamic properties of an enveloped retrovirus and the morphologic features of a B-type particle by electron microscopy, demonstrated reverse transcriptase activity, and contained the viral sequences originally detected in the PBC BEC cDNA library.4

Because the proviral sequence cloned from PBC lymph nodes is very closely related to the murine betaretrovirus, mouse mammary tumor virus (MMTV), we refer to this putative PBC-related agent as the human betaretrovirus (HBRV) in accordance with the International Committee on Taxonomy of Viruses.7 We previously characterized HBRV infection in patients with PBC and referred to the initial cloning as unpublished data.4 In the current study, we report the cloning of an additional incomplete and a complete proviral genome from the lymph nodes of two patients with PBC using a “polymerase chain reaction (PCR) walking” methodology with primers complementary to MMTV. Although our studies do not provide formal proof of a causal relation between the HBRV and the pathogenesis of PBC, the clinical relevance of our findings is underscored by other investigators who have cloned similar HBRV sequences from patients with breast cancer.8–12


Abbreviations: AMA, anti-mitochondrial antibodies; BEC, biliary epithelial cells; HBRV, human betaretrovirus; HIAP, human intracisternal A-type particle; MMTV, mouse mammary tumor virus; ORF, open reading frame; PBC, primary biliary cirrhosis; PCR, polymerase chain reaction; PDC-E2, dihydrolipoamide acetyltransferase component of pyruvate dehydrogenase complex.

Patients and Methods

Betaretrovirus Cloning.

Perihepatic lymph nodes were collected from four PBC and six liver disease control patients at the time of liver transplantation. All patients with PBC had end-stage liver disease and none had received antiretroviral therapy. Total lymph node DNA samples were assessed by nested PCR for evidence of betaretrovirus long terminal repeat sequences, as previously described.4 Two of four patients with PBC had evidence of betaretrovirus nucleic acid sequences and one patient had sufficient DNA sample for cloning. In addition, a total lymph node DNA sample was available from one patient with PBC used in previous studies.4 Lymph node DNA samples were used as the template for nested PCR (Table 1). To confirm specificity for betaretroviral sequences, the products were resolved on an agarose gel, transferred to nitrocellulose filters, and hybridized with [32P]-labeled internal oligonucleotide primers (Southern blot). Southern blot-positive, ethidium bromide-stained PCR products were excised from the gel and cloned into the Topo 2.1 vector using a Topo cloning kit (Invitrogen, Carlsbad, CA). Five to seven clones from each product were partially sequenced from both ends; two of these clones were then selected and sequenced completely and bidirectionally by Lifenhancer (Baltimore, MD). Sequences were confirmed and any ambiguous regions were resolved by further sequencing on an ABI Prism 3700 capillary sequencer (Applied Biosystems, Foster City, CA) followed by analysis with Sequencher software (Gene Codes, Ann Arbor, MI) at the University of Oklahoma Health Sciences Center Laboratory for Microbial Genomics.
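The logic of nested PCR, in which a second primer pair amplifies a shorter product from within the first-round product, can be illustrated with a minimal in-silico sketch. Everything here is an assumption made for illustration: the helper names, the toy template, and the 10-nt primers are invented and are not the primers listed in Table 1. The sketch also requires exact primer matches, whereas real annealing tolerates mismatches and depends on temperature.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplify(template: str, sense_primer: str, antisense_primer: str):
    """Return the product bounded by two exactly matching primers, or None.

    Both primers are written 5' to 3'; the antisense primer is located by
    searching the sense strand for its reverse complement.
    """
    template = template.upper()
    start = template.find(sense_primer.upper())
    if start == -1:
        return None
    site = reverse_complement(antisense_primer)
    end = template.find(site, start + len(sense_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical primer sites on a toy template (not real viral sequence).
outer_s = "AGGTCCATGA"
inner_s = "CTTGACGGAT"
inner_a_site = "GGCATTCAGT"
outer_a_site = "TCAGGACTTA"
template = (
    "TTTT" + outer_s + "A" * 15 + inner_s + "C" * 30
    + inner_a_site + "A" * 12 + outer_a_site + "GGGG"
)
outer_a = reverse_complement(outer_a_site)
inner_a = reverse_complement(inner_a_site)

first_round = amplify(template, outer_s, outer_a)      # outer primer pair
second_round = amplify(first_round, inner_s, inner_a)  # nested primer pair
print(len(first_round), len(second_round))
```

The second-round product lies entirely inside the first-round product, which is why a nested product of the expected size adds specificity beyond a single round of amplification.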

Table 1. PCR Cloning Methods for the Human Betaretrovirus Proviral Genome From a PBC Patient's Lymph Node DNA
Clones* | Outer Primers: Length of Product | Inner Primers: Length of Product | Methods | Homology
166-1 | MMLTRs1 | MMLTRs2 | Annealing: 58°C for first and second | 91% to 99%
 | PCR product ∼ 1.5 kb | Nested PCR product ∼ 1.4 kb
168-2 | Mgag1 | Mgag3 | Annealing: 55°C for first and 50°C for second | 94% to 99%
AF513914 | CAGGCAAGCGAAAGGGCAAGCAATTCCGCCTCCTGGAGTT | PCR: 30, 30, and 90 s for first and 30, 30, and 60 s for second
 | 9910a06 | MMproa 1
 | PCR product ∼ 1.4 kb | Nested PCR product ∼ 1.1 kb
169-2 | 9910s06 | 9910s07 | Annealing: 50°C for first and second | 95% to 99%
 | PCR product ∼ 1.0 kb | Nested PCR product ∼ 0.9 kb
170-4 | 9910s09 | 9910s10 | Annealing: 55°C for first and 54°C for second | 93% to 97%
AF513916 | CTTGGGAGAGGTTCATTTCCATCTTACAGACGGGTCAGCAAATGG | PCR: 30, 30, and 90 s for first and 20, 20, and 60 s for second
 | PCR product ∼ 1.5 kb | Nested PCR product ∼ 1.3 kb
173-1 | pols3 | pols4 | Annealing: 45°C for first and 46°C for second | 95% to 99%
AF513917 | GCCACAGGGTATGAAAAATAGGTATGAAAAATAGCCCTACTTTATG | PCR: 20, 20, and 60 s for first and 10, 10, and 20 s for second
 | PCR product ∼ 1.3 kb | Nested PCR product ∼ 0.6 kb
174-1 | U5s1 | U5s2 | Annealing: 55°C for first and 57°C for second | 95% to 99%
AF513918 | TCTCCGCTCGTCACTTATCCTTCGCTCGTCACTTATCCTTCACTTTCC | PCR: 30, 30, and 72 s for first and 20, 20, and 60 s for second
 | PCR product ∼ 1.1 kb | Nested PCR product ∼ 0.9 kb
175-3 | 9910s03 | 9910s03 | Annealing: 55°C for first | 93% to 99%
AF513919 | GTCTCTCCTTGGTTTCCCGAAGGTCTCTCCTTGGTTTCCCGAAG | PCR: 30, 30, and 120 s and 30, 30, and 72 s for primer sets (1) and (2)
 | (1) 9910a07 | Mgag4
 | PCR product ∼ 1.9 kb | Nested PCR product ∼ 1.4 kb | PCR: 15, 15, and 72 s
 | (2) 9910a08
 | PCR product ∼ 1.5 kb
176-2 | 9910s11 | 9910s12 | Annealing: 55°C for first and 57°C for second | 92% to 97%
 | PCR product ∼ 1.7 kb | Nested PCR product ∼ 1.3 kb
177-3 | Envs4 | Envs5 | Annealing: 55°C for first and 54°C for second | 93% to 97%
238-86 | 9841a | 9842a | PCR: 30, 30, and 90 s for first and second
 | PCR product ∼ 1.4 kb | Nested PCR product ∼ 1.4 kb
186-2 | pols3 | 9901s3 | Annealing: 45°C for first and 55°C for second | 94% to 99%
AF513922 | GCCACAGGGTATGAAAAATAGTGAATGAGAGACTATCTACCGATAG | PCR: 20, 20, and 60 s for first and 15, 15, and 30 s for second
 | PCR product ∼ 1.3 kb | Nested PCR product ∼ 0.8 kb
190-1 | MMLTRs1 | MMLTRs2 | Annealing: 52°C for first and second | 92% to 99%
AF513923 | AGAAATGGTTGAACTCCCGAGAGTTGTTTCCCACCAAGGACGAC | PCR: 30, 30, and 60 s for first and second | 91% to 99%
237-71 | PCR product ∼ 1.2 kb | Nested PCR product ∼ 1.1 kb

  • * National Center for Biotechnology Information accession numbers accompany clones with complete open reading frames.

  • Methods: annealing temperature and duration for denaturing, annealing, and extension steps.

  • Homology: by BLASTN search to full-length sequences from either MMTV or HBRV from breast cancer patients.

Sequence Analysis.

Cloned betaretrovirus sequences were aligned using the AssemblyLIGN module for MacVector (Accelrys, San Diego, CA) to form a contiguous proviral genome, assessed for complete open reading frames (ORF), and translated. Individual clones were compared with all known sequences in the GenBank and Swiss-Prot databases using either MacVector 7.1.1 or the National Center for Biotechnology Information search tool for BLASTN or BLASTP searches (done June 2003).13 Phylogenetic analysis of retroviral protein sequences was performed using the ClustalW alignment program (MacVector 7.1.1). The neighbor-joining method was used, with 1,000 bootstrap trials performed to assess the confidence of the resulting tree.
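Percent identity between two aligned sequences can be computed in several ways, and the convention used by MacVector or BLAST is not stated here. As a minimal sketch under one common assumption (gapped columns are excluded from the comparison), the hypothetical helper below counts matching positions in a pairwise alignment. The function name and the toy peptides are illustrative only and are not data from this study.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumed convention: ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides, invented for illustration.
seq_1 = "MKT-LLVAGE"
seq_2 = "MKTALLVSGE"
identity = percent_identity(seq_1, seq_2)
print(f"{identity:.1f}% identity")
```

Counting gapped columns as mismatches instead would lower the figure, so the convention should be stated whenever identities from different tools are compared.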

All studies were performed with the permission of the internal review and ethics committee boards of the respective institutions.


Results

Using PCR with primers complementary to MMTV, 9,690 bp of a proviral genome was derived from one patient with PBC (AF513913 to AF513923, inclusive) and sufficient DNA sample was available to clone the env and sag genes from a second patient (AY326253, Table 1). The assembled HBRV proviral genome has a similar structure to MMTV. It is flanked by two long terminal repeats and contains five potential collinear ORF of 100 or more codons that encode the Gag, protease (Pro), polymerase (Pol), envelope (Env), and superantigen (Sag) proteins (Fig. 1). As in MMTV, a −1 frameshift is required to generate the Gag-Pro polyprotein and a second −1 frameshift is required to generate the Gag-Pro-Pol polyprotein, whereas the env and sag genes are translated as individual ORF.
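The criterion of 100 or more codons for a potential ORF can be illustrated with a short scan of the three forward reading frames. This is a sketch under stated assumptions, not the procedure used by MacVector: it defines an ORF as running from the first ATG to the next in-frame stop codon, ignores the reverse strand, and does not model ribosomal frameshifting. The function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 100):
    """Return (frame, start, end, codons) for ATG-to-stop ORFs, forward strand.

    start and end are 0-based and end-exclusive, with the stop codon included
    in the span; codons counts coding codons only (stop codon excluded).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                codons = (pos - start) // 3
                if codons >= min_codons:
                    orfs.append((frame, start, pos + 3, codons))
                start = None
    return orfs


# Toy sequence: 2-nt leader, one 120-codon ORF, then one 11-codon ORF.
toy = "CC" + "ATG" + "GCT" * 119 + "TAA" + "ATG" + "GCT" * 10 + "TAG"
orfs = find_orfs(toy)
print(orfs)
```

Only the long ORF passes the threshold. In a scan of this kind, genes joined by a −1 frameshift would be reported in different frames, because the scan treats each frame independently.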

Figure 1.

Representation of the human betaretrovirus (HBRV) genome. (A) The proviral DNA sequence contains two long terminal repeats (LTR). The individual open reading frames that encode Gag (gag), protease (pro), polymerase (pol), envelope (env), and superantigen (sag) proteins are represented in frame with an arrow collinear to the proviral sequence. (B) Individual HBRV clones are aligned to the 9,690 bp proviral sequence. (PBS: primer binding site.)

For all the PBC lymph node clones, the BLASTN searches revealed closest homology and a variable 91% to 99% nucleotide identity with MMTV or the HBRV derived from breast cancer patients.10 The clones had a 0.33% to 1.58% variance from all known MMTV and HBRV nucleotide sequences deposited in the National Center for Biotechnology Information databases and the variance was greater in the sag and env genes than in the more conserved gag and pol genes. The heterogeneity in proviral sequences ranged from 0.3% to 10.3% in separate clones derived from individual patients. The 10.3% variability of the two clones encoding the sag gene derived from one patient is indicative of infection with HBRV quasispecies. The nucleotide variance between the respective betaretrovirus clones isolated from the lymph nodes of patients with PBC was 3.5% to 13.5%, indicating different subtypes of HBRV.
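A range of variance such as those quoted above summarizes every pairwise comparison within a set of clones. The sketch below shows one way such a range could be obtained, assuming the clones are already aligned to equal length and that every differing position counts equally. The helper names and the 20-nt toy clones are invented for illustration and do not correspond to any deposited sequence.

```python
from itertools import combinations


def percent_variance(a: str, b: str) -> float:
    """Percent of differing positions between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)
    return 100.0 * diffs / len(a)


def variance_range(clones: dict) -> tuple:
    """Return (min, max) percent variance over all pairs of clones."""
    if len(clones) < 2:
        raise ValueError("need at least two clones")
    values = [
        percent_variance(clones[p], clones[q])
        for p, q in combinations(sorted(clones), 2)
    ]
    return min(values), max(values)


# Toy aligned clones, invented for illustration.
clones = {
    "clone_a": "ACGTACGTACGTACGTACGT",
    "clone_b": "ACGTACGTACGTACGTACGA",
    "clone_c": "ACGAACGTACCTACGTACGA",
}
low, high = variance_range(clones)
print(f"{low:.1f}% to {high:.1f}%")
```

Here the three pairwise comparisons give 5%, 10%, and 15%, so the reported range is 5.0% to 15.0%.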

To determine the relatedness of different capsid proteins of murine and human origin, a BLASTP search was conducted using the translated PBC gag p27 gene sequence (AF513919). The amino acid sequences of capsid proteins derived from the search were aligned to show a 93% to 99% amino acid identity with the PBC and other murine and human p27 MMTV-like capsid protein sequences (Fig. 2) and a 34% to 35% distant homology to the ovine and simian betaretroviral capsid proteins. The simian retrovirus capsid proteins share 96% to 99% identity with each other but only 52% and 53% identity with the ovine and murine retrovirus capsid, showing that the capsid proteins are highly conserved among the same species (SRV1, SRV2, and MPMV) but not across species (OPAV and MTV1, Fig. 2). As the superantigen proteins show the most amino acid sequence variability between separate subtypes of MMTV, another BLASTP search was performed using the translated HBRV PBC AF513923 sag gene to construct a phylogenetic tree. One might expect that the human and murine viral sequences would group in a species-specific fashion. However, the five human Sag amino acid sequences from PBC (AF513923, AF513924 and AY326253) and breast cancer (n = 2) samples did not cluster distinctly from their murine counterparts (Fig. 3). This limited genetic analysis suggests that HBRV and MMTV cannot be distinguished as separate species at this time.

Figure 2.

Alignment of human betaretrovirus (HBRV) capsid p27 amino acid sequences using ClustalW software, MacVector 7.1.1. A BLASTP search was conducted using the translated primary biliary cirrhosis (PBC) AF513919 gag p27 coding sequence (Human-PBC) to assemble homologous capsid amino acid sequences: breast cancer HBRV p27 (Human-BCa, AAG18015); C3H, BR6, HeJ, and MTV1-related mouse mammary tumor virus p27 capsid sequences (Murine-C3H, Murine-BR6, Murine-HeJ, and Murine-MTV1); ovine pulmonary adenocarcinoma capsid (Ovine-OPAV); and simian betaretrovirus capsid amino acid sequences from simian retrovirus 1, simian retrovirus 2, and Mason Pfizer Monkey Virus (Simian-SRV1, Simian-SRV2, and Simian-MPMV).

Figure 3.

Phylogenetic analysis of betaretrovirus superantigen amino acid sequences, performed using the ClustalW program with a secondary bootstrap analysis. The human and mouse virus proteins do not cluster as separate branches.


Discussion

In the current study, we cloned the complete HBRV proviral genome as well as additional env and sag genes from the lymph node samples of patients with PBC. Although similar studies have been performed in patients with breast cancer, the role of the HBRV as a human pathogen remains a controversial issue.8–12 The criticisms, in part, are related to the use of nested PCR to clone and detect the virus in human samples, which is a valid concern as the currently available human and murine sequences are insufficiently different to distinguish them as separate entities.4, 8–12 Indeed, in the coculture studies, we found that pure mouse-derived isolates of MMTV could induce the same PBC phenotype of PDC-E2 expression in BEC as human-derived tissue samples, whereas other laboratory control viruses had no such effect.4 In a previous analysis, HBRV had little variance and a 93% to 97% identity with other HBRV and murine betaretrovirus Env proteins.4 These data suggest that further studies will be required to determine whether HBRV is a separate species or an isolate from zoonotic infection by MMTV.

We recognize that PCR contamination can provide misleading data. Our original PCR studies and cloning of the PBC BEC cDNA library were performed in laboratories completely naive to MMTV sequences, virus, or MMTV-derived reagents. In the current study, full precautions were used against PCR contamination by performing nucleic acid extraction, mixing PCR reagents, and evaluating PCR products in three separate laboratories using standard good laboratory practice. It is important to note that the observed variance of the PBC HBRV clones derived to date provides adequate reassurance against a point source PCR contamination. The differences are far greater than can be attributed to sequencing or Taq polymerase errors and suggest that different subtypes of HBRV can be found in patients with PBC and breast cancer. In addition, the assembled proviral genome probably represents a mosaic of different provirus sequences or even HBRV strains. Originally, attempts were made to amplify a full-length proviral PCR product but only ∼ 1 kb clones were derived using our PCR walking methodology.

Other diagnostic techniques have provided evidence of betaretrovirus infection in humans. Patients with PBC and breast cancer have serologic reactivity to retroviral proteins, as well as immunochemistry evidence of betaretrovirus proteins in tissue samples.4, 5, 12 Electron microscopy studies have identified virions resembling mature B-type particles in milk from patients with breast cancer, in BEC isolated from PBC hepatectomy specimens, and in serial supernatants of BEC cocultured with homogenized PBC lymph nodes in vitro.4, 12, 14 These data may rekindle interest in an infectious etiology of breast cancer and the link of breast cancer with PBC.15 The natural biology of betaretroviral activation by female hormones during puberty also may explain the lack of PBC in children and the preponderance of disease in women.4

This current study also highlights the prevalence of HBRV infection in patients without PBC, as two of six control lymph node specimens used in this investigation were HBRV positive.4 Based on the model that PBC occurs as a result of infection in predisposed individuals, it is likely that many more subjects have HBRV infection than suffer from PBC. In previous studies, HBRV was only detected in one in five serum samples from patients with PBC by reverse transcription PCR but up to 87% of patients with PBC had serologic reactivity to retroviral proteins.4, 5 More effective serologic tests are now being devised to determine the true frequency of HBRV exposure in patients with liver disease and in the general population.4

In summary, the identification and cloning of HBRV from patients with PBC provide an alternative paradigm for research and management of patients with this chronic liver disease. The recognized earlier and more aggressive recurrence of PBC after transplantation with the use of greater immunosuppressive therapy and the induction of the PBC phenotype in normal BEC with the serial passage of a gamma radiation-sensitive agent provide strong evidence for an infectious etiology of PBC.1, 4, 6 These data are also inconsistent with the notion that the aberrant AMA-reactive protein expression in BEC occurs as a result of immunoglobulin A/PDC-E2 immune complex deposition and that xenobiotics are required to break tolerance for PDC-E2.16 However, further studies using the culture model for PBC to address Koch's postulates in vitro and antiretroviral therapeutics trials to address efficacy in patients will be required to determine whether HBRV plays an etiologic role in the pathogenesis of PBC.


The authors thank R. Garry for technical advice and M. Charlton for providing tissue samples.