The nucleotide sequence of the RNA genome of the human hepatitis C virus (HCV) has been determined from overlapping cDNA clones. The sequence (9379 nucleotides) has a single large open reading frame that could encode a viral polyprotein precursor of 3011 amino acids. While there is little overall amino acid and nucleotide sequence homology with other viruses, the 5′ HCV nucleotide sequence upstream of this large open reading frame has substantial similarity to the 5′ termini of pestiviral genomes. The polyprotein also has significant sequence similarity to helicases encoded by animal pestiviruses, plant potyviruses, and human flaviviruses, and it contains sequence motifs widely conserved among viral replicases and trypsin-like proteases. A basic, presumed nucleocapsid domain is located at the N terminus, upstream of a region containing numerous potential N-linked glycosylation sites. These HCV domains are located in the same relative positions as observed in the pestiviruses and flaviviruses, and the hydrophobic profiles of all three viral polyproteins are similar. These combined data indicate that HCV is an unusual virus that is most closely related to the pestiviruses. Significant genome diversity is apparent within the putative 5′ structural gene region of different HCV isolates, suggesting the presence of closely related but distinct viral genotypes.
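
As an illustration of what identifying a single large open reading frame involves, the following minimal Python sketch scans the three forward reading frames of a nucleotide sequence for the longest stretch running from a start codon to an in-frame stop codon. It is not the method used in the study: the function name `longest_orf`, the requirement that the frame begin with ATG, and the toy sequence are assumptions made for illustration. For scale, a 3011-codon frame spans 9033 nucleotides, plus a stop codon, within the 9379-nucleotide sequence.

```python
# Illustrative sketch only; names and the ATG-start rule are assumptions,
# not taken from the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, codon_count) for the longest ATG-to-stop ORF.

    Only the three forward frames are scanned. `start` and `end` are
    0-based, half-open coordinates; `end` includes the stop codon, while
    `codon_count` excludes it.
    """
    # Accept RNA input by converting U to T.
    seq = seq.upper().replace("U", "T")
    best = (0, 0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons > best[2]:
                    best = (start, i + 3, n_codons)
                start = None  # close the frame and keep scanning
    return best


# Toy sequence (hypothetical): ATG AAA CCC GGG TAG in the third frame.
print(longest_orf("CCATGAAACCCGGGTAGTT"))
```

A frame that opens but never reaches a stop codon is ignored in this sketch; a real analysis would also need to consider partial sequences.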

The nucleotide sequences of cDNAs (275 base-pairs) in the non-structural protein 5 regions of Japanese isolates of hepatitis C virus (HCV-J) from the plasma of 11 patients with non-A, non-B hepatitis and the livers of five patients with hepatocellular carcinoma were analyzed. The nucleotide sequences of the HCV-J isolates examined differed from that of the original isolate in the United States (HCV-US) at approximately 14 to 17% of positions. Furthermore, 2.5 to 11% sequence diversity was found among the HCV-J isolates. The nucleotide sequences of the HCV-J isolates showed characteristic common differences from that of HCV-US, although they also showed some random substitutions. Multiple HCV-J genomes were found in two of the cDNAs derived from liver specimens, and a deletion of 102 nucleotides was found in the cDNA derived from one plasma specimen. These results suggest that HCV-J is a strain different from HCV-US and that mutation of the viral genome occurs at a frequency as high as that of the human immunodeficiency virus genome.
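
The percentage figures above are pairwise nucleotide differences between aligned sequences. A minimal Python sketch of that calculation follows; the function name `percent_difference`, the use of `-` for alignment gaps, and the toy sequences are assumptions for illustration, not details from the study. Over a 275-base-pair fragment, a difference of 14 to 17% corresponds to roughly 39 to 47 differing positions.

```python
# Illustrative sketch only; names and gap handling are assumptions,
# not taken from the study.
def percent_difference(seq_a, seq_b):
    """Percentage of mismatched positions between two aligned sequences.

    Both sequences must already be aligned to the same length. Columns
    where either sequence has a gap ('-') are skipped, so a deletion is
    not counted as a run of substitutions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


# Toy aligned fragments (hypothetical), differing at 2 of 10 positions.
print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))  # 20.0
```

Skipping gapped columns is one design choice among several; counting each gap as a difference would give higher values for a specimen carrying a deletion.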