BSEP and MDR3 haplotype structure in healthy Caucasians, primary biliary cirrhosis and primary sclerosing cholangitis

Authors


Abstract

Primary biliary cirrhosis (PBC) and primary sclerosing cholangitis (PSC) are characterized by a cholestatic pattern of liver damage, also observed in hereditary or acquired dysfunction of the canalicular membrane transporters bile salt export pump (BSEP, ABCB11) and multidrug resistance protein type 3 (MDR3, ABCB4). Controversy exists whether a genetically determined dysfunction of BSEP and MDR3 plays a pathogenic role in PBC and PSC. Therefore, 149 healthy Caucasian control individuals (control group) were compared to 76 PBC and 46 PSC patients with respect to genetic variations in BSEP and MDR3. Sequencing spanned ∼10,000 bp including promoter and coding regions as well as 50–350 bp of flanking intronic regions. In all, 46 and 45 variants were identified in BSEP and MDR3, respectively. No differences between the groups were detected either in the total number of variants (BSEP: control group: 37, PBC: 37, PSC: 31; and MDR3: control group: 35; PBC: 32, PSC: 30), or in the allele frequency of the common variable sites. Furthermore, there were no significant differences in haplotype distribution and linkage disequilibrium. In conclusion, this study provides an analysis of BSEP and MDR3 variant segregation and haplotype structure in a Caucasian population. Although an impact of rare variants on BSEP and MDR3 function cannot be ruled out, our data do not support a strong role of BSEP and MDR3 genetic variations in the pathogenesis of PBC and PSC. (HEPATOLOGY 2004;39:779–791.)

Primary biliary cirrhosis (PBC) and primary sclerosing cholangitis (PSC) are chronic hepatic disorders that slowly progress to end-stage liver failure in most affected patients. The inflammatory process of PBC predominantly affects the small bile ducts, while PSC is characterized by inflammation, fibrosis, and obstruction of large- and medium-sized intra- and extrahepatic ductuli.1–4

The etiology and pathogenesis of PBC and PSC are still unknown, but multiple mechanisms are likely to play a role. First, a genetic predisposition is supported by an increased prevalence in families with one affected member5 and the overrepresentation of certain HLA alleles in PBC and PSC patients.6–8 Second, a contribution of immunological mechanisms is suggested by the tight association of PBC and PSC with autoantibodies such as antimitochondrial antibodies (AMA) in PBC or antineutrophil cytoplasmic antibodies (ANCA) in PSC.6, 9 Third, the inflammatory reaction in liver and bile ducts has been hypothesized to result from a recurrent inflammatory stimulus through entry of infectious microorganisms into the portal circulation.2, 10, 11

In addition to an immunologic insult, a major pathogenic role has been attributed to the intracellular accumulation of toxic bile acids,12, 13 which can cause concentration-dependent cellular damage.14, 15 Under physiological conditions, the secretion of bile salts and other bile constituents across the canalicular membrane of hepatocytes is mediated by various ABC (ATP-binding cassette) transporters. Two members of the family of multidrug resistance P-glycoproteins, the bile salt export pump BSEP (ABCB11) and the phospholipid translocase MDR3 (ABCB4), are the main players in this process.16 BSEP is the predominant bile salt efflux system of hepatocytes and mediates the cellular excretion of numerous conjugated bile salts. MDR3 was shown to act as an ATP-dependent phospholipid flippase, translocating phosphatidylcholine from the inner to the outer leaflet of the canalicular membrane.16 Inherited dysfunction of BSEP and MDR3 has been established as the underlying cause of familial cholestatic syndromes, such as progressive familial intrahepatic cholestasis type II and III, respectively. Furthermore, inherited MDR3 alterations have been proposed to contribute to intrahepatic cholestasis of pregnancy.17–20 Whether a genetically determined functional disturbance of BSEP and MDR3 might also play a role in the pathogenesis of PBC and PSC is the subject of ongoing discussion. Two lines of evidence suggest a contribution of an inherited dysfunction of these proteins to the development of PBC and PSC. First, although elevated serum bile acids in these patients could be a secondary phenomenon following the destruction of bile ducts and hepatocellular damage, a primary contribution of preexisting BSEP or MDR3 dysfunction cannot be ruled out.12, 13 Second, mdr2-knockout mice, which lack expression of the murine homolog of human MDR3, develop liver lesions resembling sclerosing cholangitis characterized by biliary strictures and dilatations.21 Furthermore, a missense mutation in codon 535 of exon 14 of the MDR3 gene was recently associated with a familial cholestatic syndrome, leading to adulthood biliary cirrhosis in one of the affected carriers.22 Hence, primary MDR3 dysfunction could lead to decreased biliary phospholipid concentrations, thereby increasing toxic bile duct damage through the excess of unsolubilized intrabiliary bile salts.

In this study, sequence diversity and haplotype structure in BSEP and MDR3 were first established in a collective of healthy Caucasian individuals. We then tested the hypothesis that BSEP and MDR3 variant segregation and haplotype structure in PBC and PSC patients differ from those observed in healthy individuals with the same ethnic background.

Abbreviations

PBC, primary biliary cirrhosis; PSC, primary sclerosing cholangitis; BSEP, bile salt export pump; MDR3, multidrug resistance 3.

Patients and Methods

Healthy Controls and Patient Characteristics.

After ethical approval and written informed consent, blood samples for DNA extraction were obtained from 149 Caucasian volunteers and 76 PBC as well as 46 PSC patients of Caucasian origin. DNA samples from Caucasian volunteers (control group) were obtained from healthy individuals selected for participation in Phase I clinical trials. Two different cohorts of healthy volunteers were used for BSEP and MDR3 sequencing, resulting in a total of 93 individuals (male: 73, female: 20; age, 37.5 ± 1.6 years; height: 175.5 ± 9.0 cm; weight: 76.5 ± 13.4 kg) and 56 individuals (male: 47, female: 9; age: 38.2 ± 14.1 years; height: 176.9 ± 8.5 cm; weight: 75.3 ± 10.8 kg) sequenced for BSEP and MDR3, respectively. DNA samples from PBC and PSC patients were from the PBC and PSC cohort at the Department of Medicine II, Klinikum Grosshadern, University of Munich. Identical PBC and PSC DNA samples were analyzed for the presence of BSEP and MDR3 sequence variability. Due to scarce DNA sampling, 6 PBC samples were only sequenced for MDR3 but not for BSEP, yielding 70 and 76 PBC samples sequenced for the presence of BSEP and MDR3 genetic variation, respectively.

Diagnosis of PBC was based on a cholestatic serum enzyme pattern in the presence of antimitochondrial antibodies in serum and compatible histological features in a liver biopsy specimen.23 Diagnosis of PSC was based on a cholestatic serum enzyme pattern, typical cholangiographic findings of strictures and dilatations of small and large bile ducts, and compatible histological features in a liver biopsy specimen.1 The female/male ratio was 76/0 and 12/34 for PBC and PSC patients, respectively. The age at inclusion was 58 ± 10 years for PBC and 39 ± 12 years for PSC patients. Bilirubin values and Mayo Risk Scores, the strongest outcome parameters in PBC and PSC,24, 25 are given in Table 1. Mayo Risk Scores were calculated according to Murtaugh et al.24 and Kim et al.25 for PBC and PSC, respectively; both scores are based on clinical and biochemical markers of prognostic value.

Table 1. Biochemical and Clinical Parameters of Prognostic Value in PBC and PSC Patients

| Parameter | PBC, BSEP: Bilirubin | PBC, BSEP: Mayo RS (ref. 24) | PBC, MDR3: Bilirubin | PBC, MDR3: Mayo RS (ref. 24) | PSC, BSEP: Bilirubin | PSC, BSEP: Mayo RS (ref. 25) | PSC, MDR3: Bilirubin | PSC, MDR3: Mayo RS (ref. 25) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| n | 66 | 59 | 72 | 65 | 43 | 32 | 43 | 32 |
| Mean ± SD | 1.66 ± 3.36 | 5.17 ± 1.86 | 2.15 ± 5.15 | 5.27 ± 2.13 | 1.07 ± 1.17 | −1.12 ± 0.97 | 1.07 ± 1.17 | −1.12 ± 0.97 |
| Range (min; max) | 0.24; 21.6 | 1.86; 10.48 | 0.24; 35.12 | 1.13; 13.14 | 0.20; 7.80 | −3.08; 2.41 | 0.20; 7.80 | −3.08; 2.41 |

NOTE. Bilirubin values are given in mg/dl. The upper normal limit for bilirubin is 1 mg/dl; n indicates the number of subjects for whom the respective parameters were available. Mayo Risk Scores (Mayo RS) were calculated according to Murtaugh et al.24 and Kim et al.25 for PBC and PSC, respectively, including patients' age, bilirubin and albumin values, prothrombin time and the presence of edema for PBC and patients' age, bilirubin, albumin and aspartate aminotransferase levels and history of variceal bleeding for PSC.

Sequencing.

Genomic and cDNA sequences were derived from known sequences (BSEP: GenBank accession number AC008177.3 for promoter and exon 1–21, AC069165.2 for exon 22–28 and NM_003742 for cDNA; MDR3: AC005068.2 for promoter and noncoding exons –3 to 1, AC006154.1 for exons 4–12, AC0005045.2 for exons 13–28 and NM_000443 for cDNA). Primers for genomic DNA were designed to span exons and at least 50 bp of flanking intronic sequences at the 5′ and 3′ ends. The DNA sequences of purified PCR fragments were obtained on an ABI3700 capillary sequencer (ABI, Weiterstadt, Germany) and assembled using the phredPhrap, consed, and polyphred software (University of Washington). Details regarding the primers, optimized PCR conditions, and subsequent purification and sequencing of the fragments are available on request from info@epidauros.com.

Haplotype Analysis.

Haplotypes were statistically inferred using the program PHASE, an algorithm based on Bayesian inference.26 PHASE calls were made using the entire sample population. Variant sites were used for haplotype analysis only if they were present in two or more chromosomes in the entire sample set. Haplotypes were inferred by running PHASE a total of 10 times.

We did not assign haplotype names, as nomenclature changes can be anticipated from the expected identification of additional SNPs in BSEP and MDR3 through resequencing of bigger sample sets and populations with different ethnic backgrounds. However, to allow referral to special haplotypes in this article, a frequency-based priority criterion was used to name important haplotypes (i.e., BSEP_1 for the most frequent BSEP haplotype).

Linkage Disequilibrium and Inference of Intragenic Recombination.

Linkage disequilibrium for each pair of segregating sites was quantified by the D′ and r² statistics, and the significance of the associations was assessed by Fisher's exact test. Analysis of the inferred haplotypes was performed with the DnaSP package.27
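As an illustration of the two statistics named above, the following minimal Python sketch computes D, D′ and r² for one pair of biallelic sites from the counts of the four two-site haplotypes. It is not the DnaSP implementation; the function name `ld_stats` and the argument order are assumptions made for this example, and D′ is reported as an absolute value.

```python
def ld_stats(n_ab, n_aB, n_Ab, n_AB):
    """Return (D, |D'|, r^2) for two biallelic sites.

    Arguments are chromosome counts of the four two-site haplotypes,
    where A/a are the alleles at site 1 and B/b the alleles at site 2.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    if n == 0:
        raise ValueError("no chromosomes supplied")

    p_AB = n_AB / n               # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n       # allele frequency of A at site 1
    p_B = (n_AB + n_aB) / n       # allele frequency of B at site 2

    # D: deviation of the observed haplotype frequency from independence.
    d = p_AB - p_A * p_B

    # D' normalises D by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two sites.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

Two sites whose variant alleles always occur on the same chromosomes give |D′| = 1 and r² = 1, whereas sites that combine at random give values of 0.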

Statistical Analysis.

To detect differences in haplotype distribution between groups, a Fisher's exact test was performed for each haplotype separately. PBC and PSC patients were compared to healthy controls. Haplotypes encountered significantly more frequently in PBC or PSC patients by Fisher's exact test were further assessed through ROC (receiver operating characteristic) analysis to explore the power to accurately predict the diagnosed disease.
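The per-haplotype comparison described above can be sketched as a two-sided Fisher's exact test on a 2 × 2 table of chromosome counts (haplotype present or absent, patients or controls). This is a minimal standard-library illustration, not the procedure of the statistics packages used in the study; the function name and the counts in the usage line are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    a: patient chromosomes carrying the haplotype
    b: patient chromosomes not carrying it
    c: control chromosomes carrying the haplotype
    d: control chromosomes not carrying it
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical counts, for illustration only (not data from this study).
p_value = fisher_exact_two_sided(12, 80, 5, 181)
```

Because one test is run per haplotype, a small P value for a single rare haplotype should be read with the number of comparisons in mind.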

Furthermore, logistic regression analysis was used to test for an association between two prognostic markers in PBC and PSC and BSEP and MDR3 haplotypes: a) bilirubin and b) the Mayo Risk Score. Because of a heavily skewed distribution, bilirubin values were logarithmically transformed prior to analysis. Only haplotypes observed at least 10 times in the patient cohort were included in the analysis. Calculations were done with the StatView 5.01 software (SAS, Cary, NC) and SPSS 11.5 for Windows (Chicago, IL).
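For readers less familiar with logistic regression output, the sketch below shows how an odds ratio per unit of a predictor and its approximate 95% confidence interval follow from a fitted coefficient and its standard error, and how a skewed variable is log-transformed beforehand. The Wald-type interval, the use of the natural logarithm, and the function names are assumptions of this example; the text does not state which variants StatView or SPSS applied.

```python
import math


def log_transform(values):
    """Natural-log transform of strictly positive values (e.g., bilirubin in mg/dl)."""
    return [math.log(v) for v in values]


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio per unit of the predictor with an approximate 95% Wald interval.

    beta: logistic regression coefficient of the predictor
    se:   standard error of that coefficient
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

A coefficient of 0 corresponds to an odds ratio of 1, and an interval that excludes 1 corresponds to a significant association at the 5% level under these assumptions.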

Results

DNA samples from 93 and 56 healthy Caucasians, 70 and 76 PBC patients, and 46 and 46 PSC patients were screened for genetic variations in BSEP and MDR3, respectively. NM_003742 was used as the BSEP reference sequence, which was derived from the BSEP sequence published by Strautnieks et al.17 NM_000443 was used as the MDR3 reference sequence, which was derived from the MDR3 sequence first published by van der Bliek et al. in 1988.28 This variant, denominated variant A of the published MDR3 sequences, represents the predominant form of the gene and lacks both the in-frame insertion found in variant B and the in-frame deletion found in variant C.

BSEP Sequence Variability.

A total of 46 variant sites were found, all of which were in Hardy-Weinberg equilibrium (Table 2). Forty-five variant sites were single nucleotide substitutions, which had only two alternative nucleotides. One variant site in the promoter was a deletion of 4 bp. In the entire collective, 15 variants were in the promoter region, 19 were intronic variants, and 12 were found in the coding region. Thirteen of the variants were only found in a single chromosome; 4 were detected as doubletons. The 12 coding region variants included 5 synonymous and 7 nonsynonymous changes. Six of these nonsynonymous variants were located in the intracellular loop of BSEP and 1 in the transmembrane region 4 (Fig. 1). Five coding region changes were single nucleotide polymorphisms present in all of the studied populations at an allele frequency of >1% (synonymous: exon 5: T270C, exon 10: A957G, exon 24: G3084A; nonsynonymous: exon 13: T1331C → V444A and exon 17: A2029G → M677V). Nonsynonymous changes observed as singletons or doubletons encoded the following amino acid changes: S194P, G260D, V284A, R698H, and A1228V. Alignment of all mammalian BSEP sequences indicated that 5 of the 7 nonsynonymous coding variants were in codons for an evolutionarily conserved amino acid (S194P, V284A, V444A, R698H, and A1228V) (Table 2).
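The amino acid labels of the coding variants follow directly from the cDNA numbering, which counts from the A of the ATG start codon. A minimal sketch of that arithmetic is given below; the function name `codon_of` is an assumption of this example.

```python
def codon_of(cdna_pos):
    """Return (codon number, position within codon) for a 1-based coding position.

    Position 1 is the A of the ATG start codon, so positions 1-3 form codon 1.
    """
    if cdna_pos < 1:
        raise ValueError("coding positions are counted from 1")
    return (cdna_pos + 2) // 3, (cdna_pos - 1) % 3 + 1


# Nonsynonymous BSEP positions reported in the text above.
for pos in (580, 779, 851, 1331, 2029, 2093, 3683):
    codon, offset = codon_of(pos)
    print(f"c.{pos}: codon {codon}, codon position {offset}")
```

Each reported position maps to the codon in its amino acid label; for example, position 1331 falls in codon 444 and position 779 in codon 260.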

Table 2. BSEP Genetic Variability

| Variant Number | Amplicon | DNA Position | cDNA Position | Nucleotide Reference | Nucleotide Variant | AA Change | Control Group (n = 186) | PBC (n = 140) | PSC (n = 92) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pro 1 | | intron-1 | (−2397) | C | T | | 0.005 | 0.000 | 0.000 |
| Pro 2 | | intron-1 | (−2080) | CTCT | delCTCT | | 0.102 | 0.049 | 0.048 |
| Pro 3 | | intron-1 | (−1994) | T | C | | 0.000 | 0.008 | 0.000 |
| Pro 4 | | intron-1 | (−1952) | T | C | | 0.480 | 0.425 | 0.405 |
| Pro 5 | | intron-1 | (−1820) | G | A | | 0.101 | 0.042 | 0.048 |
| Pro 6 | | intron-1 | (−1814) | C | T | | 0.000 | 0.008 | 0.000 |
| Pro 7 | | intron-1 | (−1746) | G | A | | 0.101 | 0.054 | 0.048 |
| Pro 8 | | intron-1 | (−1275) | G | A | | 0.025 | 0.034 | 0.057 |
| Pro 9 | | intron-1 | (−1239) | G | A | | 0.298 | 0.292 | 0.244 |
| Pro 10 | | intron-1 | (−1155) | T | C | | 0.616 | 0.565 | 0.595 |
| Pro 11 | | intron-1 | (−1118) | C | A | | 0.005 | 0.000 | 0.000 |
| Pro 12 | | intron-1 | (−1009) | T | C | | 0.021 | 0.033 | 0.060 |
| Pro 13 | | intron-1 | (−906) | C | T | | 0.026 | 0.016 | 0.036 |
| Pro 14 | | intron-1 | (−837) | A | G | | 0.015 | 0.008 | 0.012 |
| Pro 15 | | intron-1 | (−603) | T | C | | 0.015 | 0.009 | 0.000 |
| 2.1. | amplicon 2 | intron 1 | (−50) | G | A | | 0.025 | 0.029 | 0.054 |
| 3.1. | amplicon 3 | intron 2 | (−37) | T | G | | 0.000 | 0.000 | 0.022 |
| 4.1. | amplicon 4 | intron 3 | (−20) | T | C | | 0.071 | 0.043 | 0.076 |
| 5.1. | amplicon 5 | exon 5 | 270 | T | C | syn | 0.027 | 0.036 | 0.064 |
| 5.2. | amplicon 5 | intron 5 | (+8) | G | A | | 0.049 | 0.019 | 0.026 |
| 7.1. | amplicon 7 | intron 6 | (−77) | A | T | | 0.005 | 0.008 | 0.000 |
| 7.2. | amplicon 7 | exon 7 | 580 | T | C | S194P_c | 0.000 | 0.000 | 0.011 |
| 8.1. | amplicon 8 | intron 7 | (−41) | T | C | | 0.005 | 0.000 | 0.000 |
| 8.2. | amplicon 8 | exon 8 | 779 | G | A | G260D | 0.000 | 0.008 | 0.000 |
| 9.1. | amplicon 9 | exon 9 | 851 | T | C | V284A_c | 0.005 | 0.000 | 0.000 |
| 10.1. | amplicon 10 | intron 9 | (−69) | C | T | | 0.051 | 0.016 | 0.011 |
| 10.2. | amplicon 10 | intron 9 | (−31) | C | T | | 0.046 | 0.015 | 0.022 |
| 10.3. | amplicon 10 | intron 9 | (−24) | G | A | | 0.000 | 0.000 | 0.011 |
| 10.4. | amplicon 10 | intron 9 | (−17) | G | A | | 0.046 | 0.015 | 0.022 |
| 10.5. | amplicon 10 | intron 9 | (−15) | A | G | | 0.704 | 0.709 | 0.739 |
| 10.6. | amplicon 10 | exon 10 | 957 | A | G | syn | 0.046 | 0.015 | 0.022 |
| 12.1. | amplicon 12 | intron 12 | (+73) | G | T | | 0.010 | 0.000 | 0.000 |
| 13.1. | amplicon 13 | exon 13 | 1331 | T | C | V444A_c | 0.595 | 0.632 | 0.600 |
| 13.2. | amplicon 13 | intron 13 | (+70) | C | T | | 0.590 | 0.623 | 0.600 |
| 14.1. | amplicon 14 | exon 14 | 1530 | C | A | syn | 0.000 | 0.007 | 0.000 |
| 14.2. | amplicon 14 | intron 14 | (+32) | T | C | | 0.604 | 0.629 | 0.589 |
| 17.1. | amplicon 17 | exon 17 | 2029 | A | G | M677V | 0.042 | 0.016 | 0.014 |
| 18.1. | amplicon 18 | exon 18 | 2093 | G | A | R698H_c | 0.005 | 0.008 | 0.000 |
| 18.2. | amplicon 18 | exon 18 | 2134 | T | C | syn | 0.005 | 0.031 | 0.012 |
| 19.1. | amplicon 19 | intron 18 | (−17) | C | A | | 0.432 | 0.418 | 0.565 |
| 20.1. | amplicon 20 | intron 19 | (−17) | T | C | | 0.694 | 0.655 | 0.702 |
| 21.1. | amplicon 21 | intron 21 | (+18) | A | C | | 0.006 | 0.000 | 0.000 |
| 24.1. | amplicon 24 | exon 24 | 3084 | G | A | syn | 0.459 | 0.563 | 0.398 |
| 27.1. | amplicon 27 | exon 27 | 3683 | C | T | A1228V_c | 0.000 | 0.008 | 0.000 |
| 28.1. | amplicon 28 | intron 27 | (−34) | G | A | | 0.448 | 0.569 | 0.378 |
| 28.2. | amplicon 28 | 3′UTR | (+82) | G | T | | 0.000 | 0.008 | 0.000 |

  1. NOTE. Variants are numbered sequentially by exon. For example, variant 2.1 is the first variant in amplicon 2; cDNA numbers are relative to the ATG site and based on the cDNA sequence from GenBank accession number NM_003742. The promoter sites are relative to noncoding exon 1. The amino acid (AA) position is indicated for those variants in the coding exonic sequence. Numbering of intronic variants is relative to the corresponding exon. Intronic variants are designated with (−) or (+) for mutations located upstream or downstream of an exon, respectively. Nucleotide changes in the intronic region are from the accession numbers AC008177.3 (promoter and exon 1–21) and AC069165.2 (exon 22–28).

  2. Abbreviations: c, evolutionarily conserved amino acid; pro, promoter site; n/a, not available; PBC, primary biliary cirrhosis; PSC, primary sclerosing cholangitis.

  3. n is the number of chromosomes in each group. Allele frequencies were calculated for the total population and each individual group: control group.

Figure 1.

Secondary structure of BSEP with nonsynonymous coding region SNPs. The transmembrane topology schematic was rendered using TOPO (S.J. Johns and R.C. Speth, transmembrane protein display software, http://www.sacs.ucsf.edu/TOPO/topo.html, unpubl.). Common variants are shown in yellow, variants specific for the control group are shown in blue, PBC-specific variants in green, and PSC-specific variants in red.

The total number of variant sites was similar in the three groups (37 in the control group, 37 in PBC patients, and 31 in PSC patients). Twenty-eight of the segregating sites were found in all groups. Six, 6, and 3 were specific for healthy individuals, PBC, and PSC patients, respectively. Of the coding region variants, 7 were detected in at least 2 of the studied populations (Table 2). One variant (T851C → V284A) was only found in the control group, 3 were specific to PBC patients (G779A → G260D, C1530A, and C3683T → A1228V), and 1 to PSC patients (T580C → S194P).

MDR3 Sequence Variability.

A total of 45 variant sites were found, all of which were in Hardy-Weinberg equilibrium (Table 3). Forty variable sites were single nucleotide substitutions, which had only two alternative nucleotides. Three variable sites were deletions of 1 bp in intron 11, intron 22, and intron 23. One was a deletion of 4 bp in intron 5 and 1 an insertion of 1 bp in intron 26. Twelve variants were in the promoter region. Outside the promoter, 19 variants were intronic, including 1 splicing variant, and 14 were found in the coding region. Fifteen of the variants were only found in a single chromosome and 4 were detected as doubletons. The 14 coding region variants included 4 synonymous and 10 nonsynonymous changes. Nine nonsynonymous variants were located in the intracellular loop of MDR3 and one in the transmembrane region 1 (Fig. 2). Five coding region changes were single nucleotide polymorphisms (synonymous: exon 4: C175T, exon 6: C504T, exon 8: A711T; nonsynonymous: exon 6: A523G → T175A and exon 16: A1954G → R652G). Nonsynonymous changes observed as singletons in our sample set coded for the following amino acid changes: L73V, D243A, I367V, K435T, E1099G, and K1251Q. Alignment of all mammalian MDR3 sequences indicated that, with the exception of R652G, all nonsynonymous coding variants were in codons for an evolutionarily conserved amino acid.
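Hardy-Weinberg equilibrium is reported for all variant sites in both genes, but the test used is not stated. One common choice, shown here purely as an assumption, is a chi-square goodness-of-fit test on the three genotype counts of a biallelic site; the function name is hypothetical.

```python
def hwe_chi_square(n_ref_hom, n_het, n_var_hom):
    """Chi-square statistic (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Arguments are the observed numbers of individuals that are homozygous
    for the reference allele, heterozygous, and homozygous for the variant.
    """
    n = n_ref_hom + n_het + n_var_hom
    if n == 0:
        raise ValueError("no genotypes supplied")

    p = (2 * n_ref_hom + n_het) / (2 * n)   # reference allele frequency
    q = 1 - p                               # variant allele frequency

    observed = (n_ref_hom, n_het, n_var_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    # Monomorphic sites have zero expected counts and contribute nothing.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# A statistic above 3.841 rejects equilibrium at the 5% level (1 df).
CRITICAL_VALUE_5_PERCENT = 3.841
```

For rare variants the expected homozygote count is far below 5, so an exact test is usually preferred over the chi-square approximation in that setting.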

Table 3. MDR3 Genetic Variability

| Variant Number | Amplicon | DNA Position | cDNA Position | Nucleotide Reference | Nucleotide Variant | AA Change | Control Group (n = 112) | PBC (n = 152) | PSC (n = 92) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pro1 | amplicon −3 | intron −4 | (−394) | T | G | | 0.048 | 0.052 | 0.058 |
| Pro2 | amplicon −3 | exon −3 | (−410) | T | C | non-coding | 0.049 | 0.051 | 0.071 |
| Pro3 | amplicon −3 | exon −3 | (−387) | C | A | non-coding | 0.008 | 0.000 | 0.000 |
| Pro4 | amplicon −2 | exon −2 | (−229) | C | T | non-coding | 0.114 | 0.103 | 0.152 |
| Pro5 | amplicon −1 | intron −2 | (−221) | C | T | | 0.111 | 0.109 | 0.152 |
| Pro6 | amplicon −1 | intron −2 | (−204) | A | G | | 0.135 | 0.101 | 0.152 |
| Pro7 | amplicon −1 | exon −1 | (−44) | A | C | non-coding | 0.016 | 0.000 | 0.000 |
| Pro8 | amplicon 1 | intron −1 | (−301) | C | G | | 0.175 | 0.147 | 0.217 |
| Pro9 | amplicon 1 | intron −1 | (−220) | C | T | | 0.048 | 0.051 | 0.065 |
| Pro10 | amplicon 1 | intron −1 | (−201) | C | G | | 0.135 | 0.110 | 0.163 |
| Pro11 | amplicon 1 | intron −1 | (−184) | T | C | | 0.262 | 0.353 | 0.233 |
| Pro12 | amplicon 1 | intron −1 | (−86) | C | A | | 0.008 | 0.000 | 0.000 |
| 4.1. | amplicon 4 | exon 4 | 175 | C | T | syn | 0.127 | 0.123 | 0.156 |
| 4.2. | amplicon 4 | exon 4 | 217 | C | G | L73V_c | 0.000 | 0.007 | 0.000 |
| 5.1. | amplicon 5 | intron 4 | (−92) | C | T | | 0.034 | 0.007 | 0.033 |
| 5.2. | amplicon 5 | intron 5 | (+113) | A | G | | 0.164 | 0.167 | 0.217 |
| 6.1. | amplicon 6 | intron 5 | (−66 to −62) | AGAAA | delAGAAA | | 0.079 | 0.021 | 0.014 |
| 6.2. | amplicon 6 | exon 6 | 504 | C | T | syn | 0.571 | 0.493 | 0.524 |
| 6.3. | amplicon 6 | exon 6 | 523 | A | G | T175A_c | 0.032 | 0.014 | 0.012 |
| 7.1. | amplicon 7 | intron 6 | (−56) | T | C | | 0.000 | 0.007 | 0.000 |
| 7.2. | amplicon 7 | intron 6 | (−44) | C | A | | 0.000 | 0.000 | 0.012 |
| 8.1. | amplicon 8 | intron 7 | (−61) | C | G | | 0.008 | 0.000 | 0.000 |
| 8.2. | amplicon 8 | exon 8 | 711 | A | T | syn | 0.151 | 0.169 | 0.239 |
| 8.3. | amplicon 8 | exon 8 | 728 | A | C | D243A_c | 0.000 | 0.007 | 0.000 |
| 9.1. | amplicon 9 | exon 9 | 927 | T | C | syn | 0.008 | 0.000 | 0.000 |
| 10.1. | amplicon 10 | exon 10 | 1099 | A | G | I367V_c | 0.008 | 0.000 | 0.000 |
| 12.1. | amplicon 12 | intron 11 | (−88) | T | delT | | 0.070 | 0.081 | 0.116 |
| 12.2. | amplicon 12 | exon 12 | 1304 | A | C | K435T_c | 0.000 | 0.007 | 0.000 |
| 12.3. | amplicon 12 | intron 12 | (+130) | G | T | | 0.065 | 0.083 | 0.100 |
| 13.1. | amplicon 13 | intron 12 | (−40) | A | G | | 0.952 | 0.944 | 0.930 |
| 14.1. | amplicon 14 | exon 14 | 1633 | C | T | R545C_c | 0.000 | 0.000 | 0.011 |
| 15.1. | amplicon 15 | intron 14 | (−39) | A | G | | 0.008 | 0.007 | 0.011 |
| 15.2. | amplicon 15 | exon 15 | 1769 | G | A | R590Q_c | 0.008 | 0.000 | 0.000 |
| 15.3. | amplicon 15 | intron 15 | (+6) | T | C | splicing | 0.000 | 0.007 | 0.000 |
| 16.1. | amplicon 16 | exon 16 | 1954 | A | G | R652G | 0.073 | 0.086 | 0.128 |
| 16.2. | amplicon 16 | intron 16 | (+55) | A | G | | 0.053 | 0.071 | 0.128 |
| 17.1. | amplicon 17 | intron 17 | (+16) | C | T | | 0.951 | 0.913 | 0.955 |
| 20.1. | amplicon 20 | intron 20 | (+40) | A | G | | 0.063 | 0.092 | 0.102 |
| 23.1. | amplicon 23 | intron 22 | (−38) | T | delT | | 0.008 | 0.000 | 0.000 |
| 23.2. | amplicon 23 | intron 23 | (+92) | G | delG | | 0.000 | 0.000 | 0.032 |
| 26.1. | amplicon 26 | exon 26 | 3296 | A | G | E1099G_c | 0.018 | 0.000 | 0.000 |
| 26.2. | amplicon 26 | intron 26 | (+11) | | insA | | 0.000 | 0.000 | 0.024 |
| 27.1. | amplicon 27 | intron 26 | (−16) | T | C | | 0.918 | 0.911 | 0.939 |
| 28.1. | amplicon 28 | intron 27 | (−72) | T | C | | 0.073 | 0.090 | 0.065 |
| 28.2. | amplicon 28 | exon 28 | 3751 | A | C | K1251Q_c | 0.000 | 0.007 | 0.000 |

  1. NOTE. Variants are numbered sequentially by exon. For example, variant 4.1 is the first variant in amplicon 4; cDNA numbers are relative to the ATG site and based on the cDNA sequence from GenBank accession number NM_000443. The promoter sites are relative to noncoding exon 1. The first three non-coding exons are numbered −3, −2 and −1. The amino acid (AA) position is indicated for those variants in the coding exonic sequence. Numbering of intronic variants is relative to the corresponding exon. Intronic variants are designated with (−) or (+) for mutations located upstream or downstream of an exon, respectively. Nucleotide changes in the intronic region are from the accession numbers AC005068.2 (promoter and non-coding exons −3 to 1), AC006154.1 (exon 4–12) and AC0005045.2 (exon 13–28).

  2. Abbreviations: c, evolutionarily conserved amino acid; pro, promoter site; PBC, primary biliary cirrhosis; PSC, primary sclerosing cholangitis.

  3. n is the number of chromosomes in each group. Allele frequencies were calculated for the total population and each individual group: control group.

Figure 2.

Secondary structure of MDR3 with nonsynonymous coding region SNPs. The transmembrane topology schematic was rendered using TOPO (S.J. Johns and R.C. Speth, transmembrane protein display software, http://www.sacs.ucsf.edu/TOPO/topo.html, unpubl.). Common variants are shown in yellow, variants specific for the control group are shown in blue, PBC-specific variants in green, and PSC-specific variants in red.

The total number of variant sites was similar in the three groups (35 in the control group, 32 in PBC patients, and 30 in PSC patients). Twenty-six of the segregating sites were found in all groups; 9, 6, and 4 were specific for healthy individuals, PBC, and PSC patients, respectively. Of the coding region variants, 5 were detected in all of the studied populations (Table 3). Four variant sites (T927C, A1099G → I367V, G1769A → R590Q, and A3296G → E1099G) were only found in the control group, 4 were specific to PBC patients (C217G → L73V, A728C → D243A, A1304C → K435T, and A3751C → K1251Q), while 1 was PSC-specific (C1633T → R545C). The splicing variant in intron 15 was also specific for PBC patients.

Haplotype Analysis.

Thirty-three and 30 variants were used for PHASE analysis of BSEP and MDR3, respectively. Haplotypes were arranged according to sequence similarity, as illustrated in Figures 3 and 6. Haplotypes could not be assigned for 1 individual for BSEP (PSC group), while assignment was possible in all individuals in the MDR3 sample set.

Figure 3.

Alignment and population frequencies of BSEP haplotypes. A total of 54 distinct haplotypes were assigned for 416 chromosomes using the program PHASE. Haplotypes were aligned according to sequence similarity. P-values indicate differences in haplotype frequency between PBC and PSC patients compared to healthy controls calculated by Fisher's exact test. Black boxes denote the reference sequence; the filled squares represent promoter (blue), intronic (yellow), synonymous (green), and nonsynonymous (red) segregating sites. Population frequencies are noted. R: reference; D: deletion. Haplotypes were called for 93 healthy Caucasian controls, 70 patients with PBC, and 45 patients with PSC.

Figure 6.

Alignment and population frequencies of MDR3 haplotypes. A total of 37 distinct haplotypes were assigned for 356 chromosomes using the program PHASE. Haplotypes were aligned according to sequence similarity. P-values indicate differences in haplotype frequency between PBC and PSC patients compared to healthy controls calculated by Fisher's exact test. Black boxes denote the reference sequence; the filled squares represent promoter (blue), intronic (yellow), synonymous (green), and nonsynonymous (red) segregating sites. Population frequencies are noted. R: reference; D: deletion; I: insertion. Haplotypes were called for 56 healthy Caucasian controls, 76 patients with PBC, and 46 patients with PSC.

The BSEP and MDR3 reference sequences correspond to NM_003742 and NM_000443 for BSEP and MDR3 cDNA and to the BAC clone sequences mentioned in Patients and Methods for intronic and promoter sequences. As this composite reference sequence is assembled from different sources, it was not encountered in our study population for BSEP and MDR3.

BSEP Haplotype Structure.

The 33 variant sites segregated as 54 distinct haplotypes. Only 30 of these haplotypes were present in 3 or more chromosomes. The percentage of chromosomes in the entire population that could be assigned 1 of these common haplotypes was 92% and was similar for all studied populations (90.9%, 93.6%, and 92.4% for the control group, PBC, and PSC patients, respectively). Seven major haplotypes, BSEP_1 to BSEP_7, were found in our population samples, covering >57% of the chromosomes (Figs. 3, 4). These 7 major haplotypes were equally distributed among the studied populations (control group: 55.4%, PBC: 62.9%, PSC: 54.4%).

Figure 4.

Distribution of major BSEP haplotypes. The distribution of the 7 most common haplotypes is shown. The populations are as described for Figure 3: healthy Caucasian controls: black bars; patients with PBC: hatched diagonal lines; patients with PSC: white dots.

There was some variability in the number and frequency of haplotypes observed in a population. A total of 44 haplotypes was observed in healthy controls, compared to 32 in PBC and 32 in PSC patients. There were 14 healthy control-specific haplotypes, covering 12.4% of this sample set. Five and 2 haplotypes were specific for the PBC and PSC collective, respectively, yielding 4.3% and 3.3% of alleles analyzed in the respective groups. Most of these unique haplotypes differed only in promoter or intronic sequence from common alleles. Two haplotypes (BSEP_8 and BSEP_9) contained a combination of coding region SNPs (cSNPs) only found in the control group and the PBC group, respectively (Fig. 3). One haplotype (BSEP_10) was found to be significantly more frequent in PBC patients compared to healthy Caucasian controls (Fig. 3). However, this haplotype was observed only in 5 individuals and thus belongs to the rarely observed ones. Correspondingly, ROC analysis showed that inclusion of this haplotype would increase the correct assignment of PBC from 50% to only 54% (95% confidence interval (CI): 44%; 63%). Furthermore, an association between risk markers for disease and BSEP haplotypes was tested for 5 out of 54 BSEP haplotypes by logistic regression analysis. High values for logarithmically transformed serum bilirubin and Mayo Risk Scores were found to be associated with haplotype BSEP_4 in the PBC collective. Differences were significant for PBC patients carrying the BSEP_4 haplotype versus patients who did not, with odds ratios (OR) of 1.961 (95% CI (1.054; 3.650)) per unit of logarithmically transformed bilirubin and 1.398 (95% CI (1.006; 1.919)) per unit of Mayo Risk Score (Fig. 5).

Figure 5.

Box plot of logarithmically transformed bilirubin and Mayo Risk Score values in PBC patients with and without BSEP_4 haplotype. Differences were significant for PBC patients carrying the BSEP_4 haplotype versus patients who did not, with ORs of 1.961 (95% CI (1.054; 3.650)) per unit of logarithmically transformed bilirubin and 1.398 (95% CI (1.006; 1.919)) per unit of Mayo Risk Score.

MDR3 Haplotype Structure.

The 30 variant sites segregated as 37 distinct haplotypes. Only 16 of these haplotypes were present in 3 or more chromosomes. The percentage of chromosomes in the entire population that could be assigned one of these common haplotypes was 91% and was slightly lower in PSC-patients (91.1%, 97.4%, and 84.8% for the control group, PBC, and PSC patients, respectively). Six common haplotypes, MDR3_1 to MDR3_6, were found in the entire sample population, covering 75.9% of the chromosomes (Figs. 6, 7). These major haplotypes were found at similar frequencies in all studied populations (control group: 76.8%, PBC: 80.3%, PSC: 70.7%). There was some variability in number and frequency of haplotypes observed in the 3 populations. A total of 20 haplotypes were observed in the control group, compared to 20 and 24 in PBC and PSC patients, respectively. There were 6 haplotypes specific for the control group and 3 PBC-specific haplotypes, accounting for 8% and 2% of the healthy control and PBC sample set, respectively. Eleven specific haplotypes were identified in the PSC-cohort, accounting for 15.2% of the total sample set. While the PBC-specific alleles differed only in promoter and intronic sites from common haplotypes, 3 of the control group-specific haplotypes (MDR3_7 to MDR3_9) and three PSC-specific haplotypes (MDR3_10 to MDR3_12) carried a unique combination of cSNPs (Fig. 6). One haplotype (MDR3_1) was observed more frequently in PSC patients as compared to healthy controls (Fig. 6). This haplotype is quite common in all groups (control group: 33.9%, PBC: 35.5%, PSC: 44.6%). ROC analysis showed that this haplotype would increase the correct assignment of disease from 50% to 53% (95% CI: 54%; 73%). Furthermore, an association between risk markers for disease and MDR3 haplotypes was tested for 4 out of 37 MDR3 haplotypes by logistic regression analysis. Analysis did not show any significant association between PBC and PSC subphenotypes and MDR3 haplotypes.

Figure 7.

Distribution of major MDR3 haplotypes. The distribution of the six most common haplotypes is shown. The populations are as described for Figure 3: healthy Caucasian controls: black bars; patients with PBC: hatched diagonal lines; patients with PSC: white dots.

Intragenic Recombination and Linkage Disequilibrium.

Linkage disequilibrium (LD) was evaluated for the control group, the PBC, and the PSC sample populations. Significant disequilibrium between site pairs calculated by Fisher's exact test is shown in Figures 8 and 9. Similar patterns of linkage disequilibrium were found in all 3 populations, suggesting that in PBC and PSC patients there is no deviation from the LD pattern observed in healthy individuals. For BSEP, the majority of significant linkage disequilibrium occurred among three SNPs found in the common haplotypes BSEP_1, BSEP_4, BSEP_5, and BSEP_6 (13.1., 13.2., and 14.2.), and between three sites found in BSEP_3 and BSEP_4 (20.1., 24.1., and 28.1.). Furthermore, significant LD was observed in all three groups between the variable sites 5.2., 10.1., 10.2., 10.4., 10.6., 14.2., and 17.1.

Figure 8.

BSEP pairwise linkage disequilibrium. Each colored square marks a site pair in significant linkage disequilibrium, as assessed by Fisher's exact test. Yellow represents 0.05 > P > 0.01, blue represents 0.01 > P > 0.001, and red represents P < 0.001. Site pairs for which the test lacked the power to detect a significant association are indicated by a dot in the center of the square.

Figure 9.

MDR3 pairwise linkage disequilibrium. Each colored square marks a site pair in significant linkage disequilibrium, as assessed by Fisher's exact test. Yellow represents 0.05 > P > 0.01, blue represents 0.01 > P > 0.001, and red represents P < 0.001. Site pairs for which the test lacked the power to detect a significant association are indicated by a dot in the center of the square.

For MDR3, the majority of linkage disequilibrium was found between 5 SNPs that are present in MDR3_6 (12.3., 16.1., 16.2., 20.1., and 28.1.) and 3 SNPs found in MDR3_3 (Pro 5, Pro 6, and 4.1.).

Discussion

The present study provides an analysis of sequence variations and haplotype structures of the BSEP and MDR3 genes in healthy Caucasian controls and in patients with primary biliary cirrhosis and primary sclerosing cholangitis.

So far, hereditary dysfunction of BSEP and MDR3 has been associated with two different forms of progressive familial intrahepatic cholestasis, leading to progressive cholestatic injury and liver failure in early childhood.17, 18 More than 20 different disease-causing point mutations have been detected in both genes,29 none of which was found in our sample set. In our study, no differences in variant segregation could be detected between healthy Caucasian individuals and PBC and PSC patients. About 60% of all variants were present in all three collectives, and the total number of variable sites per collective as well as the number of population-specific variants was comparable in all 3 groups for BSEP and MDR3. On the other hand, there was some variation in the allele frequency and the segregation pattern of rare variants between the 3 populations. While a functional impact of some of these variants cannot be ruled out, it is unlikely that they serve as a common denominator of disease pathogenesis in PBC and PSC.

The present study provides an analysis of genetic variation and haplotype structure of BSEP and MDR3 in a large healthy Caucasian population, covering the promoter, the coding region, and significant portions of the flanking intronic regions of both genes. Except for a first description of genetic variation in BSEP and MDR3 in a Japanese population, mutations in both genes have mostly been described in the context of disease pathogenesis in patients with hereditary or acquired dysfunction of these transporters.30 Hence, our analysis is important for several reasons. First, interpretation of possibly disease-causing mutations is more meaningful if they can be compared to the pattern of genetic variation present in the normal population, because such disease-causing mutations should be absent in the control group. Second, the description of BSEP and MDR3 sequence diversity in a Caucasian collective offers a basis for future studies comparing interethnic differences in variant segregation and haplotype structure of these two genes. Finally, a phenotypic analysis of polymorphisms and haplotypes will allow the detection of variants and haplotypes with decreased function in the normal population. This might lead to the determination of BSEP and MDR3 genotypes associated with an increased susceptibility to develop cholestasis under certain challenges, such as drug-induced cholestasis or intrahepatic cholestasis of pregnancy.19, 31

In our study, overall haplotype structure and linkage disequilibrium in PBC and PSC patients did not differ from that observed in healthy controls. Furthermore, the number of population-specific haplotypes accounted only for a minority of alleles and did not differ between the three groups. Haplotype analysis was performed for two reasons. First, it has become evident that the analysis of haplotypes is more powerful than that of single genetic variants in predicting clinical response or disease susceptibility.32 This is especially relevant in cases where no single disease-causing mutation is identified and more subtle changes caused by different mutations will determine the phenotype. Second, because of the phenomenon of linkage disequilibrium, the allele at an observed site may track that of an unobserved site and can hence serve as a surrogate marker for a causative mutation that might have been missed in our PBC and PSC collective.32 In both cases, we would expect the haplotype pattern in the PBC or PSC collectives to deviate from that of healthy controls.

Only two haplotypes were observed more frequently in our patient cohort than in healthy controls: MDR3_1 was more frequently encountered in the PSC collective and BSEP_10 more often in PBC patients. Although the power of these haplotypes to predict the corresponding diseases is relatively small, an association between these two haplotypes and disease pathogenesis cannot be ruled out, because for multifactorial and complex traits like PBC and PSC the effect of a specific haplotype is expected to be small. In addition, analysis of prognostic parameters for PBC and PSC for an association with certain haplotypes revealed an association between higher risk scores and BSEP_4 in the PBC group, which might point toward a pathogenic role of this BSEP haplotype in PBC subphenotypes. Certain haplotypes could therefore play a role in the pathogenesis of PBC and PSC or might be associated with subphenotypes of these diseases. It also cannot be ruled out that some of the rare and population-specific haplotypes might have an impact on BSEP or MDR3 function, although no strong pattern of disease-associated haplotypes could be observed in the PBC and the PSC collectives. Finally, it has to be emphasized that only BSEP and MDR3 were tested for risk association, while the study did not address higher-order risks, i.e., gene × gene effects between loci or gene × environment interaction.

In summary, our data provide a basis for the study of phenotypic differences of BSEP and MDR3 genetic variation in the normal population. Furthermore, the results do not support a strong role of BSEP and MDR3 genetic variation in the pathogenesis of PBC and PSC. Further studies are currently under way to investigate a possible association of BSEP and MDR3 genetic variation with an increased risk to develop acquired cholestatic liver disease, such as drug-induced cholestasis and cholestasis of pregnancy, as well as to delineate their phenotypic consequences for BSEP and MDR3 transport function.
