Usefulness of serum IgG4 in the diagnosis and follow up of autoimmune pancreatitis: A systematic literature review and meta-analysis

Authors


Dr Raffaele Pezzilli, Department of Digestive Diseases and Internal Medicine, Sant'Orsola-Malpighi Hospital, Via Massarenti, 9, 40138 Bologna, Italy. Email: raffaele.pezzilli@aosp.bo.it

Abstract

High circulating serum immunoglobulin G4 (IgG4) levels have been proposed as a marker of autoimmune pancreatitis (AIP). The aim of the present study was to review the data existing in the English literature on the usefulness of the IgG4 serum levels in the diagnosis and follow up of patients with AIP. A total of 159 patients with AIP and 1099 controls were described in seven selected papers reporting the usefulness of serum IgG4 in diagnosing AIP. In total, 304 controls had pancreatic cancer, 96 had autoimmune diseases, and the remaining 699 had other conditions. The summary receiver–operating characteristic curve analysis was carried out by means of Meta-DiSc open-access software. Serum IgG4 showed good accuracy in distinguishing between AIP and the overall controls, pancreatic cancer and other autoimmune diseases (area under the curve [± SE]: 0.920 ± 0.073, 0.914 ± 0.191, and 0.949 ± 0.024, respectively). The studies analyzed showed significantly heterogeneous specificity values in each of the three analyses performed. The analysis of the four studies comparing AIP and pancreatic cancers also showed significantly heterogeneous values of sensitivities and odds ratios. Regarding the usefulness of IgG4 as a marker of efficacy of steroid treatment, a decrease in the serum concentrations of IgG4 was found in the four available studies. The serum IgG4 subclass is a good marker of AIP, and its determination should be included in the diagnostic workup of this disease. However, the heterogeneity of the studies published until now means that more studies are necessary in order to better evaluate the true accuracy of IgG4 in discriminating AIP versus other autoimmune diseases.

Introduction

Pancreatitis due to autoimmunity is no longer reported only in Japan; in the last decade, the frequency of new diagnoses has increased worldwide.1,2 An autoimmune pathogenesis for this disease has been proposed because the condition is occasionally associated with autoantibodies or other autoimmune diseases.3–9 Autoimmune pancreatitis (AIP) is characterized by diffuse or focal pancreatic swelling with narrowing of the pancreatic duct and/or common bile duct, and the histological hallmark is lymphoplasmacytic infiltration, which is particularly concentrated around the pancreatic ducts.10,11 Treatment with corticosteroids is often effective, but surgery is often carried out as a consequence of a misdiagnosis of pancreatic cancer.12 Thus, differentiation between AIP and pancreatic cancer remains a diagnostic challenge that needs to be resolved. It has been proposed that the elevation of serum immunoglobulin G4 (IgG4) may help in diagnosing AIP.13 The aim of this paper was to review the data existing in the English literature on the usefulness of IgG4 serum levels in the diagnosis and in the follow up of patients with AIP. We also evaluated the clinical utility of circulating IgG4 in the differential diagnosis between AIP and both pancreatic cancer and non-pancreatic autoimmune diseases.

Literature search methods

A search was carried out on 14 September 2007 using four different databases (MEDLINE, Web of Science [WoS], Scirus, and Scopus) in order to select the data existing in the literature on IgG4 serum concentrations in pancreatic diseases. The ‘controlled terms’ used by each database were searched for the entire period of years covered by that database: the term ‘IGG4’ and the Medical Subject Headings term ‘pancreatic diseases’ were used for searching MEDLINE; topic terms were used for searching the WoS, and key words were used for searching both Scirus and Scopus. Only papers published in full text were selected. A total of 693 citations were found, including 489 citations in MEDLINE, 88 in WoS, 56 in Scirus, and 60 in Scopus. In total, 550 papers were identified after deduplication: 455 papers were present in only one database, 50 papers in two databases, 42 papers in three databases, and three papers in all four databases. Out of the 455 papers present in one database only, 399 were present in MEDLINE, 15 in WoS, 34 in Scirus, and seven in Scopus. All of the 550 articles found were evaluated in order to select those reporting serum IgG4 concentrations in patients with AIP compared to one or more control groups. In addition, those papers reporting IgG4 serum levels in the follow up of patients with AIP were also identified. Figure 1 shows the flowchart for the selection of articles: 190 of the 550 papers were excluded because they were not original articles (83 were case reports; 80 were review articles; 19 were letters to the editor, not containing original data; six were editorials; and two were guideline articles). Of the remaining 360 papers, 307 were excluded because they contained data regarding diseases other than AIP. Of the 53 remaining papers, 26 were excluded because they did not report data on IgG4 in AIP. In addition, 14 papers were excluded because data were only provided on IgG4 in patients with AIP without a control group or follow-up study. Finally, three papers were excluded for lack of individual data. The unselected articles are listed in Appendix I. Thus, 10 papers were considered: six reporting the usefulness of serum IgG4 in diagnosing AIP,14–19 one also reporting the usefulness of IgG4 in the follow up of AIP,20 and three reporting the usefulness of IgG4 in the follow up of AIP only.21–23
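The counts in the selection process can be checked with simple arithmetic. The following Python sketch uses only the numbers stated in the text; the variable names are illustrative and are not part of the original study.

```python
# Citation counts per database, as stated in the search description.
hits = {"MEDLINE": 489, "WoS": 88, "Scirus": 56, "Scopus": 60}
total_citations = sum(hits.values())

# Number of unique papers, grouped by how many databases index them.
papers_by_db_count = {1: 455, 2: 50, 3: 42, 4: 3}
unique_papers = sum(papers_by_db_count.values())

# A paper indexed in k databases contributes k citations.
citations_check = sum(k * n for k, n in papers_by_db_count.items())

# Successive exclusions described in the text and in Figure 1.
exclusions = [
    ("not original articles", 83 + 80 + 19 + 6 + 2),
    ("diseases other than AIP", 307),
    ("no IgG4 data in AIP", 26),
    ("no control group or follow-up study", 14),
    ("lack of individual data", 3),
]

remaining = unique_papers
for reason, n in exclusions:
    remaining -= n
    print(f"{reason:36s} -{n:3d} -> {remaining}")
```

The final value of `remaining` corresponds to the papers retained for the analysis.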

Figure 1.

Flowchart showing the selection of the papers evaluating immunoglobulin G4 (IgG4) in patients with autoimmune pancreatitis (AIP). Search carried out on 14 September 2007.

Methods

IgG4 in diagnosing AIP

Patients

A total of 159 patients with AIP were described in the seven selected papers reporting the usefulness of serum IgG4 in diagnosing AIP (Table 1). The diagnostic criteria used were the Japanese criteria in four studies,16,18–20 the Spanish score in one,17 the Korean criteria in one,15 and the Mayo Clinic criteria in the remaining paper.14 Histological findings were reported in all of the studies except one,15 and were compatible with AIP in 98 of the 129 patients (76%) enrolled in these six studies. The techniques used for IgG4 determination were the nephelometric assay in four studies14,16–18 and radial immunodiffusion in three.15,19,20 The upper reference limit of IgG4 was 135 mg/dL in five studies,15,16,18–20 130 mg/dL in one study,17 and 140 mg/dL in the remaining study.14

Table 1.  Details of the seven studies included in our analysis, which report the usefulness of serum immunoglobulin G4 in diagnosing autoimmune pancreatitis (AIP)

Study | Assay utilized | URL (mg/dL) | Criteria for AIP | Patients (n) | Histology positive for AIP | Controls: pancreatic cancers | Controls: autoimmune diseases | Controls: other diseases | Total no. controls
Ghazale et al.14 | Nephelometry | 140 | Mayo Clinic criteria | 45 | 25 (55.6%) | 135 | – | 330 | 465
Choi et al.15 | Radial immunodiffusion | 135 | Korean criteria | 30 | NR | 76 | – | 67 | 143
Hirano et al.16 | Nephelometry | 135 | Japanese criteria | 35 | 24 (68.6%) | 23 | 11 | 48 | 82
Hamano et al.20 | Radial immunodiffusion | 135 | Japanese criteria | 20 | 20 (100%) | 70 | 36 | 64 | 170
Aparisi et al.17 | Nephelometry | 130 | Spanish score | 6 | 6 (100%) | – | 33 | 144 | 177
Uehara et al.18 | Nephelometry | 135 | Japanese criteria | 6 | 6 (100%) | – | 6 | – | 6
Kamisawa et al.19 | Radial immunodiffusion | 135 | Japanese criteria | 17 | 17 (100%) | – | 10 | 46 | 56
Total | | | | 159 | 98/129 (76.0%) | 304 | 96 | 699 | 1099

  • Total histology: only the 129 cases of the six studies with available data were taken into account. NR, not reported; URL, upper reference limit.

Controls

As shown in Table 1, a total of 1099 patients were studied as controls in these seven papers: 304 patients had pancreatic cancer; 96 had autoimmune diseases (20 with primary biliary cirrhosis, 25 with primary sclerosing cholangitis, and 51 with Sjögren's syndrome); and the remaining 699 had other conditions (136 with no evident diseases; 21 with sialolithiasis; five with sclerosing sialadenitis; 58 with acute pancreatitis; 325 with chronic pancreatitis, mainly due to alcoholic etiology; 67 with other benign, unspecified diseases; 64 with benign pancreatic tumors; three with endocrine tumors; two with neoplasia of the Vater papilla; 15 with bile duct cancer; and three with gallbladder cancer).

Data analysis

Three different summary receiver–operating characteristic (SROC) curve analyses were performed considering different groups of controls. In the first analysis (analysis A), the AIP patients were compared with all the 1099 available controls pooled together (seven studies). In addition, two subanalyses were performed by considering specific controls: one analysis took into account the four studies that compared the AIP patients with the 304 patients with pancreatic cancers (analysis B), and the other analysis took into account the five studies that compared the AIP patients with the 96 patients presenting with autoimmune diseases without pancreatitis (analysis C).

IgG4 in monitoring AIP

Regarding the usefulness of IgG4 as a marker of efficacy of steroid treatment in the follow up of AIP patients, four studies were taken into account20–23 for a total of 34 patients. One study20 reported the data as median and fifth and 95th percentiles, another one reported the data as median and 25th and 75th percentiles,23 and the remaining two papers21,22 reported the individual IgG4 data allowing us to compute the percentile values.

SROC

The SROC analysis was carried out using Meta-DiSc (version 1.4; Unit of Biostatistics, University Hospital Ramón y Cajal, Madrid, Spain; http://www.hrc.es/investigacion/metadisc_en.htm) open-access software.24 The SROC asymmetrical curves were fitted by using the Moses–Shapiro–Littenberg method.25,26 The equally-weighted model estimation was adopted according to the recommendations of Irwig et al.27 and Moses et al.26 Adding 0.5 to all of the cells of all of the studies was chosen as the option for handling the studies with empty cells.25,26,28 Diagnostic odds ratios (OR) were pooled by the DerSimonian–Laird method (random effects model)29 in order to incorporate variations among the studies, as including random effects is reported to be the most realistic and appropriate model to adopt in practice.30,31 The areas under the SROC curve (AUC), the ‘a’ values (i.e. the value of the natural logarithm of the odds ratio), and the ‘b’ values (i.e. the dependence of the test accuracy on threshold) were also estimated, together with their standard errors (SE).25 SPSS for Windows (version 13.0; SPSS, Chicago, IL, USA) was used in order to compare the AUC estimates and to compute the percentile values of the two studies reporting individual IgG4 data in monitoring AIP follow up.
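The Moses–Shapiro–Littenberg fit can be illustrated with a short Python sketch. It is not the Meta-DiSc implementation: it assumes the standard formulation in which D = logit(TPR) − logit(FPR) is regressed on S = logit(TPR) + logit(FPR) by ordinary (equally weighted) least squares, after adding 0.5 to every cell. The 2×2 counts are those reported for analysis A in Table 2; the function and variable names are illustrative.

```python
import math

# (TP, FP, FN, TN) for the seven studies of analysis A, as listed in Table 2.
ANALYSIS_A = [
    (34, 32, 11, 433),  # Ghazale et al.
    (22, 9, 8, 134),    # Choi et al.
    (33, 4, 2, 78),     # Hirano et al.
    (18, 0, 2, 170),    # Hamano et al.
    (4, 8, 2, 169),     # Aparisi et al.
    (6, 0, 0, 6),       # Uehara et al.
    (14, 6, 3, 50),     # Kamisawa et al.
]


def logit(p):
    return math.log(p / (1.0 - p))


def moses_fit(studies, correction=0.5):
    """Unweighted least-squares fit of D = a + b * S (assumed formulation)."""
    d_vals, s_vals = [], []
    for counts in studies:
        tp, fp, fn, tn = (x + correction for x in counts)
        tpr = tp / (tp + fn)
        fpr = fp / (fp + tn)
        d_vals.append(logit(tpr) - logit(fpr))  # log diagnostic odds ratio
        s_vals.append(logit(tpr) + logit(fpr))  # proxy for the threshold
    n = len(studies)
    s_mean = sum(s_vals) / n
    d_mean = sum(d_vals) / n
    sxx = sum((s - s_mean) ** 2 for s in s_vals)
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_vals, d_vals))
    slope = sxy / sxx
    intercept = d_mean - slope * s_mean
    return intercept, slope


a, b = moses_fit(ANALYSIS_A)
print(f"a = {a:.2f}, b = {b:.2f}")
```

Under these assumptions the intercept and slope should come out close to the ‘a’ and ‘b’ estimates reported for analysis A in Table 3 (3.92 and −0.52).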

Results

IgG4 in diagnosing AIP

Table 2 shows the data from the literature subdivided according to the three analyses carried out: the overall analysis of the seven studies (analysis A), the analysis of the four studies with data available for the comparison of AIP with pancreatic cancer (analysis B), and the analysis of the five studies with data available for the comparison of AIP with other non-pancreatic autoimmune diseases (analysis C). The forest plots of these data are shown in Figure 2. The sensitivities ranged from 66.7%17 to 100%.18 However, only a limited range of specificity values actually occurred, and consequently, the plotted SROC curve extended beyond the empirical range of the data. In fact, all of the studies, with the exception of one, reported good specificities (values equal to or greater than 89.3%). A low specificity value of 63.6% was reported by Hirano et al.16 in distinguishing AIP from other non-pancreatic autoimmune diseases (analysis C). It should be noted that all the other specificity values of this analysis were equal to 100%. Finally, wide ranges of diagnostic OR were detected (22.3–2523). A comparison of the different studies showed that significantly heterogeneous specificity values were found in each of the three analyses. In addition, the comparison of AIP patients with pancreatic cancer patients (analysis B) showed significantly heterogeneous values of both sensitivities and diagnostic OR.
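The per-study figures can be recomputed from the 2×2 counts given in Table 2. The following Python sketch is illustrative only. It assumes that sensitivity and specificity are taken from the raw counts, and that the diagnostic OR and its 95% confidence interval are obtained from the counts with 0.5 added to every cell, using the normal approximation on the log scale. The function name is hypothetical, and the exact confidence intervals of the sensitivities and specificities are not reproduced here.

```python
import math


def diagnostic_summary(tp, fp, fn, tn, correction=0.5, z=1.96):
    """Sensitivity, specificity (%), and diagnostic OR with a 95% CI."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    # Continuity-corrected cells, used for the odds ratio only (assumption).
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    log_or = math.log((a * d) / (b * c))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "dor": math.exp(log_or),
        "dor_low": math.exp(log_or - z * se_log_or),
        "dor_high": math.exp(log_or + z * se_log_or),
    }


# Counts for Ghazale et al. and Hamano et al. in analysis A (Table 2).
ghazale = diagnostic_summary(34, 32, 11, 433)
hamano = diagnostic_summary(18, 0, 2, 170)
for name, res in (("Ghazale", ghazale), ("Hamano", hamano)):
    print(name, {k: round(v, 1) for k, v in res.items()})
```

The continuity correction is what keeps the OR finite for studies with no false positives, such as that of Hamano et al.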

Table 2.  Data in the literature comparing immunoglobulin G4 serum concentration between the autoimmune pancreatitis (AIP) and control populations. The 95% confidence intervals of sensitivities, specificities, and diagnostic odds ratios (OR) are also reported within parentheses

Study | TP | FP | FN | TN | Sensitivity (%) | Specificity (%) | Diagnostic OR

Analysis A: AIP patients vs all available controls
Ghazale et al.14 | 34 | 32 | 11 | 433 | 75.6 (60.5–87.1) | 93.1 (90.4–95.2) | 40.0 (18.8–85.3)
Choi et al.15 | 22 | 9 | 8 | 134 | 73.3 (54.1–87.7) | 93.7 (88.4–97.1) | 37.5 (13.4–104.7)
Hirano et al.16 | 33 | 4 | 2 | 78 | 94.3 (80.8–99.3) | 95.1 (88.0–98.7) | 233.8 (47.3–1156)
Hamano et al.20 | 18 | 0 | 2 | 170 | 90.0 (68.3–98.8) | 100 (97.9–100) | 2523 (116.7–54 579)
Aparisi et al.17 | 4 | 8 | 2 | 169 | 66.7 (22.3–95.7) | 95.5 (91.3–98.0) | 35.9 (6.6–195.0)
Uehara et al.18 | 6 | 0 | 0 | 6 | 100 (54.1–100) | 100 (54.1–100) | 169.0 (2.9–9876)
Kamisawa et al.19 | 14 | 6 | 3 | 50 | 82.4 (56.6–96.2) | 89.3 (78.1–96.0) | 32.2 (7.7–133.8)
Total | 131 | 59 | 28 | 1040 | 82.4 (75.6–88.0) | 94.6 (93.1–95.9) | 63.9 (28.9–141.4)
P-value | | | | | P = 0.078 | P < 0.001 | P = 0.071

Analysis B: AIP patients vs patients with pancreatic cancers
Ghazale et al.14 | 34 | 13 | 11 | 122 | 75.6 (60.5–87.1) | 90.4 (84.1–94.8) | 27.2 (11.4–65.1)
Choi et al.15 | 22 | 1 | 8 | 75 | 73.3 (54.1–87.7) | 98.7 (92.9–100) | 133.2 (22.1–804.8)
Hirano et al.16 | 33 | 0 | 2 | 23 | 94.3 (80.8–99.3) | 100 (85.2–100) | 629.8 (28.9–13 729)
Hamano et al.20 | 18 | 0 | 2 | 70 | 90.0 (68.3–98.8) | 100 (94.9–100) | 1043 (48.0–22 685)
Total | 107 | 14 | 23 | 290 | 82.3 (74.6–88.4) | 95.4 (92.4–97.5) | 144.6 (24.4–857.6)
P-value | | | | | P = 0.043 | P = 0.001 | P = 0.021

Analysis C: AIP patients vs patients presenting autoimmune diseases without pancreatitis
Hirano et al.16 | 33 | 4 | 2 | 7 | 94.3 (80.8–99.3) | 63.6 (30.8–89.1) | 22.3 (3.9–122.9)
Hamano et al.20 | 18 | 0 | 2 | 36 | 90.0 (68.3–98.8) | 100 (90.3–100) | 540.2 (24.6–11 842)
Aparisi et al.17 | 4 | 0 | 2 | 33 | 66.7 (22.3–95.7) | 100 (89.4–100) | 120.6 (5.0–2935)
Uehara et al.18 | 6 | 0 | 0 | 6 | 100 (54.1–100) | 100 (54.1–100) | 169.0 (2.9–9876)
Kamisawa et al.19 | 14 | 0 | 3 | 10 | 82.4 (56.6–96.2) | 100 (69.2–100) | 87.0 (4.0–1870)
Total | 75 | 4 | 9 | 92 | 89.3 (80.6–95.0) | 95.8 (89.7–98.9) | 66.6 (20.2–220.0)
P-value | | | | | P = 0.250 | P = 0.001 | P = 0.444

  1. Data are subdivided according to the three analyses performed. Analysis A, overall analysis of the seven studies identified; analysis B, analysis of the four studies with data available for the comparison of AIP with pancreatic cancer; analysis C, analysis of the five studies with data available for the comparison of AIP with the other non-pancreatic autoimmune diseases. P-values: χ2-test for homogeneity of sensitivities, specificities, and diagnostic OR among the different studies. FN, false negative cases; FP, false positive cases; TN, true negative cases; TP, true positive cases.

Figure 2.

Forest plots showing sensitivities, specificities, and diagnostic odds ratios (OR) of the data in the literature comparing immunoglobulin G4 serum concentrations between the autoimmune pancreatitis and control populations. Points in plots are proportional to the size of the studies, and the diamonds show the values estimated by the summary receiver–operating characteristic curve analysis. Dashed lines represent the 95% confidence intervals of the estimates. Plots are subdivided according to the three analyses performed. (a) Overall analysis of the seven studies identified. (b) Analysis of the four studies with data available for the comparison of autoimmune pancreatitis (AIP) with pancreatic cancer. (c) Analysis of the five studies with data available for the comparison of AIP with other non-pancreatic autoimmune diseases.
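The random-effects pooling of the diagnostic OR can be sketched as follows. This is a textbook DerSimonian–Laird estimator written in Python, not the Meta-DiSc code: it assumes inverse-variance weights on the log OR, with 0.5 added to every cell, and the names used are illustrative. The counts are those of analysis A in Table 2. The result should be close to the pooled OR reported in the Total row of that analysis, although exact agreement with the software output is not guaranteed.

```python
import math

# (TP, FP, FN, TN) for the seven studies of analysis A, as listed in Table 2.
ANALYSIS_A = [
    (34, 32, 11, 433), (22, 9, 8, 134), (33, 4, 2, 78), (18, 0, 2, 170),
    (4, 8, 2, 169), (6, 0, 0, 6), (14, 6, 3, 50),
]


def dersimonian_laird(studies, correction=0.5, z=1.96):
    """Pool log diagnostic ORs with a DerSimonian-Laird random-effects model."""
    y, v = [], []
    for counts in studies:
        tp, fp, fn, tn = (x + correction for x in counts)
        y.append(math.log((tp * tn) / (fp * fn)))      # log OR
        v.append(1 / tp + 1 / fp + 1 / fn + 1 / tn)    # its variance
    w = [1 / vi for vi in v]                           # fixed-effect weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)            # between-study variance
    w_star = [1 / (vi + tau2) for vi in v]             # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "dor": math.exp(pooled),
        "low": math.exp(pooled - z * se),
        "high": math.exp(pooled + z * se),
        "q": q,
        "tau2": tau2,
    }


pooled_a = dersimonian_laird(ANALYSIS_A)
print({k: round(val, 2) for k, val in pooled_a.items()})
```

A positive between-study variance widens the confidence interval relative to a fixed-effect model, which is how the random-effects approach incorporates the variation among the studies.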

The three SROC curves obtained from the data shown in Table 2 are plotted in Figure 3, and the various parameter estimates are shown in Table 3. Low correlations were observed among the data reported in the different studies (P-values equal to or greater than 0.188) for all three analyses carried out. In particular, the SROC curve of the comparison of AIP with pancreatic cancer showed a poor correlation and a particularly wide 95% confidence interval. The ‘a’ values ranged from 3.92 to 4.39, which corresponded to high diagnostic OR values (ranging from 63.9 to 144.6); therefore, the curves were close to their ideal position near the upper left corner. The ‘b’ values were quite different from zero, although none of the three ‘b’ estimates reached significance. Therefore, the hypothesis of independence of the test accuracy from the threshold (i.e. the homogeneity of the studies with respect to OR) could not be rejected (P-values greater than or equal to 0.233); the most heterogeneous case with respect to the OR was the analysis of AIP versus pancreatic cancer (b = −0.67 ± 1.04). The AUC values showed good accuracy of IgG4 in distinguishing AIP from the overall available controls (analysis A: AUC = 0.920 ± 0.073), as did the two subanalyses carried out by comparing AIP with pancreatic cancer (analysis B: AUC = 0.914 ± 0.191) and with other autoimmune diseases (analysis C: AUC = 0.949 ± 0.024). No significant difference was found among these three AUC estimates (F = 0.025, P = 0.976).
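The link between the ‘a’ and ‘b’ estimates and the AUC can be illustrated numerically. The Python sketch below assumes the usual back-transformation of the Moses–Shapiro–Littenberg regression, logit(TPR) = a/(1 − b) + [(1 + b)/(1 − b)] logit(FPR), and integrates the resulting curve over the whole range of false positive rates with a midpoint rule. The integration scheme and the names are illustrative choices, not those of Meta-DiSc; the inputs are the rounded estimates of Table 3.

```python
import math


def sroc_tpr(fpr, a, b):
    """True positive rate on the SROC curve for a given false positive rate."""
    logit_fpr = math.log(fpr / (1.0 - fpr))
    logit_tpr = a / (1.0 - b) + (1.0 + b) / (1.0 - b) * logit_fpr
    return 1.0 / (1.0 + math.exp(-logit_tpr))


def sroc_auc(a, b, steps=20000):
    """Area under the SROC curve by the midpoint rule on (0, 1)."""
    h = 1.0 / steps
    # Midpoints avoid FPR = 0 and FPR = 1, where the logit is undefined.
    return h * sum(sroc_tpr((i + 0.5) * h, a, b) for i in range(steps))


# Rounded (a, b) estimates for the three analyses, taken from Table 3.
TABLE3 = {"A": (3.92, -0.52), "B": (4.02, -0.67), "C": (4.39, -0.34)}
aucs = {name: sroc_auc(a, b) for name, (a, b) in TABLE3.items()}
for name, auc in aucs.items():
    print(f"Analysis {name}: AUC = {auc:.3f}")
```

Under these assumptions the areas should fall close to the AUC values given in Table 3; small differences are to be expected because the inputs are rounded.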

Figure 3.

Unweighted summary receiver–operating characteristic curves (estimates and 95% confidence intervals). Plots are subdivided according to the three analyses performed. Points in plots are proportional to the size of the studies. (a) Overall analysis of the seven studies identified. (b) Analysis of the four studies with data available for the comparison of autoimmune pancreatitis (AIP) with pancreatic cancer. (c) Analysis of the five studies with data available for the comparison of AIP with other non-pancreatic autoimmune diseases.

Table 3.  Parameter estimates obtained by the fitting procedure applied to the summary receiver–operator characteristic asymmetrical curves comparing immunoglobulin G4 serum concentrations between autoimmune pancreatitis (AIP) and the three different control populations

Parameter | Analysis A: AIP patients vs all available controls (7 studies) | Analysis B: AIP patients vs patients with pancreatic cancer (4 studies) | Analysis C: AIP patients vs patients with non-pancreatic autoimmune diseases (5 studies)
Regression (Spearman)
  • r-value | 0.107 | 0.000 | 0.700
  • P-value | 0.819 | 1.000 | 0.188
a-values
  Mean ± SE | 3.92 ± 0.93 | 4.02 ± 2.33 | 4.39 ± 0.52
  t-value | 4.22 | 1.72 | 8.51
  P-value | 0.008 | 0.227 | 0.003
b-values
  Mean ± SE | −0.52 ± 0.48 | −0.67 ± 1.04 | −0.34 ± 0.23
  t-value | 1.07 | 0.64 | 1.49
  P-value | 0.333 | 0.588 | 0.233
AUC
  Mean ± SE | 0.920 ± 0.073 | 0.914 ± 0.191 | 0.949 ± 0.024

  1. AUC, area under the summary receiver–operator characteristic curve; t-value, Student's t-test.

IgG4 in monitoring AIP

Regarding the usefulness of IgG4 as a marker of the efficacy of steroid treatment, we found that all four studies considered showed a decrease of circulating IgG4 concentrations from the basal observation to the observation made after 4 weeks of steroid treatment (Fig. 4). This decrease in the serum concentration of IgG4 was found to be significant after steroid treatment in the studies of Hamano et al. (P = 0.002)20 and Umemura et al. (P = 0.016),23 whereas significance was not reported in the studies of Kamisawa et al.21 and Nishino et al.22

Figure 4.

Box and whiskers plot of the four studies reporting the usefulness of serum immunoglobulin G4 (IgG4) in monitoring the follow up of autoimmune pancreatitis (AIP) patients. IgG4 concentrations before (B) and after 4 weeks (4 w) of steroid treatment are reported. Boxes represent the interquartile ranges and the lines emanating from each box (the whiskers) extend to the fifth and the 95th percentiles. Median values are reported in the boxes.

Discussion

The diagnosis of AIP represents a clinical challenge; in recent years, several guidelines have been released on the correct diagnostic approach to this disease.32 According to the latest guidelines released in Japan in 2006,33 the diagnosis of AIP is established when the following criteria are fulfilled: diffuse or segmental narrowing of the main pancreatic duct with an irregular wall and diffuse or localized enlargement of the pancreas on imaging studies (such as abdominal ultrasonography, computed tomography, and magnetic resonance imaging), associated with high serum γ-globulin or high circulating IgG4 levels, or the presence of autoantibodies (antinuclear antibodies and rheumatoid factor), and/or marked interlobular fibrosis and prominent infiltration of lymphocytes and plasma cells in the periductal area, occasionally with lymphoid follicles in the pancreas. In contrast, the Italian guidelines do not include IgG4 among the diagnostic criteria for AIP.34 IgG4 is the rarest of the IgG subclasses, accounting for only 3–6% of total IgG in human serum. The IgG4 subclass is unique among the IgG subclasses in its inability to bind the C1q complement protein; thus, it does not activate the classical complement pathway.35 High serum IgG4 concentrations have also been found in other pathological conditions, such as atopic dermatitis,36 parasitic disease,37 and pemphigus vulgaris and foliaceus.38 Thus, IgG4 does not seem to be a specific marker for the serological diagnosis of AIP. Therefore, our aim was to review the data existing in the English literature in order to better elucidate the role of IgG4 in diagnosing AIP.

For the purpose of this study, we used a meta-analytic approach specifically designed for grouping and analyzing data of diagnostic studies, namely the SROC method. While a plot of the true positive rate against the false positive rate at various thresholds (namely, the receiver–operator characteristic curve plot) is commonly used in presenting the results of a single study, the SROC approach gives a good overview of the pooled results of several studies. In fact, reports on diagnostic tests may show large discrepancies among the various studies, and this situation is also observed for IgG4 as a marker for diagnosing AIP. After an extensive evaluation of the English literature, we identified seven papers on the usefulness of serum IgG4 in diagnosing AIP.14–20 These seven studies involved 159 patients with AIP and 1099 controls. The heterogeneity of the control patients studied, who included patients with pancreatic cancer, autoimmune diseases, and several other pathological conditions, as well as subjects with no clinically detected diseases, should be pointed out. In the seven studies examined, the selection of patients was also non-homogeneous, because the criteria used for the diagnosis of AIP were different: the Japanese criteria were applied in four studies,16,18–20 and the Spanish score,17 the Korean criteria,15 and the Mayo Clinic criteria14 in one each of the remaining three studies. Another critical aspect of these studies is that the histology was compatible with AIP in only 76% of the 129 patients enrolled in six of these studies, and the histology was not reported in the 30 patients of the remaining study.15 The techniques used for IgG4 determination were also different: the nephelometric assay was used in four studies14,16–18 and radial immunodiffusion was used in three studies.15,19,20 In contrast, the upper reference limits of IgG4 were quite similar, ranging from 130 mg/dL to 140 mg/dL.

The overall analysis of these studies shows that the sensitivities ranged from 66.7%17 to 100%.18 However, only a limited range of specificity values actually occurred, and we should be aware that the plotted SROC curve extends beyond the empirical range of the data analyzed. Notwithstanding the limited range of these values, a comparison among the different studies showed that the specificity values were significantly heterogeneous. We are not able to explain this high heterogeneity; it is possible that both the different criteria used for assessing the diagnosis of AIP and the heterogeneity of the controls used play an important role. The heterogeneity of the published data is a major concern in the vast majority of meta-analytic approaches; in our study, however, we systematically examined all of the data on IgG4 present in the literature, and the meta-analysis allows us to better quantify and represent this heterogeneity. Thus, the diagnostic value of IgG4 is quite high, even if the high heterogeneity found among the studies suggests that more studies are needed in order to assess the true accuracy of IgG4 in the diagnosis of AIP in clinical practice. In particular, these new studies should be specifically designed to allow the identification of the possible confounding factors that might account for the heterogeneities present in the previous studies.

Another important issue in the diagnosis of AIP is that most surgical interventions in AIP patients are carried out as a consequence of a misdiagnosis of pancreatic cancer; thus, differentiation between AIP and pancreatic cancer is an additional diagnostic challenge that needs to be resolved in the near future. In this respect, only four studies are available for comparing AIP with pancreatic cancer. This comparison was the most critical one of our three analyses. In fact, the r-value showed a poor correlation among the four studies, and consequently, the estimates of the parameters of the relationship (i.e. the ‘a’ and ‘b’ values), as well as the estimate of the AUC, were also found to be poor, with wide SE values. Sensitivities, specificities, and OR were also significantly heterogeneous among the four studies evaluated in this analysis. Despite all of these heterogeneities, the overall set of data from these four studies is near the ideal point of the ROC space, and the AUC value in comparing AIP patients with pancreatic cancer patients showed good accuracy of IgG4 in distinguishing between these two conditions. This finding suggests that the determination of serum IgG4 may be useful in confirming the diagnosis of AIP, thus avoiding unnecessary surgical procedures for suspected pancreatic cancer. In this case too, additional larger studies are necessary to confirm the results of our analysis.

Finally, we evaluated the diagnostic value of IgG4 in differentiating AIP from other autoimmune diseases. Here too, we found a high heterogeneity of the specificity values among the five studies evaluated. In contrast, a moderate relationship was found in this analysis among the data of the five studies, and a good AUC value was detected. Thus, this analysis shows that IgG4 is useful in differentiating AIP from other autoimmune diseases of non-pancreatic origin. We have no explanation for this unexpected finding, and further studies are needed to clarify it.

Treatment with corticosteroids is often effective in curing AIP;34 thus, a confirmed diagnosis of AIP is necessary before beginning steroid treatment. The four studies considered20–23 involved a low number of patients with AIP (n = 34). The only conclusion we can draw is that IgG4 may be a useful marker for evaluating the success of steroid treatment, and this seems to hold only for those patients with elevated serum levels of IgG4 before starting the steroid treatment.

In conclusion, the heterogeneity of the studies published until now means that more studies are necessary in order to assess the true accuracy of IgG4 in AIP. We also need additional studies in order to evaluate the best method of determining serum IgG4 and establishing the best cut-off value for serum IgG4 in order to reach a diagnosis of AIP.

Appendix

Appendix I

List of unselected references

  1. Eighty-three case reports39–121
  2. Eighty review articles13,122–200
  3. Nineteen letters to the editor not reporting original data201–219
  4. Six editorials220–225
  5. Two guidelines34,226
  6. Three-hundred-and-seven papers were excluded because they contained data regarding diseases other than autoimmune pancreatitis227–533
  7. Twenty-six papers were excluded because they did not report data on immunoglobulin G4534–559
  8. Fourteen papers were excluded because they reported data on immunoglobulin G4 only in patients with autoimmune pancreatitis without a control group or follow-up study560–573
  9. Three were excluded due to lack of individual data574–576
