Meta-analysis: the diagnostic accuracy of lactose breath hydrogen or lactose tolerance tests for predicting the North European lactase polymorphism C/T-13910


Dr A. Szilagyi, Jewish General Hospital, 3755 Cote Ste Catherine Rd, Room G 327, Montreal, QC, Canada, H3T 1E2.


Aliment Pharmacol Ther 2012; 35: 429–440


Background  The diagnostic accuracy of two indirect tests of lactose digestion, the lactose breath hydrogen and lactose tolerance tests, has not been systematically reviewed for comparison with available publications on genotype.

Aim  To perform a meta-analysis of available studies that compares the north-European genetic polymorphism C/T-13910 with the lactose breath hydrogen and the lactose tolerance tests, to determine their ability to predict geno/phenotype relationships. We examine the effects of lactose loading dose, inclusion of children and latitudes of study centre on comparative outcome.

Methods  Electronic literature databases and the reference lists of individual articles were searched for studies on the genetics of lactase that compared genotype with breath hydrogen or lactose tolerance tests. Random effect and fixed effect models were used for breath and lactose tolerance tests respectively, to report summary accuracy measures with 95% confidence intervals (CI).

Results  The search revealed 19 studies: 17 evaluated breath hydrogen, five lactose tolerance test (3/17 overlapped). Overall sensitivity was 0.88 (CI, 0.85–0.90), specificity was 0.85 (CI, 0.82–0.87) for breath test. Heterogeneity was explored by adjusting for studies including children, high or low dose lactose and to some extent by site of study. The lactose tolerance test showed sensitivity of 0.94 (0.9–0.97) and specificity of 0.90 (0.84–0.95) with a nonsignificant heterogeneity.

Conclusion  The diagnostic accuracy of both tests individually reflects expected geno/phenotypes when the populations are well defined.


After the description of lactose intolerance, it was discovered that adult lactose maldigestion is a recessive trait (lactase nonpersistence, LNP), whereas the ability to digest lactose is a dominant genetic trait (lactase persistence, LP).1, 2 Distributions depend on ethnic and geographic factors, such that the majority of the world's population (about 65%) is LNP.3 The prevalence of LNP generally increases towards the equator and eastward. In the new worlds (North and South America, Australia), distributions vary because of historical population migrations.

Two features of the lactase dichotomy have interested workers in the field. In medicine and nutrition, the clinical contribution of lactose maldigestion to the development of gastrointestinal symptoms fuelled research. As a result, importance is attributed to altering dietary use of dairy foods to alleviate symptoms.4 The other aspect is medical epidemiology, which partly relates the global distribution of LP/LNP to risks of several (possibly many) diseases.5

Research interests to date relied mainly on indirect tests of lactose digestion. These tests were generally correlated with intestinal biopsy measurements of lactase,6, 7 but for clinical or epidemiological studies this method is not practical. Based on previous comparisons of indirect tests, current methodology employs mainly the lactose breath hydrogen test (LBHT) and secondly the lactose tolerance test (LTT).8, 9 The latter measures the presence of intestinal lactase and the ability of the person to split lactose into glucose and galactose. A positive rise of glucose (>1.1–1.4 mmol/L within 2 h) is consistent with LP status.9 The breath test measures bacterial metabolism of lactose, which spills into the lower intestine when inadequately metabolised. Hydrogen produced by bacterial metabolism enters the lung, is exhaled and is measured in the breath.10 A rise of hydrogen (≥20 parts per million, ppm) is consistent with LNP status. There is only moderate agreement between these two indirect tests.8, 11, 12

The genetics of lactase has interested different groups, and knowledge has advanced considerably. Control of this gene on chromosome 2q21 has been identified with several haplotypes. The main site is in intron 13 of the minichromosome maintenance 6 (MCM6) gene (reviewed in 2). The first polymorphism shown to control transcription of lactase, C/T-13910, was identified by Enattah et al. and is noted largely in north Europeans and their descendants.13 This polymorphism was correlated with intestinal biopsies.14 A second polymorphism, G/A-22018, has also been described but in Europeans may not have independent effects. It may, however, be the main polymorphism controlling lactase in northern Chinese.15, 16 In north Europeans, C/C genotypes are homozygous maldigesters with less than 10 U/g intestinal lactase (LNP), T/T genotypes are normal digesters with normal intestinal lactase levels (LP), and C/T genotypes are normal digesters with intermediate intestinal enzyme levels (LP).

Other polymorphisms related to the European type have been identified in east Africa and the Middle East.17, 18 The presence of these or other single nucleotide polymorphisms could interfere with currently used polymerase chain reaction (PCR)-based kits measuring the C/T-13910 polymorphism. With an increasing variety of LP polymorphisms possibly contributing to populations towards the south of Europe, it was suggested that current kits be used with caution for genetic identification in non-Northern Europeans because diagnostic precision may be lost (loss of sensitivity and specificity).19

The objectives of this systematic review were to evaluate studies comparing the LBHT or LTT with north-European genetic polymorphisms (currently the only commercially available test kits) for accuracy. Possible confounders that might reduce test accuracy are evaluated. The diagnostic precision of the G/A polymorphism is also evaluated from available literature.


Search strategy

A computerised medical literature search was initiated using OVID Medline/PubMed and the Cochrane database. The following terms were used: hydrogen breath test OR lactose intolerance OR lactose breath hydrogen test OR lactose (in) tolerance test OR indirect tests for lactose digestion AND genetics of lactase OR lactase polymorphisms OR C/T-13910 polymorphism for lactase. Moreover, relevant articles were identified from the reference lists of individual articles found.

Selection criteria

Inclusion criteria for systematic review included: English publications, full articles or abstracts with adequate available information on either LBHT, LTT or both with results of outcome for genotype for C/T-13910 or G/A-22018. Studies searching for other polymorphisms that describe LP status, particularly in African or Middle East populations were excluded. Articles were reviewed independently by two of the authors (AM and AS).

Data extraction

For each study, the following information was extracted: site of test centre, setting, number of participants, gender distribution, inclusion of children (≤17 years) and indication for tests. The characteristics of lactose tests were also recorded. These included: dose of lactose, criteria for test positivity and outcome of test results. Similarly, characteristics of genetic tests were recorded, including: method of genetic tests, outcome of genetic tests (sensitivity and specificity of the indirect tests for the genetic tests) and comparison of genetic test with indirect tests. Symptom outcomes were not evaluated.

Possible impact of latitude on test outcome was evaluated using Google search engine to identify city test centre latitudes. This assessment was only carried out for LBHT because there were too few centres where LTT was also carried out.

Quality assessment

The quality of studies was individually assessed by two authors (AM and AS) using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS).20 This tool uses a 14-point questionnaire each requiring a yes, no or unclear answer. However, a quality score may be obtained by summing the answers and assigning a score as has been used before.21 In this scheme, the minimal numerical value is 1 and maximum is 14. A high-quality study is defined by a score of ≥10, a moderate-quality study by a score of 6–9 and a poor quality study by a score of ≤5. This assessment allows comparisons of published studies using a standard set of criteria. When there was a disagreement in scores, the contentious articles were reviewed and discussed point by point and a consensus was reached. In addition, the analysis was also assessed by using PRISMA, which is a standard protocol designed to improve systematic reviews and applicable to evaluation of such interventional tests.22, 23 It does not allow quality assessment of individual publications.
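The score bands described above can be sketched as a small helper. This is only an illustration of the stated cut-offs; the function name `quadas_grade` is hypothetical and is not part of QUADAS or of the scoring procedure actually used.

```python
def quadas_grade(score: int) -> str:
    """Map a summed QUADAS score (1-14, as described in the text) to a quality band.

    Hypothetical helper: the bands follow the cut-offs quoted in the text
    (>=10 high, 6-9 moderate, <=5 poor).
    """
    if not 1 <= score <= 14:
        raise ValueError("score must lie between 1 and 14")
    if score >= 10:
        return "high"
    if score >= 6:
        return "moderate"
    return "poor"
```

Under these bands a study scored 9 would be graded moderate, and one scored 10 or more would be graded high.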

Data analysis

Presentation of data.  Three types of meta-analyses were conducted: LBHT for genotype C/T, LBHT for genotype G/A, and LTT for genotype C/T. For each analysis, the sensitivity, specificity, positive and negative likelihood ratio measures of test performance were calculated for each study.24, 25 Where required, a correction of 0.5 was added to all cells of studies with zero cells. Summary sensitivities, specificities and diagnostic odds ratios (DOR) with their 95% confidence intervals (95% CIs) were obtained using random effect models with the DerSimonian–Laird method25, 26 or fixed effect models, depending on the level of heterogeneity of the study group. DOR, which is a unitary measure of diagnostic performance that encompasses both sensitivity and specificity, was calculated by dividing [sensitivity/(1 − sensitivity)] by [(1 − specificity)/specificity]. Forest plots of sensitivities, specificities and DORs were presented. A summary receiver operating characteristic (sROC) curve (which is based on a relationship plot of sensitivity and 1 − specificity of each study) and the area under the curve (AUC) were also constructed, with a perfect test having an AUC of 1.0 and a poor test having an AUC of 0.5.24, 25
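As a rough illustration of the per-study measures described above, the following Python sketch computes them from a single 2 × 2 table. The class and function names and any counts passed to them are hypothetical; the 0.5 correction follows the text, and the DOR is computed here as LR+/LR−, which is algebraically the same ratio as the odds form given above.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for one study (hypothetical container, not from the source)."""
    tp: float  # index test positive, reference positive
    fp: float  # index test positive, reference negative
    fn: float  # index test negative, reference positive
    tn: float  # index test negative, reference negative

    def corrected(self) -> "TwoByTwo":
        """Add 0.5 to every cell, but only when at least one cell is zero."""
        if 0 in (self.tp, self.fp, self.fn, self.tn):
            return TwoByTwo(self.tp + 0.5, self.fp + 0.5,
                            self.fn + 0.5, self.tn + 0.5)
        return self


def accuracy_measures(table: TwoByTwo) -> dict:
    """Return sensitivity, specificity, likelihood ratios and DOR."""
    t = table.corrected()
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,  # equals (tp * tn) / (fp * fn)
    }
```

For example, `accuracy_measures(TwoByTwo(45, 5, 5, 45))` uses invented counts and gives a sensitivity and specificity of 0.9 each.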

Assessment of heterogeneity.  Four strategies were used to assess possible heterogeneity: Spearman correlation coefficient test, meta-regression analysis, subgroup analysis and sROC analysis.

Determination of clinical heterogeneity.  We explored clinical heterogeneity among studies due to both threshold effect and nonthreshold effects (study differences in populations, locales, intervention and expected outcomes).

Determinants of statistical heterogeneity and publication bias.  The presence of threshold effect (differences in sensitivities and specificities occurring because of different cut-offs used in different studies to define a positive test result) was assessed by Spearman correlation coefficient. For LBHT meta-analysis, since there are more studies in this group, heterogeneity due to other reasons was assessed using chi-square (χ2) test and quantified using the inconsistency index I-square (I2).27 When substantial heterogeneity was found to be present (I2 > 50%), meta-regression analysis (done with >10 studies available/covariate) was used to explore the potential reasons using DOR (in its natural logarithm) as dependent variable. Three predetermined study-level characteristics were considered as potential covariates: lactose load, inclusion of children and latitude of study location. Heterogeneity was also explored using sub-group analyses according to different lactose load (50 vs. 25 or 20 g) and according to whether the study includes children. Publication bias was evaluated using an Egger plot.28
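A minimal sketch of how Cochran's Q, I2 and a DerSimonian–Laird pooled estimate might be computed on the log DOR scale is shown below. It is an illustration under stated assumptions, not the Meta-DiSc or SAS implementation used for this analysis: the function names are hypothetical, and the within-study variance of ln(DOR) is taken as the sum of the reciprocal cell counts, a standard approximation.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn):
    """ln(DOR) and its approximate variance; expects non-zero cells
    (apply any zero-cell correction before calling)."""
    log_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_dor, variance


def dersimonian_laird(effects, variances):
    """Random effects pooling of study effects (e.g. ln DOR)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random effects weights add the between-study variance tau2
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {
        "q": q,
        "i2": i2,
        "tau2": tau2,
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }
```

When the effects are ln(DOR) values, exponentiating `pooled` and the limits in `ci95` returns the summary DOR and its interval on the original scale.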

All data were analysed using Meta-DiSc (Version 1.4)29 and SAS 9.2 (SAS Institute Inc., Cary, NC, USA) statistical software.


Retrieval of data

The initial search strategy revealed 424 publications between April 1974 and October 2010. The search results are shown in Figure 1. The majority were not applicable to the analysis based on titles or abstracts. From these, we identified 74 further restricted publications, which appeared more relevant but 48 were excluded because they did not compare genetics with the targeted indirect tests, leaving 26. Among these 21 dealt with LBHT (three examined both LBHT and LTT), three dealt with LTT exclusively, and two with potential polymorphisms other than the C/T or G/A types.12, 16, 17, 30–52

Figure 1.

 The results of search strategy are shown.

An additional seven studies were excluded (four LBHT, one LTT and two using various other indirect tests in combination). Four studies comparing LBHT alone with C/T genotype were excluded from further analysis. The study by Xu et al. was excluded because the G/A-22018 polymorphism was assessed and only historical controls were compared.16 The study by Bodlaj et al. was excluded because only a restricted sample of positive breath hydrogen cases was compared with genetic analysis.48 The study by Hovde et al. was excluded because only subjects with a discrepancy between breath hydrogen test and LTT were submitted for genetic testing.49 A study by Ingram et al. was also excluded on the grounds it was carried out in Africa where it has been shown the C/T-13910 genotype does not explain LP.50 Therefore 17 studies remained in the breath hydrogen genetic comparison. These studies combined included 1708 persons.

Among LTT restricted studies, a report by Tishkoff et al. was excluded because it used LTT to only search for other non-European genetic polymorphisms.17 Five studies remained, with a combined total of 320 persons, including a comparison of the LTT and genetic test. Three of these five studies evaluated both LBHT (included above) and LTT.12, 30, 44, 46, 47

In addition, two other studies that used one or more indirect tests, searching for other polymorphisms were excluded.51, 52 In total, 19 studies were evaluated.

Quality assessment

There was a disagreement in score (evaluation difference ≥2 and shift in quality of the study) for four studies. Table 1A, B outlines the scores given for each article; only one study was of moderate quality35 the rest were of good quality. The PRISMA check list and QUADAS guidelines are provided in the Supporting Information.

Table 1.   (A) Outlines demographic features of 17 studies that compare the breath hydrogen test (LBHT) with the European genotype for adult lactose digestion status. (B) Compares the lactose tolerance test (LTT) with the same polymorphism. The test centre where the publication originated and latitude of the city of study origin are listed. The table denotes inclusion of children as yes (y) or no (n). The quality assessment score is based on QUADAS criteria20 modified to include a score22
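A. Characteristics of 17 studies using breath hydrogen test for genotype C/T

| Year | Author (ref.) | Test centre | Latitude | Children (y/n) | Lactose load (g) | % Female | Total (N) | Quality assessment |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2005 | Gugatschka (30) | Vienna | 48 | n | 50 | 0 | 51 | 10 |
| 2005 | Buning (31) | Berlin | 52 | y | 50 | 61 | 166 | 10 |
| 2005 | Hogenauer (32) | Graz | 47 | n | 50 | 65 | 123 | 12 |
| 2007 | Bulhoes (33) | Porto Allegre | 30 | n | 50 | 90 | 20 | 10 |
| 2007 | Schirru (34) | Cagliari | 39 | n | 25 | 75 | 84 | 10 |
| 2007 | Bernardes-Silva (35) | Sao Paolo | 23 | n | 25 | 79 | 72 | 9 |
| 2007 | Szilagyi (12) | Montreal | 45 | n | 50 | 60 | 30 | 11 |
| 2007 | Kerber (36) | Vienna | 48 | n | 50 | 78 | 120 | 10 |
| 2008 | Mattar (37) | Sao Paolo | 23 | n | 25 | 74 | 50 | 10 |
| 2008 | Krawcyk (38) | Homburg | 53 | n | 50 | 57 | 58 | 11 |
| 2008 | Mottes (39) | Ferrara | 44 | y | 25 | 43 | 112 | 10 |
| 2008 | Waud (40) | Cardiff | 51 | y | 50 | 78 | 200 | 12 |
| 2008 | DiStefano (43) | Pavia | 45 | n | 20 | 78 | 32 | 10 |
| 2009 | Nagy (42) | Szeged | 46 | y | 50 | 56 | 186 | 12 |
| 2009 | Szilagyi (41) | Montreal | 45 | n | 50 | 60 | 57 | 11 |
| 2010 | Babu (44) | Lucknow | 26 | n | 25 | 56 | 153 | 11 |
| 2010 | Pohl (45) | Zurich | 47 | n | 50 | 72 | 194 | 11 |

  1. Threshold for defining a positive breath test was 20 parts per million above baseline in 16 studies. One study (39) used 20 ppm above the nadir.

B. Characteristics of five studies using LTT for genotype C/T*

| Year | Author (ref.) | Test centre | Total (N) | Children (y/n) | Lactose load (g) | % Female | Total (N) | Quality assessment |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2004 | Nilsson (46) | Orebro | 35 | n | NS | NA | 35 | 10 |
| 2005 | Gugatschka (30) | Vienna | 51 | n | 50 | 0 | 51 | 10 |
| 2005 | Ridefelt (47) | Upsala | 51 | n | 50 | 71 | 51 | 11 |
| 2007 | Szilagyi (12) | Montreal | 30 | n | 50 | 56 | 30 | 11 |
| 2009 | Babu (44) | Lucknow | 153 | n | 25 | 60 | 153 | 11 |

  2. *Thresholds for positive rise in glucose were 1 mmol/L (30), 1.1 mmol/L (12, 43), 1.3 mmol/L (45) and 1.4 mmol/L (46).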

Description of included studies

Of the included 19 studies, 10 used consecutive patients referred to hospital out-patient clinic for work-up for suspected symptoms of lactose intolerance. Two of these 10 specifically targeted patients with diagnosis of functional bowel disorders36, 43 and one selected men at random for comparisons.30 In the case of the other nine studies, all were volunteers and most were recruited: one study selected random volunteers from hospital out-patient clinics,34 seven studies included symptomatic volunteers, one specifically with irritable bowel syndrome35 and two included volunteers with only self-reported lactose intolerance.12, 41 All except the study of Nagy et al. excluded secondary causes of lactose maldigestion. In this study, 12% (n = 22) of the total included had other diseases.42 Demographics of studies are outlined in Table 1A, B.

The average age of the entire cohort of 19 studies was 35.7 years, with a range of 2–77 years. Four studies included children31, 39, 40, 42 but only two specified the number of children.39, 42 In these two studies, the mean age was 11 for participants. Overall, there was an average of 64.4% females in the studies. The study by Gugatschka et al. included only men.30 The LBHT or LTT tests were carried out with dosage unstated (one study), 20 g (one study), 25 g (six studies) or 50 g (11 studies). The average study ran for 205.1 min (range 120–420 min). One study by DiStefano et al.43 used the sum of the fifth, sixth and seventh hour as the target period. Threshold values for a positive cutoff with the LBHT were mostly uniform at 20 parts per million (ppm) above the baseline value. One study used 20 ppm above the nadir.40 The LTT studies were uniformly 2 h in duration. Threshold values were somewhat more variable for a positive glucose rise (Table 1B). Because only one study used an extensive time period, the impact of study duration on diagnostic accuracy was not evaluated for either test, although it has been accepted that a prolonged LBHT is more sensitive.53

Genetic testing was done by PCR and most studies used melting temperature analysis for single nucleotide polymorphism detection with standard kits. The methods for genetic testing were fairly uniform. Six of the 19 studies also measured the G/A-22018 polymorphism, which was included in the kit.31, 33, 35, 36, 40, 43 However, only three could be analysed using inclusion criteria.35, 36, 43

Analysis of included studies

Breath hydrogen test (LBHT) for genotype C/T.  For each of the 17 studies, we calculated the sensitivity, specificity, positive and negative likelihood ratios, and DOR (available online: Supporting Information Table S1). Figure 2a, b show the Forest plots of sensitivity and specificity respectively. Table 2 shows the summary results that were obtained using random effect models, due to presence of heterogeneity among the studies. Both sensitivity and specificity showed strong heterogeneity with I-square being 78% and 87% respectively; while there was a small amount of heterogeneity for DOR (I-square = 39%). Overall the LBHT had a high accuracy. The summary sensitivity, specificity and DOR (95% CI) were 0.88 (0.85–0.90), 0.85 (0.82–0.87) and 119.8 (62.7–228.8) respectively. The summary ROC curve is shown in Figure 3. The AUC was calculated as 0.97 (s.e. 0.008), suggesting again a high performance of the LBHT. An analysis using Egger’s method28 showed that there might exist potential publication bias (P < 0.001). A Forest plot of DOR and Egger’s plot are also available online (Figures S1 and S2).

Figure 2.

 Forest plot of estimates of sensitivities (2a) and specificities (2b) of the breath hydrogen test for genotype C/T-13910 (N = 17 for each figure).

Table 2.  Summary accuracy measures using random effect model: breath hydrogen test for genotype C/T (17 studies)
| Studies | Sensitivity (95% CI) | Specificity (95% CI) | LR+ (95% CI) | LR− (95% CI) | DOR (95% CI) |
| --- | --- | --- | --- | --- | --- |
| All 17 studies | 0.88 (0.85–0.90) | 0.85 (0.82–0.87) | 9.6 (5.3–17.4) | 0.1 (0.07–0.2) | 119.8 (62.7–228.8) |
| I-square (%) | 78 | 87 | 88 | 78 | 39.3 |
| Lacload 50 g, 11 studies | 0.92 (0.89–0.94) | 0.83 (0.80–0.86) | 8.6 (4.4–16.5) | 0.1 (0.04–0.2) | 140.4 (79.5–248.0) |
| I-square (%) | 76 | 90 | 91 | 72 | 0 |
| Lacload 20 g/25 g, 6 studies | 0.82 (0.78–0.86) | 0.95 (0.90–0.98) | 11.9 (4.3–32.7) | 0.2 (0.1–0.3) | 77.4 (18.2–328.9) |
| I-square (%) | 69 | 41 | 39 | 74 | 59 |
| No children, 13 studies | 0.90 (0.87–0.93) | 0.91 (0.88–0.93) | 12.7 (6.2–26.0) | 0.12 (0.1–0.17) | 194.9 (101.4–374.3) |
| I-square (%) | 43 | 78 | 78 | 50 | 0 |
| With children, 4 studies | 0.84 (0.80–0.88) | 0.76 (0.71–0.81) | 5.3 (1.9–14.8) | 0.1 (0.04–0.4) | 56.0 (14.2–220.7) |
| I-square (%) | 94 | 91 | 93 | 91 | 72 |
| Without Mottes, 16 studies | 0.90 (0.87–0.92) | 0.85 (0.82–0.87) | 10.4 (5.5–19.9) | 0.1 (0.07–0.2) | 142.5 (85.7–237.2) |
| I-square (%) | 72 | 88 | 90 | 66 | 0 |

Figure 3.

 Summary receiver operating characteristic curve relating sensitivity to 1 − specificity for 16 studies comparing the breath hydrogen test to the C/T-13910 polymorphism (N17).

Assessment of clinical heterogeneity

A diagnostic threshold test was done using the Spearman correlation coefficient.29 This revealed a value of 0.37 (P-value = 0.15), suggesting there was no significant threshold effect.

Meta-regression analyses were used to examine possible associations between DOR and the three study-level characteristics outlined in the methods. There was no statistically significant effect between lactose load and DOR with relative DOR (95% CI) being 2.57 (0.61–10.88) when comparing studies with 50 g lactose to those with 25 g or 20 g. There was a weak but significant effect between DOR and the presence of children [relative DOR of studies with vs. without children: 0.27 (0.07–0.96)], suggesting a loss of test accuracy.

Evaluation of possible relationships between latitude of study sites and sensitivity or specificity, showed significant correlations with sensitivity for 11 European studies: r = −0.7, r2 = 0.49, P = 0.018 and specificity for all 17 studies: r = −0.53, r2 = 0.281, P = 0.028 [correlations of latitude for 11 European cities with specificity, r = −0.36 were nonsignificant P = 0.16, and all 17 cities with sensitivity r = 0.036, P = 0.27]. There was no statistically significant correlation between latitude and DOR with relative DOR (95% CI) being 0.71 (0.14–3.65) for every unit increase in latitude.

Subgroup analyses were conducted according to the lactose load used in each study (50 vs. 20/25 g) and whether the study included children (yes or no). Table 2 shows that, for studies using 50 g of lactose, there were slight changes in summary sensitivity (0.92, 0.89–0.94) and specificity (0.83, 0.80–0.86); however, the DOR increased to 140.4 (79.5–248.0), suggesting an improved test accuracy compared to all 17 studies. For studies using 25/20 g of lactose, the summary sensitivity and specificity changed to 0.82 (0.78–0.86) and 0.95 (0.90–0.98). However, the DOR decreased to 77.4 (18.2–328.9) with moderate heterogeneity (I2 = 59%), suggesting a decreased test accuracy among these studies. As for the presence of children as subjects, the four studies that involved children showed poor sensitivity, specificity and DOR. Removing these studies (13 adult studies analysed) showed increased sensitivity (0.90, 0.87–0.93), specificity (0.91, 0.88–0.93) and DOR (194.9, 101.4–374.3). There was significantly less heterogeneity among these studies for various accuracy measures (I2 = 0% for DOR).

Additional sensitivity analysis (Table 2) suggested that the study by Mottes et al.39 was an important outlier with a large effect on the overall estimate of DOR. Omission of this study resulted in a completely homogeneous group in DOR (DOR = 142.5, 95% CI 85.7–237.2, I2 = 0%). The Mottes study also stands out in the sROC analysis. The study by Waud et al. may also have contributed to a lesser extent to heterogeneity.40

Breath hydrogen test (LBHT) for genotype G/A. Table 3A shows the sensitivity, specificity, likelihood ratios and DORs for 3 studies, as well as the summarised results using random effect model. The summary sensitivity, specificity and DOR (95% CI) are 0.87 (0.79–0.93), 0.76 (0.67–0.83) and 40.53 (5.28–311.00) respectively. Overall, the Breath Hydrogen Test performs less accurately for genotype G/A compared to genotype C/T. The sensitivity, specificity, positive and negative likelihood ratios, and DOR of each of the three studies are available online (Table S2a).

Table 3.  Summary accuracy measures using (A) random effect model: breath hydrogen test for genotype G/A (three studies); (B) fixed effect model: lactose tolerance test for genotype C/T (five studies)
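(A) Breath hydrogen test for genotype G/A (three studies)

| | Sensitivity (95% CI) | Specificity (95% CI) | LR+ (95% CI) | LR− (95% CI) | DOR (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Accuracy measures | 0.87 (0.79–0.93) | 0.76 (0.67–0.83) | 3.6 (2.7–4.8) | 0.2 (0.04–0.6) | 40.5 (5.3–311) |
| I-square (%) | 87 | 59 | 0 | 79 | 59 |

(B) Lactose tolerance test for genotype C/T (five studies)

| | Sensitivity (95% CI) | Specificity (95% CI) | LR+ (95% CI) | LR− (95% CI) | DOR (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Accuracy measures | 0.94 (0.90–0.97) | 0.90 (0.84–0.95) | 8.6 (5.2–14.3) | 0.08 (0.04–0.17) | 125.8 (50.7–312) |
| I-square (%) | 35 | 0 | 0 | 22 | 0 |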

LTT for genotype C/T-13910. Table 3B shows the sensitivity, specificity, likelihood ratios and DOR for five studies, as well as the summarised results. As for heterogeneity, the Spearman correlation coefficient was found to be −0.1 (P = 0.873) suggesting no threshold effect. The I2 indexes are small for all measures suggesting no other source of heterogeneity. Thus, the summary results are presented using fixed effect model. The summarised sensitivity, specificity and DOR (95% CI) are 0.94 (0.90–0.97), 0.90 (0.84–0.95) and 125.8 (50.7–321.0) respectively. The sensitivity, specificity, positive and negative likelihood ratios, and DOR of each of the five studies are available online (Table S2b).


This report shows that LBHT is sensitive, specific and has a high DOR for the C/T-13910 European polymorphism. Heterogeneity is partly explained by dose of lactose, inclusion of children and perhaps latitude of study centres. While there were few studies with LTT, accuracy of this test was also good. Some caution may be needed to accept conclusions due to presence of publication bias.

An early comparison of indirect tests with intestinal lactase showed the LBHT to be superior to the LTT.8 However, intestinal lactase declines in a spotty fashion, especially in young subjects, making site of biopsy relevant.54 Also agreement is modest-to-moderate between LBHT and LTT8, 11, 12, 55 with overlapping modifiers of each test reducing reliability (Supporting Information, Table S3), questioning accuracy to predict geno/phenotypes. As such, reassessment of these tests with genetics is timely.

There are unique aspects to these tests as they are not used for classical disease detection. Positive and negative outcomes may be confusing as outlined in the introduction. When tests are applied to genetics of lactase, a dichotomous outcome is compared with three genotypes. Heterozygotes of lactase are generally digesters. The “sensitivity” of the LBHT for C/T-13910 refers to detection of the C/C genotype, and specificity to the C/T and T/T genotypes. To keep this pattern, a negative LTT (no adequate glucose rise) is counted as the positive result when calculating LTT sensitivity.
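The convention described above, in which the C/C genotype is the reference "positive" and either a hydrogen rise (LBHT) or an absent glucose rise (LTT) is the index "positive", can be sketched as follows. The function names are hypothetical, and the default cut-offs (20 ppm above baseline, 1.1 mmol/L glucose rise) are assumptions chosen from the ranges quoted in the text and in Table 1; individual studies used other thresholds.

```python
from collections import Counter

LNP_GENOTYPES = {"CC"}        # lactase nonpersistence (reference positive)
LP_GENOTYPES = {"CT", "TT"}   # lactase persistence (reference negative)


def lbht_indicates_lnp(baseline_ppm, peak_ppm, cutoff_ppm=20.0):
    """A hydrogen rise at or above the cut-off counts as index positive."""
    return (peak_ppm - baseline_ppm) >= cutoff_ppm


def ltt_indicates_lnp(glucose_rise_mmol_l, cutoff_mmol_l=1.1):
    """For the LTT the convention is inverted: an absent or inadequate
    glucose rise counts as index positive."""
    return glucose_rise_mmol_l < cutoff_mmol_l


def tally(records):
    """Build 2 x 2 counts from (genotype, indicates_lnp) pairs."""
    counts = Counter()
    for genotype, indicates_lnp in records:
        if genotype in LNP_GENOTYPES:
            counts["tp" if indicates_lnp else "fn"] += 1
        elif genotype in LP_GENOTYPES:
            counts["fp" if indicates_lnp else "tn"] += 1
        else:
            raise ValueError(f"unexpected genotype: {genotype!r}")
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}
```

With this framing, a C/C subject whose breath hydrogen fails to rise is a false negative, which is how unrecognised bacterial adaptation or methane production would lower sensitivity.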

Secondly, LBHT measures bacterial and LTT, human response. Lactase does not increase after LNP status is achieved.56 However, continued ingestion of lactose by LNP subjects could convert a positive LBHT to negative, through bacterial adaptation.57 The impact of this process on testing LNP people has not been fully explored. Allelic predictions could be skewed by unrecognised adaptation, leading to lower sensitivity, if dairy food consumption continues.5 The LTT may be free of this feature, but has not been studied.

Thirdly, response to LBHT can be modified by contribution of methane gas. Hydrogen is converted to methane in an unknown portion of populations58, 59 and could result in apparent failure to produce hydrogen, increasing false negative outcomes also.40

The DOR is best applied when threshold effects contribute to heterogeneity,23, 24 but is independent of the above test features. The open-ended scale may be less sensitive to small variations in its dependent components. This is exemplified by different outcomes of meta-regression and sub-group analysis of lactose load with the LBHT. While overall DOR was nonsignificantly affected by altered doses (despite a 43% change), sub-group analysis did show expected improvement in sensitivity and specificity. There was also improvement when studies with children (age ≤ 17) were excluded. This improved the DOR by about 63%. LNP children have different rates of lactase loss, depending on ethnicity and race.2 Statistical significance may have been difficult to reach because of limited number of studies available (frequent in meta-analyses).

Clinical heterogeneity was also evaluated by a summary ROC curve. The studies by Mottes et al.39 and to a lesser extent Waud et al.40 contributed most to heterogeneity.

Mottes et al. included children and adult relatives for genetic comparisons. Our calculated sensitivity was low but specificity was higher. The authors looked for other polymorphisms to account for poor outcome of LBHT with genotypes other than C/C, but did not find any controlling LP status. A number of test negative, C/C subjects were noted also, possibly reflecting adaptation as discussed above. The study by Waud et al. evaluated genetics and lactose “sensitivity”. Sensitivity was 100% for genotype C/C but sensitivity and specificity were poor for other genotypes. Moreover, measuring methane gas added to predicting symptoms, although the point may be controversial.60, 61 The explanation for altered effects in C/T and T/T genotypes is not clear, but this study defined lactose “sensitivity” as the comparator with genetics, rather than digestion, and included an unknown number of children. Also, a positive outcome of the test was defined as a rise above the nadir in the study period. Both these definitions require validation and may have had some effects on results.

The LTT also accurately predicts genotype for LP/LNP status, although it is based on few publications. These results, nevertheless, conflict with earlier findings.8

From a clinical perspective, current practices are confirmed. A 50 g lactose load is more sensitive for genotype, while a 25 g load (more physiological) indicates specific lactose sensitivity. Alternatively, failure to elicit symptoms could convince patients not to exclude dairy foods, exclusion of which may have nutritional consequences.62

The second variable affecting results is the inclusion of children. Although we used a dichotomous outcome, two of the studies39, 42 included a large number of cases. Therefore, in predicting genotype, the proportion of children in a study may sway expectations in outcome.

Interests in lactase distributions also involve the evolution of the dominant polymorphisms for LP status worldwide2, 3, 16–19, 50, 52 and relationships with disease risks.5, 63 Population migrations are a putative determinant of LP status and there is a distinct north to south decreasing distribution in Europe.2, 3, 19 Also, migrations to new worlds in the last 5–6 centuries emanated from north-western Europe, making early evolutionary events there epidemiologically relevant. Many diseases including cancers (breast, prostate, colon, ovary and others) and inflammatory bowel disease are noted to have a north to south reduction of risks (putatively related to sunshine and vitamin D).64–66 Coincidentally, these changes are also observed with lactase status5 and have now been linked, but mainly in Europe.63 This analysis may give some insight into the extent of the C/T-13910 impact on LP status and may suggest presence of other polymorphisms without availability of specific kits. As suggested,19 the site of study (defined by latitude) did affect outcome with the current kit. However, either sensitivity (biased perhaps in Europe by LBHT and microbial effects) or specificity (interference by other polymorphisms perhaps) in overall studies was affected, but not both. Frequent, alternate polymorphisms contributing to outcome might have affected both.

Limitations include a lack of power, due to paucity of reports. Intestinal diseases can cause secondary lactose maldigestion with confounding. However, only one study included a small number of patients with these diseases.42 Also analysis does not consider ageing, which may increase breath hydrogen.67 Finally, there is publication bias, but there is a dissenting opinion, whether this should be considered in meta-analyses of diagnostic tests.68 Nevertheless, these findings prompt some caution in interpretation of results. In summary, the LBHT accurately reflects LP/LNP genotype. The DOR may be a blunt instrument with genetic tests and may be more useful in detecting large population differences. Latitude as a possible co marker for genetic variation has some impact on LBHT outcome. Explanations for this effect await further developments. While in this analysis the LTT appears as accurate for predicting genotypes, it is based on limited number of small studies and requires additional verification.


We thank Dr Alan Barkun for reviewing the manuscript and also Maria Bakirtztis RT for reviewing PCR methodologies of the different studies. Declaration of personal interests: Dr Szilagyi has served as a speaker, a consultant and an advisory board member for Abbott (Quebec, Canada). Declaration of funding interests: None.