Test of the child/adolescent Rome III criteria: agreement with physician diagnosis and daily symptoms


  • Published abstract: Part of this work has previously been presented at Digestive Disease Week 2012 and was published in abstract: van Tilburg, M.A.L., Squires, M., Blois-Martin, N., Leiby, A. & Langseder, A. (2012). Validation of the Child/Adolescent Rome III criteria. Gastroenterology, 142 (5) Suppl. 1, S48.

Address for Correspondence
Miranda van Tilburg, PhD. University of North Carolina, School of Medicine, Department of Gastroenterology and Hepatology, CB 7080, Chapel Hill, NC 27599-7080, USA.
Tel: +1 919 843 0688; fax: +1 919 843 2793;
e-mail: tilburg@med.unc.edu


Background  Establishment of the Rome criteria advanced diagnosis of children with Functional Gastrointestinal Disorders. The criteria were overhauled in 2006, but these revisions were never systematically tested. The aim of the current study was to assess psychometric properties of the childhood Rome III criteria and determine how well they agree with physician diagnoses and daily symptoms.

Methods  A total of N = 135 families from two pediatric gastroenterology clinics completed the Questionnaire on Pediatric Gastrointestinal Symptoms (QPGS-RIII). Half of the families completed the QPGS-RIII again after 2 weeks; the other half completed 2-week daily diaries. Children above the age of 10 also provided data (N = 64). Physician diagnoses were obtained from the medical records.

Key Results  Diagnoses: The most common diagnoses per child/parent report were Irritable Bowel Syndrome (IBS; 43–47%) and Abdominal Migraine (26–36%). The most frequent physician diagnoses were Functional Constipation (FC; 53%) and Functional Abdominal Pain (FAP; 29%).

Reliability: Moderate to substantial agreement was found between baseline and 2-week follow-up for most diagnoses (kappa = .19–.78) and between parent and child reports (kappa = −.04–.64).

Validity: There was low agreement between QPGS-RIII and physician diagnosis (kappa = −.02–.34) as well as diaries (kappa = .06–.30).

Conclusions & Inferences  The Rome criteria have reasonable test–retest reliability and seem to be inclusive, as the majority of children obtain a diagnosis. However, validity is still an issue: The Rome criteria do not overlap well with physician diagnosis or daily symptoms. These issues will need to be addressed in future revisions of the Rome criteria.


Keywords: functional abdominal pain, functional constipation, functional dyspepsia, functional gastrointestinal disorder, irritable bowel syndrome


Functional gastrointestinal disorders in children are common. Approximately 8–9% of children suffer from chronic abdominal pain1 or constipation,2 and in more than 90% of these patients, no structural or biochemical abnormality can be found.3,4 These patients are considered to suffer from a functional gastrointestinal disorder (FGID). Historically, children with unexplained abdominal pain were diagnosed with Apley’s Recurrent Abdominal Pain,4 and functional constipation (FC) was diagnosed by the Ohio Criteria.5 These diagnostic categories were broad and presented several problems. Establishment of symptom-based criteria for pediatric FGIDs by the Rome committee in 1999 considerably advanced the diagnosis and study of children who suffer from FGIDs.

The Rome Criteria have been helpful in distinguishing subgroups of children with FGIDs, and a handful of studies have shown initial validation of these criteria in a pediatric population.6–9 However, they have not been spared criticism, and in 2006 the Rome criteria were overhauled.10 Several changes were made in Rome III, but some of the most important changes addressed the criticism that the previous Rome criteria were too narrow.11 These changes appear to be effective. Three independent studies have found that approximately 87% of children with FGID meet at least one of the Rome III criteria,12–14 which is a 20–50% increase compared with the Rome II criteria.13 Thus, compared to Rome II, the Rome III criteria seem to be more inclusive.

Despite this success, the reliability and validity of the Rome III criteria have not been systematically assessed. One study found that in the case of IBS, parental and child reports do not agree.15 In addition, agreement between pediatric gastroenterologists on the Rome criteria is only fair to moderate.16 The aim of the current study was to further validate the child/adolescent Rome III criteria in a population of children with functional gastrointestinal disorders. The study extends previous studies on the Rome III criteria by systematically studying both validity and reliability. We assess test–retest reliability, a form of reliability testing that has not been previously reported in the literature, and determine concordance between parent and child report. We also validate the questionnaire against daily diaries and a physician diagnosis.

Materials and methods


We recruited consecutive patients between 4 and 18 years of age, together with their primary caregivers, who presented at the pediatric gastroenterology clinics at UNC Hospitals, NC, and Goryeb Children’s Hospital, NJ, between September 2009 and April 2011. All patients presented with known or suspected functional gastrointestinal symptoms; both new and returning patients were included.

Sample size was determined by using the previously reported prevalence rates for the three most common Rome II diagnoses (IBS 22.0%, Functional Constipation 19.2%, and Dyspepsia, 16.5%7) and an average of 44% agreement rate between respondents (physician, child, caregivers).6 A sample size of 63 subjects achieves 80% power at α = 0.01 for kappa statistics, to establish agreement between diagnosis for these three major categories. As only children aged 10 and above completed the questionnaires, we oversampled to obtain enough child data.

Study design

Families were approached about the study while in the clinic for their regularly scheduled appointments. All families who were interested in participating were consented and completed questionnaires during their visit. Caregivers completed questionnaires as well as children aged 10 and above. Half the families were randomly asked to complete the same questionnaire 2 weeks later while the other half were asked to complete daily diaries for 2 weeks. The diaries were given to the family in the clinic to be completed at home and the follow-up questionnaires were mailed to the family within 2 weeks of their original visit. The study was approved by the UNC and Goryeb Children’s hospital Institutional Review Boards, and consent was obtained from the parents as well as assent from children aged 10 and above.


Rome diagnoses  The Questionnaire on Pediatric Gastrointestinal Symptoms Rome III version (QPGS-RIII)10 assesses the symptom-based criteria for functional gastrointestinal disorders in children as defined by the Rome III criteria and is adapted from the Questionnaire on Pediatric Gastrointestinal Symptoms (QPGS)7,8 that measures Rome II criteria. Although no validation of the QPGS-RIII exists, there are data on the psychometric properties of its predecessor, the QPGS. The QPGS possesses reasonable test–retest reliability with coefficients of 0.70 for 57% of the items.7 Concordance between parent and child was fair to good with Kappa interclass correlations from 0.4 to 0.7 on most items.6 Factor analyses confirmed the symptom groupings for abdominal pain-related disorders in the QPGS proposed by Rome II.7

Daily diaries  Two-week daily diaries of children’s symptoms were completed by all caregivers. For children aged 10 and above, both parent and child completed a diary. We used an existing diary that has been used in previous studies with parents and children aged seven and older.5,17 Three times a day, covering the morning, afternoon, and evening periods, children and their caregivers were asked to rate abdominal pain duration (in hours/minutes), pain intensity (10-point scale), pain interference with activities (four-point scale), and pain location (on a diagram showing four abdominal quadrants as well as two separate locations, one around the belly button and one over the stomach). Pain ratings were averaged for each day. Subjects were also asked to record whether pain improved with a bowel movement (yes/no). In addition, families recorded information about the child’s bowel movements, including: (i) Stool consistency (hard, lumpy, smooth, mushy, watery) as measured by the Bristol stool scale18; (ii) Number of bowel movements in underwear; (iii) Number of bowel movements in the toilet; (iv) Withholding of bowel movement (yes/no); and (v) Passing of stools that obstruct the toilet (yes/no).

Data from the diaries were inspected for compliance with Rome III criteria for:

  •  IBS: Abdominal pain at least once per week and at least one episode of pain over the 2 weeks with improvement after defecation and/or change in stool frequency/form.
  •  Functional dyspepsia (FD): Pain at least once per week above belly button, no relief with defecation, no change in stools with pain.
  •  FAP: Pain at least once per week, and no diagnosis of IBS/FD.
  •  FC: At least two of the following symptoms per week: Two or fewer defecations, one or more episodes of fecal incontinence, withholding of stools, hard bowel movements, stools that obstruct the toilet.
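The diary rules listed above can be expressed as a small decision procedure. The following Python sketch is illustrative only and is not the scoring code used in the study: the `DiaryWeek` structure, its field names, and the reading of "per week" as "in every diary week" are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class DiaryWeek:
    """One week of diary entries, already summarised (hypothetical structure)."""
    pain_days: int                 # days with any abdominal pain
    upper_pain_days: int           # days with pain above the belly button
    pain_linked_to_stool: bool     # pain relieved by defecation and/or
                                   # accompanied by a change in stool frequency/form
    toilet_stools: int             # bowel movements in the toilet
    incontinence_episodes: int     # bowel movements in underwear
    withholding: bool
    hard_stools: bool
    toilet_obstructing_stools: bool


def constipation_symptoms(week: DiaryWeek) -> int:
    """Count how many of the five FC symptoms are present in one week."""
    return sum([
        week.toilet_stools <= 2,
        week.incontinence_episodes >= 1,
        week.withholding,
        week.hard_stools,
        week.toilet_obstructing_stools,
    ])


def classify(weeks: List[DiaryWeek]) -> Set[str]:
    """Apply the diary rules for IBS, FD, FAP, and FC to the diary weeks."""
    if not weeks:
        raise ValueError("at least one diary week is required")

    # Assumption: "at least once per week" means in every recorded week.
    pain_weekly = all(w.pain_days >= 1 for w in weeks)
    upper_pain_weekly = all(w.upper_pain_days >= 1 for w in weeks)
    # One stool-linked pain episode anywhere in the diary period is enough.
    stool_link = any(w.pain_linked_to_stool for w in weeks)

    diagnoses: Set[str] = set()
    if pain_weekly and stool_link:
        diagnoses.add("IBS")
    elif upper_pain_weekly and not stool_link:
        diagnoses.add("FD")
    elif pain_weekly:
        diagnoses.add("FAP")  # weekly pain without an IBS or FD diagnosis

    if all(constipation_symptoms(w) >= 2 for w in weeks):
        diagnoses.add("FC")
    return diagnoses
```

In this sketch the pain-related categories are mutually exclusive, mirroring the rule that FAP applies only when IBS and FD do not, while FC is evaluated independently and may co-occur with any of them.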

Physician diagnosis  At least 3 months after study enrollment, physician diagnoses were retrieved from the medical records. This allowed time to obtain results from possible medical tests to rule out organic disorders in new patients. All participating physicians were asked to double check the medical record diagnosis. They were provided with the Rome criteria for FGIDs, but did not have access to the questionnaire data.

Data analyses

Test–retest reliability of the Rome III diagnoses was obtained by calculating the agreement rate (Kappa) between Rome diagnoses at baseline and 2-week follow-up. Separate analyses were run for children and caregivers. As per Landis and Koch,19 agreement is identified as: poor (k < 0.00); slight (k = 0.00–0.20); fair (k = 0.21–0.40); moderate (k = 0.41–0.60); substantial (k = 0.61–0.80); or almost perfect (k = 0.81–1.00).
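For readers unfamiliar with the statistic, Cohen's kappa for a single yes/no diagnosis is (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance from each rater's marginal rates. The sketch below is a minimal illustration, not the analysis code of the study; the function names and the example ratings are hypothetical.

```python
import math
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) ratings of the same subjects."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")

    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    rate_a = sum(rater_a) / n
    rate_b = sum(rater_b) / n
    # Chance agreement: both say "yes" or both say "no".
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if math.isclose(expected, 1.0):
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


def landis_koch(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch agreement label."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical example: diagnosis met (1) or not met (0) at baseline and retest.
baseline = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
retest = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
k = cohen_kappa(baseline, retest)
print(round(k, 2), landis_koch(k))
```

With these made-up ratings, 8 of 10 pairs agree, chance agreement is 0.52, and kappa is about 0.58, which falls in the "moderate" band.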

Kappa statistics were also used to test the agreement between caregivers and children (age 10 and above) on the QPGS-RIII as well as between QPGS-RIII and physician diagnosis. In cases where discrepancy exists between parent and child, paired t-tests were run to determine which individual items account for the discrepancy.
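As a reminder of what the item-level comparison involves, a paired t-test works on the within-family differences between the child's and the parent's rating of the same item. The sketch below computes only the t statistic and degrees of freedom using the standard library; the function name and the item scores are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(parent: Sequence[float], child: Sequence[float]) -> Tuple[float, int]:
    """Return the paired t statistic and degrees of freedom (child minus parent)."""
    if len(parent) != len(child) or len(parent) < 2:
        raise ValueError("need at least two matched parent-child pairs")

    diffs = [c - p for p, c in zip(parent, child)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical item scores for six parent-child pairs.
parent_scores = [1, 1, 2, 1, 1, 2]
child_scores = [2, 3, 2, 3, 2, 3]
t_stat, df = paired_t(parent_scores, child_scores)
print(round(t_stat, 2), df)
```

A P value would then be read from the t distribution with `df` degrees of freedom, for example with a statistics package; that step is left out here to keep the sketch dependency-free.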


Results

Subject recruitment and enrollment

A total of N = 135 families were recruited (N = 116 from NC and N = 19 from NJ), of whom N = 64 children were above the age of 10. A total of 12 physicians and two nurse practitioners referred the patients to the study. As an organic cause for the symptoms was found in N = 17 patients, the final sample consisted of N = 118 families (88.1% in which mothers were the respondents, 60.4% girls, Mean child age = 10.6; SD = 4.2). Follow-up questionnaires were completed by N = 40 families (67.8% return rate) and N = 36 families completed daily diaries (61.0% return rate).

Prevalence of Rome diagnoses

The Rome categories as per child/parent QPGS-RIII and physician are presented in Table 1. Children met Rome criteria for at least one FGID according to 90.1% of parent and 82.3% of child QPGS-RIII reports.

Table 1. Rome categories by respondent

| Rome category | Parent, n (%) | Child, n (%) | Physician, n (%) |
|---|---|---|---|
| Functional dyspepsia | 16 (13.6%) | 3 (5.3%) | 6 (5.1%) |
| Irritable bowel syndrome | 51 (43.2%) | 26 (45.6%) | 22 (18.6%) |
| Abdominal migraine | 43 (36.4%) | 15 (26.3%) | 1 (0.8%) |
| Functional abdominal pain | 5 (4.2%) | 3 (5.3%) | 34 (28.8%) |
| Functional abdominal pain syndrome | 1 (0.8%) | 0 | 0 |
| Functional constipation | 31 (26.3%) | 9 (15.8%) | 63 (53.4%) |
| Non-retentive fecal incontinence | 1 (0.8%) | 0 | 0 |
| Aerophagia | 17 (14.4%) | 11 (19.3%) | 0 |
| Cyclic vomiting | 1 (0.8%) | 3 (5.3%) | 0 |
| Rumination | 1 (0.8%) | 2 (3.5%) | 0 |

A minority qualified for one QPGS-RIII diagnosis (19.3% per child report, 11.9% per parent report), about 40% qualified for two diagnoses, and 30% for three diagnoses. The most common overlap in QPGS-RIII diagnosis was between IBS and Abdominal Migraine. More than half of patients who met criteria for Abdominal Migraine also met criteria for IBS based on both parent (53.5%) and child report (53.3%). Physicians rarely gave more than one diagnosis (6.7% of cases) and the most common overlap was between FAP and FC (N = 7).

Given the low occurrence of some Rome categories, all subsequent analyses were performed only for those diagnoses identified in at least 10% of patients. These include FD, IBS, Abdominal Migraine, FAP, FC, and Aerophagia.
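The selection can be checked against the percentages in Table 1. The text does not state which respondent's rate the 10% threshold refers to; the sketch below assumes a category is retained when at least one respondent (parent, child, or physician) reaches 10%, an assumption that happens to reproduce the six categories listed. The variable names are hypothetical.

```python
# Percentages from Table 1 as (parent, child, physician).
table1 = {
    "Functional dyspepsia": (13.6, 5.3, 5.1),
    "Irritable bowel syndrome": (43.2, 45.6, 18.6),
    "Abdominal migraine": (36.4, 26.3, 0.8),
    "Functional abdominal pain": (4.2, 5.3, 28.8),
    "Functional abdominal pain syndrome": (0.8, 0.0, 0.0),
    "Functional constipation": (26.3, 15.8, 53.4),
    "Non-retentive fecal incontinence": (0.8, 0.0, 0.0),
    "Aerophagia": (14.4, 19.3, 0.0),
    "Cyclic vomiting": (0.8, 5.3, 0.0),
    "Rumination": (0.8, 3.5, 0.0),
}

THRESHOLD = 10.0  # percent

# Assumed rule: keep a category if any respondent reports it in >= 10% of cases.
retained = [name for name, rates in table1.items() if max(rates) >= THRESHOLD]
print(retained)
```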

Test–retest of the QPGS-RIII

To establish test–retest reliability of the QPGS-RIII, Kappa was computed to examine the stability of the Rome criteria between baseline and 2-week follow-up. A total of N = 40 parents and N = 18 children completed the QPGS-RIII a second time. Table 2 illustrates agreement between QPGS-RIII at baseline and retest. Moderate to substantial agreement was found for meeting criteria for FD, Abdominal Migraine, and FC based on parent reports. For child report, substantial agreement was found for almost all categories (except Aerophagia). Given the low number of child follow-up questionnaires, these data should be interpreted with caution.

Table 2. Test–retest of Rome III questionnaire (QPGS-RIII)

| Rome category | Parent (N = 40): # diagnoses baseline/follow-up | Child (N = 18): # diagnoses baseline/follow-up |
|---|---|---|
| Functional dyspepsia | 7/6 | 2/1 |
| Irritable bowel syndrome | 16/18 | 8/10 |
| Abdominal migraine | 14/17 | 3/2 |
| Functional abdominal pain | 1/2 | 0/1 |
| Functional constipation | 11/7 | 2/3 |
| Aerophagia | 6/7 | 2/4 |

Note: Given the low number of cases in which test–retest data are available on children, these results should be considered preliminary.

Agreement between child-parent-physician diagnoses

As can be seen from Table 3, there was little agreement between QPGS-RIII and physician diagnosis. To find the reason for the discrepancy, we first determined if physician knowledge of Rome led to better agreement. Self-rated physician knowledge of Rome was high in three physicians, nine were somewhat familiar, and none of the physicians rated their knowledge of Rome as low. Agreement of physician diagnosis with QPGS-RIII among physicians who were very familiar with Rome ranged from 0 to 0.3 (kappa) compared with 0 to 0.5 (kappa) for clinicians who were somewhat familiar with Rome. Next, we determined what QPGS-RIII criteria were met for the two most common physician diagnoses. For physician diagnosis of FAP, only one child met criteria for FAP by QPGS-RIII parent report. Instead, these patients most commonly qualified for Rome criteria of IBS (52.9% for parents and 57.1% for children report), Abdominal Migraine (44.1% parent and 28.6% children report), and FD (29.4% parents and 7.1% child report). For a physician diagnosis of FC, about one-third of patients also met the Rome criteria for FC (42.9% parents and 35% child report), but these patients were just as likely to meet Rome criteria for IBS (31.7% parents, 40% children report), or Abdominal Migraine (44.4% parents, 31.7% children report).

Table 3. Agreement between Rome criteria

| Rome category | Parent–child kappa | Parent–physician kappa | Child–physician kappa |
|---|---|---|---|
| Functional dyspepsia | .64 | .02 | −.06 |
| Irritable bowel syndrome | .44 | .16 | .03 |
| Abdominal migraine | .58 | * | * |
| Functional abdominal pain | −.04 | −.02 | −.10 |
| Functional constipation | .34 | .34 | .34 |

*Diagnosis not made by physician.

Agreement between parent and child fared better. It was substantial for FD, moderate for IBS and Abdominal Migraine, and fair for FC (see Table 3). On the symptom level, paired t-tests did not show significant differences in reports of pain on the QPGS-RIII between parent and child. Significant differences in stooling consistency (poops softer with pain M = 1.3 parent vs M = 2.5 child, P < .001; poop harder with pain M = 1.5 parent vs M = 2.2 child, P < .005) and pattern (more poops with pain M = 1.3 parent vs M = 2.5 child, P < .001; fewer poops with pain M = 1.3 parent, M = 2.0 child, P < .05) were found for items associated with upper belly pain, but not with lower belly pain.

Agreement between QPGS-RIII and daily diaries

Only parental diaries were used as we had a very small number of child diaries. Abdominal Migraine cannot be determined from the diaries. Agreement on the other Rome categories between diaries and child/parent/physician is reported in Table 4. Kappa value was fair between diary and parent report for FD only.

Table 4. Agreement between Rome criteria per diary and questionnaire

| Rome category | Parent–diary kappa | Physician–diary kappa |
|---|---|---|
| Functional dyspepsia | .30 | * |
| Irritable bowel syndrome | .19 | .05 |
| Functional abdominal pain | .06 | −.11 |
| Functional constipation | −.20 | −.05 |

*Diagnosis not made by at least one method.


Discussion

The Rome criteria in children have been widely adopted in research, but validation has been sparse. In addition, adoption by clinicians has been slow as they perceive the criteria to be cumbersome and of limited usefulness to treatment.16,20 The changes made in Rome III may have ameliorated this situation, but validation efforts are needed.

The current findings suggest that the Rome III questionnaire has reasonable reliability. Test–retest reliability appeared to be fair to substantial for the most common diagnoses, and agreement between diagnoses based on parental and child report of symptoms was reasonable. Parents were least aware of their child’s bowel symptoms, but overall, parents had fairly good knowledge of their child’s symptoms. More studies are needed, especially with larger data for child reports, but initial results seem encouraging.

However, the Rome III criteria did not fare very well when compared to physician diagnoses. Physicians appeared to use fewer categories for diagnoses than those obtained from the Rome III questionnaires. For example, high rates of both Abdominal Migraine and Aerophagia were found based on Rome, but none of the physicians used these diagnostic categories. Interestingly, these Rome diagnoses almost always overlapped with another Rome diagnosis, most commonly IBS. Other studies have found an increase in reports of Aerophagia and Abdominal Migraine with Rome III as well and noted the overlap with IBS.12,13,21 The symptoms of these conditions largely overlap (intense pain, bloating), and the fact that none of the physicians diagnosed these conditions suggests either that most patients may have wrongly received a Rome III diagnosis of Abdominal Migraine or Aerophagia, or that the physicians lump symptoms of patients into broader categories and prescribe a limited number of treatments. A re-evaluation of the Rome criteria and/or the Rome III questionnaire may be needed.

Another diagnosis with large discrepancies between physician and the Rome criteria was FAP. Functional Abdominal Pain was among the most common diagnoses made by physicians, but was rarely found on the basis of the Rome III questionnaire. In fact, the majority of these patients met the Rome III criteria of IBS, FD, or Abdominal Migraine. This suggests that, despite knowledge and availability of Rome criteria, physicians may not use them consistently. It has been reported that the majority of pediatric gastroenterologists know about the Rome criteria, but only 39% use them.20 Physicians in our study were all at least somewhat familiar with the Rome criteria. Agreement between parents/children and physician did not improve with greater knowledge of Rome. Physicians may use FAP as a ‘catch all’ including many other abdominal pain-related FGIDs, as suggested by the definition of the American Academy of Pediatrics and NASPGHAN.7 Previous studies have found better agreement for Rome II criteria6 and there may be some reasons why our study yielded lower numbers. Firstly, our study included a very low number of FAP patients and this may reflect our patient population. Secondly, in our study, physicians were given the Rome criteria, but were not required to use the Rome classification system or diagnostic criteria. However, other studies have found that even if physicians are asked to examine the same case, they vary widely in application of the Rome III criteria.16 This suggests that even though the Rome criteria are very well defined, there is a lot of room for clinician interpretation.

A combination of pain and constipation led to the most discrepancies in diagnoses between children and parents/physician. First, parents of children who complained of pain above the belly button were less aware of their child’s bowel symptoms, possibly overlooking the role of constipation in their child’s symptoms. Second, physicians in our study preferred to give a diagnosis of FC over a diagnosis of IBS in children with comorbid pain and hard stools. It has been reported that constipation is seen by many pediatric gastroenterologists as an ‘organic’ cause of abdominal pain,20 which by definition excludes a ‘functional’ disorder such as IBS. Personal communication with physicians in our study corroborated this finding: a primary complaint of constipation with a secondary complaint of pain was thought to be FC; the pain was assumed to be a direct effect of the constipation and expected to disappear with proper treatment. This is a reasonable assumption as it has been found that 75% of children with a Rome III diagnosis of FC report abdominal pain.14 Furthermore, in a study among 1100 adults with constipation, the Rome criteria did not clearly distinguish IBS with constipation (IBS-C) from FC22: Almost 90% of IBS-C cases also qualified for FC and a little less than half of FC patients qualified for IBS-C. Not surprisingly, it was common for patients to change diagnostic category within a 12-month period. Clearly, these are nuances in diagnosis that need to be clarified within Rome.

Even if Rome III does not align with physician diagnosis, this does not necessarily mean validity is lacking. Physicians may choose to ignore symptoms they regard as unimportant, or include other symptoms in their diagnosis. In addition, they may have based their diagnosis on additional data such as abdominal X-rays showing large impaction. Thus, physicians may have access to more information than was elicited with the Rome III questionnaire. Unfortunately, there is no ‘gold-standard’ against which to validate the Rome III criteria. Therefore, we evaluated not only how the Rome criteria fared against physician diagnosis but also against daily diaries. Agreement between daily diaries and the Rome III questionnaire was fair for IBS and FD, but poor for FAP and FC. Poor agreement between Rome III criteria derived from diaries and questionnaires has been found previously.15 These authors suggested that the poor agreement is due to parents comparing their child’s daily symptoms not with symptoms on previous days, but against a ‘norm’ for appropriate stooling behaviors. Obviously, this would be a problem with any form of recall and more ingenious measurements may have to be developed.

Our study has several limitations. First, the data were collected only among patients in a tertiary care center, limiting generalizability to primary care. Secondly, although we had a fairly sizeable sample, the number of children above the age of 10 was limited. For some diagnoses (Functional Dyspepsia and FAP) only three children met Rome criteria (see Table 1). Any findings based on such a small group should be interpreted with great caution. Replication in a larger sample of children is clearly needed. In addition, some disorders were uncommon in this population. Therefore, we excluded diagnoses made in <10% of cases from further analyses. Still, small numbers remain and this may influence our measure of agreement. The kappa coefficient is sensitive to the prevalence of the disease in a population.23,24 Kappa measures the level of agreement beyond chance that two raters have the same outcome. If the prevalence is high, the chance that two people agree is high and thus kappa will be reduced. The opposite may be true as well. Thus, caution is warranted when directly comparing kappa values of disorders with very different prevalence. Thirdly, although the physicians were instructed on the use of Rome criteria, we cannot ascertain from our current study if they chose to follow the criteria. In addition, none of these physicians were experts on the treatment of FGID in children, but they all were at least somewhat familiar with the Rome criteria. The fact that physicians knew about the Rome criteria, but were not required to use them, reflects how the Rome criteria are applied in real life situations outside of clinical research and thus adds to the generalizability of our results.
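The prevalence effect on kappa can be made concrete with two invented 2 × 2 agreement tables in which the raters agree on 90% of cases, but the diagnosis is present in about half of the cases in one table and in about 90% in the other. The counts and function name below are hypothetical and chosen only to illustrate the point.

```python
def kappa_from_table(both_yes: int, a_only: int, b_only: int, both_no: int) -> float:
    """Cohen's kappa from the four cells of a 2 x 2 agreement table."""
    n = both_yes + a_only + b_only + both_no
    if n == 0:
        raise ValueError("table must contain at least one case")

    observed = (both_yes + both_no) / n
    rate_a = (both_yes + a_only) / n   # proportion rated "yes" by rater A
    rate_b = (both_yes + b_only) / n   # proportion rated "yes" by rater B
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is total")
    return (observed - expected) / (1 - expected)


# Same observed agreement (90 of 100 cases), different prevalence.
balanced = kappa_from_table(both_yes=45, a_only=5, b_only=5, both_no=45)
high_prevalence = kappa_from_table(both_yes=85, a_only=5, b_only=5, both_no=5)
print(round(balanced, 2), round(high_prevalence, 2))
```

In the balanced table chance agreement is 0.50 and kappa is 0.80; in the high-prevalence table chance agreement rises to 0.82 and kappa drops to about 0.44, even though the raters disagree on the same number of cases.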

The Rome Foundation has begun planning for Rome IV. The current findings can help the committee in thinking through the complexities and challenges of the Rome criteria. The changes made from Rome II to Rome III have been very positive, making the Rome criteria more inclusive: the majority of patients in the current study received a Rome diagnosis based on the Rome III questionnaire (90% based on parent report of symptoms, more than 80% by child report). However, the main challenge for Rome seems to be to make it clinically relevant. We are in need of data suggesting how we can improve the use of Rome among physicians. Based on our current results, some questions that need to be answered may be: Are the multiple categories needed for clinical care and diagnoses and how can they be consistently applied? How do the various categories relate to clinical care and treatment of FGID? Clarification is in particular needed for the overlap between pain and constipation. This combination is very common, but different informants (child, parent, or physician) may emphasize one symptom above the others, leading to confusion about the proper diagnosis. Is this child suffering from FAP, IBS, or FC? And does it make a difference in our treatment approach? In addition, the criteria for Abdominal Migraine and Aerophagia may need to be re-evaluated. These conditions appear to be over-reported in Rome III because of symptomatic overlap with IBS and FD.

In conclusion, the Rome III criteria have at least moderate reliability in characterizing symptoms, but do not validly reflect the diagnosis of the physician. Some of the suggestions above may ameliorate this situation.


This study was supported by a Rome Foundation Research Award.

Conflict of Interest

The authors have no competing interests.

Author contribution

MVT had primary responsibility for all aspects of the study including: obtaining grant support, study design, data collection, data analysis, and writing of the manuscript; MS collected the data in NC, coordinated interaction between data collection sites, and provided input on writing of the manuscript; NB provided input to study design, recruitment, and data collection as well as interpretation of results; AL had primary responsibility for the study in NJ; she also assisted in manuscript writing and interpretation of results; AL collected the data in NJ and provided input on writing of the manuscript.