Toward a valid definition of gout flare: Results of consensus exercises using Delphi methodology and cognitive mapping


  • W. J. Taylor,

    Corresponding author
    1. University of Otago and Wellington Regional Rheumatology Unit, Hutt Valley District Health Board, Wellington, New Zealand
    • Department of Medicine, Wellington School of Medicine and Health Sciences, University of Otago, PO Box 7343, Wellington 6242, New Zealand
    • Dr. Taylor has received consulting fees (less than $10,000) from Xoma.

  • R. Shewchuk,

    1. University of Alabama at Birmingham
  • K. G. Saag,

    1. University of Alabama at Birmingham
    • Dr. Saag has received consulting fees (less than $10,000 each) from TAP, Takeda, Savient, and Novartis.

  • H. R. Schumacher Jr.,

    1. University of Pennsylvania School of Medicine and Veterans Affairs Medical Center, Philadelphia
    • Dr. Schumacher has received consulting fees (less than $10,000 each) from TAP, Takeda, Savient, Pfizer, Merck, Regeneron, Xoma, and Ipsen.

  • J. A. Singh,

    1. Minneapolis VA Medical Center and University of Minnesota, Minneapolis
    • Dr. Singh is recipient of investigator-initiated research grants from TAP and Savient for a different project.

  • R. Grainger,

    1. Malaghan Institute of Medical Research and Wellington Regional Rheumatology Unit, Hutt Valley District Health Board, Wellington, New Zealand
    • Dr. Grainger has received honoraria (less than $10,000) from Merck Sharp & Dohme.

  • N. L. Edwards,

    1. University of Florida, Gainesville
    • Dr. Edwards has received consultancies, speaking fees, and/or honoraria (less than $10,000) from Regeneron and (more than $10,000 each) from Savient and Takeda.

  • T. Bardin,

    1. Hôpital Lariboisière, Paris, France
  • R. W. Waltrip,

    1. Savient Pharmaceuticals, East Brunswick, NJ
    • Dr. Waltrip is a stockholder (less than $10,000) in Savient.

  • L. S. Simon,

    1. Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts
    • Dr. Simon is a member of the Board of Directors of Savient, is a paid regulatory consultant for Leerink Swann Investment Bank, and has received consulting fees (less than $10,000 each) from AAI, Affinergy, AstraZeneca, Abraxis, Alpha Rx, Nuvo/Dimethaid, Roche, Pfizer, Novartis, PLx, Hisamitsu, LAB, Dr. Reddy's, Biosense, Avanir, Cerimon, Alimera, Nomura, Luxor, Parexel, Nitec, Bayer, CombinatoRx, Rigel, Chelsea, Regeneron, Genelabs, Cypress, SNBL, SkyePharma, Procter and Gamble, Savient, EyeGate, NicOx, Fidelity, BioCryst, Extera, Wyeth, Anesiva, Solace, Puretech Ventures, Puretech Development, White Mountain, TAP, Abbott, Cell Therapeutics, Ometor, Jazz, Schwarz, ProEthic, Takeda, Teva, Zydus, Proprius, Savient, Alder, Cure, Cellegy, ChemoCentryx, McKesson, DiObex, Sepracor, Purdue, Serono, Coley, MedImmune, Altea, Neuromed, Polymerix, Talagen, Tigenix, Millennium, IDM, Antigenics, GPC Biotech, Forest, Genzyme, Acusphere, and CaloSyn.

  • R. Burgos-Vargas

    1. Hospital General de México, Universidad Nacional Autónoma de México, México City, México
    • Dr. Burgos-Vargas has received consultancies, speaking fees, and/or honoraria (less than $10,000 each) from Abbott, Roche, Schering Plough, and Wyeth.



To identify, in people known to have gout, the testable, key components of a standard definition of gout flare for use in clinical research.


Consensus methodology was used to identify key elements of a gout flare. Two Delphi exercises were conducted among different groups of rheumatologists. A cognitive mapping technique among 9 gout experts with hierarchical cluster analysis provided a framework to guide the panel discussion, which identified the final set of items that should be tested empirically.


From the Delphi exercises, 21 items were presented to the expert panel. Cluster analysis and multidimensional scaling showed that these items clustered into 5 concepts (joint inflammation, severity of symptoms, stereotypical nature, pain, and gout archetype) distributed along 2 dimensions (objective to subjective features and general features to specific features of gout). Using this analysis, expert panel discussion generated a short list of potential features: joint swelling, joint tenderness, joint warmth, severity of pain, patient global assessment, time to maximum pain, time to complete resolution of pain, an acute-phase marker, and functional impact of the episode.


A short list of features has been identified and now requires validation against a patient- and physician-defined gout flare in order to determine the best combination of features.


There has been renewed interest in the treatment of gout with recently reported randomized controlled trials (RCTs) of new agents, including etoricoxib (1), lumiracoxib (2), febuxostat (3), and PEGylated-uricase (4). These studies have highlighted the importance of a gout flare as an indicator of disease activity. Gout flares have been considered in these studies in various ways: as entry criteria for studies of acute gout, as adverse effects of urate-lowering therapy in the early phase of treatment, and as a marker of benefit of therapy in chronic gout. However, the operational definition of a gout flare has not yet been properly validated, and largely has been defined ad hoc in many of the reported studies. These RCT studies appear to rely upon patient report or individual clinician-investigators to identify when a flare has occurred or when a flare requires treatment, without explicit criteria or descriptions of features that constitute a flare (3, 5).

Despite the importance of flares as a critical outcome measure in gout, the definitions of flare used in 10 recent studies of gout flaring are all different, limiting the comparability of results (1–3, 5–11). The plurality of flare definitions creates a body of literature that limits the ability of the clinician to make a sensible selection of treatment, and hinders clinical researchers in designing future trials. Furthermore, the modification of criteria from one study to the next suggests that the existing methodologies for flare definition are inadequate.

Therefore, we sought to condense and refine the various elements of flare definition into one that may be consistently used across different clinical research studies. As a first step, this study reports the results of the consensus exercises, which used Delphi methodology and cognitive mapping to identify a short list of potential elements that may constitute gout flare criteria. It is important to emphasize that such a list of elements is explicitly designed to be applied to patients already known to have gout. In other words, the elements may not necessarily be required to have high diagnostic power in discriminating between gout and nongout in people with undifferentiated acute arthritis presenting for the first time. A subsequent study was planned to empirically determine the accuracy of candidate definitions of flare composed of the elements that were identified from this work.


The overall approach for determining a valid definition of flare in people known to have gout is illustrated in Figure 1. Briefly, the first stage involved identification of potential elements of a flare definition that were then refined into a more manageable number using face-to-face discussion that was structured by a cognitive mapping exercise. The second stage (not reported in this article) involved testing combinations of identified elements in a cohort of patients with gout during routine clinical care.

Figure 1.

Flow chart of the study. RCT = randomized controlled trial.

Delphi survey.

Delphi consensus methodology is a structured approach to identify areas of agreement or disagreement among a panel of experts concerning questions to which little or no empirical data can conclusively guide decision making. It is characterized by anonymous rating of agreement or disagreement with the proposition of interest, followed by feedback of the average group response and spread of responses to the participants. The participants are then invited to resubmit their opinion in light of the group response, and the process is continued iteratively until no further increase in agreement is apparent. The same individuals participate in each round.

Rheumatologists with an interest in gout from the Outcome Measures in Rheumatology Clinical Trials (OMERACT) gout special interest group and the New Zealand Rheumatology Association were invited by e-mail to participate in a Delphi consensus survey (12). Thirty-five rheumatologists initially agreed to participate, 10 from the New Zealand Rheumatology Association and 25 from the OMERACT gout special interest group (20 from the US and 1 each from Thailand, France, the UK, and Spain). The first Delphi round requested that participants nominate items that they considered relevant in a free-text format. Subsequent rounds requested that each nominated item be rated for appropriateness on a scale of 1 (highly inappropriate) to 9 (highly appropriate). Items for which there was agreement concerning appropriateness (median >6) or inappropriateness (median <4) were not rated again in subsequent rounds. The scoring rounds were discontinued when there was agreement for each item. Agreement was defined using the disagreement index of the Research and Development/University of California at Los Angeles (RAND/UCLA) Appropriateness Method (less than unity indicates agreement) (13). The index is calculated from the interpercentile range of the appropriateness ratings (30th to 70th percentiles), adjusted by a factor derived from experimental comparisons of the index with agreement patterns observed in panel decision making.
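The agreement rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the commonly cited form of the RAND/UCLA disagreement index, in which the range between the 30th and 70th percentiles is divided by a range adjusted for asymmetry, and it assumes the constants 2.35 and 1.5 from the RAND/UCLA user's manual. The function names and the percentile interpolation rule are hypothetical choices made for this example.

```python
from statistics import median


def percentile(values, p):
    """Percentile (p in 0..100) by linear interpolation between order statistics."""
    xs = sorted(values)
    if len(xs) == 1:
        return float(xs[0])
    pos = (len(xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def disagreement_index(ratings, scale_mid=5.0, ipr_r=2.35, cfa=1.5):
    """Ratio of the 30th-70th percentile range to its asymmetry-adjusted limit.

    ipr_r and cfa are ASSUMED constants (taken from the RAND/UCLA manual,
    not from this article); scale_mid is the midpoint of the 1-9 scale.
    """
    p30 = percentile(ratings, 30)
    p70 = percentile(ratings, 70)
    ipr = p70 - p30                      # interpercentile range
    central_point = (p30 + p70) / 2      # where the bulk of ratings sits
    asymmetry = abs(scale_mid - central_point)
    return ipr / (ipr_r + cfa * asymmetry)


def classify(ratings):
    """Apply the first-survey stopping rule to one item's 1-9 ratings."""
    if disagreement_index(ratings) < 1:  # less than unity indicates agreement
        med = median(ratings)
        if med > 6:
            return "appropriate"
        if med < 4:
            return "inappropriate"
    return "re-rate"


if __name__ == "__main__":
    print(classify([8, 8, 9, 7, 8, 9, 8]))     # hypothetical consensus item
    print(classify([1, 1, 1, 2, 8, 9, 9, 9]))  # hypothetical polarized item
```

Items returned as "re-rate" would go forward to the next scoring round; the ratings shown are invented for illustration only.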

A second Delphi survey was conducted among individuals identified from the author list of published articles relating to “gout” found via a Medline search and a search of the abstracts from the European League Against Rheumatism and American College of Rheumatology (ACR) 2005 and 2006 scientific meetings. Authors for whom a valid e-mail address could be identified were invited to participate in a Delphi survey using an online Web-based survey format (n = 154, with 26 respondents). The items identified from the previous Delphi exercise were used, and participants were asked to rate each item for its need to be included in a definition of gout flare on a rating scale from 1 (extremely necessary) to 7 (extremely unnecessary). It was made clear that these items referred only to people for whom a diagnosis of gout was already confirmed, and participants had to endorse this notion before proceeding through the survey. Items for which the median rating was 4 (neutral) or for which there was disagreement (RAND/UCLA Appropriateness Method of more than unity) were rerated in subsequent iterations.


Participants of the second Delphi survey were also asked to present a short questionnaire to patients with gout whom they encountered during their next clinic visit. This questionnaire asked patients whether any of a list of items that would potentially be noticed by a patient were relevant to how they determined that they were having a gout flare. These items were derived from the same list of items presented to the Delphi panel. Additional, free-text items that were not explicitly listed were also solicited.

Cognitive mapping and expert panel consensus.

A modification of a nominal group technique was used to further refine the list of potential flare elements into a manageable number for clinical research. A nominal group technique aims to identify the overall opinion of a group using a similar philosophy to the Delphi technique, in that opinions and thoughts are expressed by group members anonymously (often by writing these down rather than by discussion). This is done so that the originator of an individual idea does not become known. However, anonymous gathering of ideas forgoes the synergistic benefits of brainstorming, in which people build on each other's ideas. The initial phase of this exercise therefore employed an anonymous expression of opinion, which was additionally structured using the cognitive mapping process described below; then, in a departure from the nominal group technique, the expert panel used the results from the cognitive mapping analysis to structure subsequent direct discussion among the members.

We examined the gout flare elements that had been identified from the 2 Delphi surveys in greater detail using a cognitive mapping process. An expert panel consisting of 8 rheumatologists (1 also had gout) and 1 industry representative participated in this process during 2 consecutive days of face-to-face meetings held in Barcelona in June 2007 (the Barcelona panel). The results from the cognitive mapping served as the basis of a facilitated discussion regarding the specific elements that should be selected for further validation to define gout flare in subsequent empirical studies.

Cognitive mapping involves a sequence of data collection and analytic procedures for deriving an empirical representation of the cognitive structure that respondents use to organize stimuli (14–16). Cognitive mapping was chosen to reach a clear empirical understanding of the nature of gout flare that could inform selection of the key characteristics of this phenomenon. It was predicted that once the phenomenon was properly understood, discussion to select the key characteristics of gout flare would be more rational. This approach has been used, for example, to organize and understand the problems that caregivers experience when caring for a person with severe physical disability (15). In that study, a list of problems that were identified from focus groups was sorted into piles based on their similarity, and ranked for importance within each pile. The aggregated data were analyzed using multidimensional scaling and hierarchical cluster analysis to provide evidence that caregivers think about problems along 3 dimensions: centeredness, relationship demands, and caregiver burden. Within these dimensions, clusters of problems were identified, representing basic needs, perceived constraints, caregiver challenges, patient resentment, patient withdrawal, and patient intrapsychic adjustment.

Prior to the face-to-face meeting, the Barcelona panel rated a set of gout flare elements that had been identified as important by the second Delphi survey respondents. Based on this same set of elements, we conducted an unforced card sorting exercise (Q-sort) (17). For this exercise, the Barcelona panelists were instructed to sort 21 index cards, each of which was labeled with a gout flare element, into an unspecified number of card piles based on their perception of the element similarities. Completed card sorts were aggregated across all panelists to form a group co-occurrence matrix. This matrix indicated the number of times each element was sorted together with other elements and provided an empirical basis for analyzing the perceived similarities (distances) among all pairs of elements.
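The aggregation of card sorts into a group co-occurrence matrix can be illustrated with a short sketch. The data below are invented (3 hypothetical panelists sorting 5 elements), and the conversion of counts into dissimilarities by subtracting from the number of sorters is an assumption made for this example; the article does not state how similarities were transformed before scaling.

```python
from itertools import combinations


def cooccurrence_matrix(card_sorts, n_elements):
    """Count how often each pair of elements was placed in the same pile.

    card_sorts: one entry per panelist; each entry is a list of piles,
    and each pile is a list of element numbers (1..n_elements).
    """
    counts = [[0] * n_elements for _ in range(n_elements)]
    for piles in card_sorts:
        for pile in piles:
            for a, b in combinations(sorted(pile), 2):
                counts[a - 1][b - 1] += 1
                counts[b - 1][a - 1] += 1
    return counts


def to_dissimilarity(counts, n_sorters):
    """ASSUMED transform: pairs never sorted together get the largest distance."""
    n = len(counts)
    return [
        [0 if i == j else n_sorters - counts[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical sorts: 3 panelists, 5 elements, free number of piles.
sorts = [
    [[1, 2], [3, 4, 5]],
    [[1, 2, 3], [4, 5]],
    [[1], [2, 3], [4, 5]],
]

counts = cooccurrence_matrix(sorts, n_elements=5)
dissim = to_dissimilarity(counts, n_sorters=len(sorts))

if __name__ == "__main__":
    for row in counts:
        print(row)
```

In this toy example, elements 4 and 5 are sorted together by every panelist, so their dissimilarity is 0, whereas elements 1 and 5 are never sorted together and receive the maximum dissimilarity of 3.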

The similarity measures derived from the group co-occurrence matrix were modeled using a nonmetric multidimensional scaling (MDS) program, ALSCAL (SAS Institute, Cary, NC), based on squared Euclidean distances (18). The MDS results empirically represent the expert panel's perceptions of the relative similarity of gout flare elements as a set of interelement distances plotted within a derived multidimensional space. The decision criteria used by the expert panel in its construal of the relative similarities between all pairs of gout flare elements are empirically reflected as the set of axes that define the multidimensional space.

Hierarchical cluster analysis using Ward's method was also used in our cognitive mapping process to identify exclusive sets of homogeneous (thematically consistent) elements. The coordinates defining the location of each element along each axis within the derived multidimensional space provided the data for this cluster analysis (19).
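As an illustration of how Ward's method operates on MDS coordinates, the sketch below repeatedly merges the two clusters whose fusion produces the smallest increase in within-cluster sum of squares. It is a plain reimplementation for teaching purposes, not the software used in the study; the coordinates and the function name `ward_clusters` are hypothetical.

```python
def ward_clusters(points, k):
    """Agglomerate points (coordinate tuples) into k clusters by Ward's criterion.

    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [{"members": [i], "centroid": list(p)} for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are fused.
        na, nb = len(a["members"]), len(b["members"])
        d2 = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
        return na * nb / (na + nb) * d2

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = merge_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            # Size-weighted mean of the two centroids.
            "centroid": [
                (na * x + nb * y) / (na + nb)
                for x, y in zip(a["centroid"], b["centroid"])
            ],
        }
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return [sorted(c["members"]) for c in clusters]


# Hypothetical 2-dimensional coordinates for 5 elements.
coords = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
groups = ward_clusters(coords, k=3)

if __name__ == "__main__":
    print(sorted(groups))
```

Recording the merge cost at each step would give the agglomeration coefficients that are typically inspected, together with the dendrogram, when choosing the number of clusters.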

The results of MDS and cluster analyses can have different interpretations, in that 2 objects can appear in close proximity in an MDS space and yet be assigned to different clusters. Moreover, an object can appear on one extreme of an MDS space and yet be clustered with objects on the opposite end of that space. Although the results obtained from an MDS and cluster analysis of the same similarity measures can appear contradictory, it should be noted that the 2 approaches have different but complementary purposes. The primary focus of MDS concerns relative ordering of items along a continuum. The interpretive goal of MDS, then, is to discern meaning from the ordering of items along each dimension. Conversely, the principal task of cluster analysis is to assign stimulus items to exclusive membership categories. The results from the combined MDS/cluster analysis can be represented geometrically as a map that reflects different aspects of perceived similarity of the elements. In general, pairs of perceptually similar elements (i.e., those frequently sorted together) are represented as points that are relatively closer on the map than elements viewed as dissimilar.

To identify the most appropriate structure underlying our data, we compared the change in the STRESS (the square root of a normalized residual sum of squares) and RSQ (squared correlation index) fit indices across models of increasing dimensionality (1 to 3 dimensions). The STRESS index ranges between 0 and 1, with higher values (>0.20) indicating an unacceptable fit. This measure indexes the level of model misfit in terms of the discrepancies (errors) between the actual distance data (observed similarities/dissimilarities of pairs of elements) and the modeled representation of those data (a map of interpoint distances between pairs of elements). The RSQ reflects the proportion of variance in the data explained by the modeled distances. Increasing dimensionality always leads to improvements of fit, but often at the expense of interpretability (18). Therefore, the substantive interpretability of each solution also was considered in the model selection process. MDS solutions were interpreted by examining how elements were arrayed along each dimension, and by contrasting the meaning of elements oriented at the extreme borders of each dimension.
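The two fit indices can be written out explicitly. The sketch below assumes Kruskal's STRESS formula 1 (residual sum of squares normalized by the sum of squared model distances) and takes RSQ as the squared Pearson correlation between the transformed data and the model distances; the exact quantities reported by ALSCAL may be computed slightly differently, and the function names are hypothetical.

```python
from math import sqrt


def stress(disparities, distances):
    """Square root of the normalized residual sum of squares (ASSUMED Kruskal formula 1)."""
    residual = sum((d - t) ** 2 for d, t in zip(distances, disparities))
    scale = sum(d ** 2 for d in distances)
    return sqrt(residual / scale)


def rsq(disparities, distances):
    """Squared correlation between transformed data and model distances."""
    n = len(distances)
    mx = sum(disparities) / n
    my = sum(distances) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(disparities, distances))
    sxx = sum((x - mx) ** 2 for x in disparities)
    syy = sum((y - my) ** 2 for y in distances)
    return sxy ** 2 / (sxx * syy)


def acceptable_fit(stress_value, cutoff=0.20):
    """Values above the cutoff are treated as an unacceptable fit."""
    return stress_value <= cutoff


# STRESS values reported in this article for the 1- and 2-dimensional solutions.
reported_stress = {1: 0.301, 2: 0.137}
acceptable_dims = [dim for dim, s in reported_stress.items() if acceptable_fit(s)]

if __name__ == "__main__":
    print(acceptable_dims)
```

A cutoff alone does not select the model; as the text notes, interpretability of each solution is weighed alongside fit.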

In the final step of the cognitive mapping process, the derived clusters were weighted by the mean importance ratings for the set of gout flare elements assigned to a given cluster. The mean importance rating for each element within the derived clusters was also calculated. The importance ratings from the second Delphi survey and from the Barcelona panelists provided separate weighted estimates of cluster and element importance.
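The weighting step amounts to averaging element ratings within each cluster and ordering the clusters by that average. The sketch below uses the element means and cluster memberships listed in Table 4 (lower scores indicate greater necessity). Because it averages already rounded element means rather than the raw ratings, the results can differ from the published cluster means in the last digit; the variable and function names are hypothetical.

```python
from statistics import mean

# Mean rating per element number, as listed in Table 4.
element_mean = {
    14: 1.54, 11: 1.62, 3: 2.46, 18: 2.73,
    10: 2.46, 4: 2.96, 6: 3.23, 16: 3.35,
    13: 1.88, 1: 2.38, 2: 2.73, 8: 2.96, 5: 3.12,
    9: 2.92, 19: 3.08, 7: 3.65,
    12: 2.88, 15: 3.23, 17: 3.27, 20: 3.42, 21: 3.50,
}

# Cluster membership, as listed in Table 4.
cluster_elements = {
    "V": [14, 11, 3, 18],
    "IV": [10, 4, 6, 16],
    "II": [13, 1, 2, 8, 5],
    "III": [9, 19, 7],
    "I": [12, 15, 17, 20, 21],
}


def cluster_weights(elements_by_cluster, ratings):
    """Mean rating of the elements assigned to each cluster."""
    return {
        name: mean(ratings[e] for e in members)
        for name, members in elements_by_cluster.items()
    }


def rank_clusters(weights):
    """Order clusters from most to least necessary (lowest mean first)."""
    return sorted(weights, key=weights.get)


weights = cluster_weights(cluster_elements, element_mean)
ranking = rank_clusters(weights)

if __name__ == "__main__":
    for name in ranking:
        print(f"Cluster {name}: {weights[name]:.2f}")
```

The same two functions could be applied to a second set of ratings, such as those of the Barcelona panel, to obtain a separate ranking.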

This analysis was presented to the Barcelona panel and was then used to explicitly facilitate discussion in order to arrive at a consensus decision regarding which items should be selected for further validation against a patient- and treating physician–determined gout flare state during routine assessment in a subsequent empirical study.


First Delphi survey.

Forty-three items were nominated in free-text format. Twenty-two replies (66%) were received in the first scoring round. There were 6 items for which there was agreement that the item should be included in a gout flare definition (median score 7–9), and 9 items for which there was agreement that the item should not be included in a gout flare definition (median score 1–3). The remaining 28 items were rescored in a third round. This produced agreement that each of these items was neither appropriate nor inappropriate (median score 4–6) (Table 1).

Table 1. Results of first Delphi exercise: 43 potential items for definition of gout flare*
| Definitely appropriate (median score 7–9) | Neither appropriate nor inappropriate (median score 4–6) | Definitely inappropriate (median score 1–3) |
| --- | --- | --- |
| Swelling of the affected joint | Current level of pain greater than a specific threshold | No better explanation |
| Redness of the affected joint | A change in pain on a 10-cm VAS of >3 from baseline | Fever |
| Marked tenderness of the affected joint | Lower limb joint affected | Global patient status reduced to a defined level |
| MSU crystals demonstrated from the joint | Sepsis excluded by joint aspiration and culture | Global physician status reduced to a defined level |
| History of gout | Reduced range of motion of affected joint | Cytokine profile |
| Maximum pain within 4–12 hours | Warmth of skin overlying the affected joint | Ultrasound evidence of tophi |
| | Raised serum urate | No chondrocalcinosis on x-ray |
| | Joint affected typical of gout (1st MTP, midfoot, ankle, knee) | Early morning stiffness lasting >30 minutes |
| | Risk factors present such as diuretic therapy | Elevated platelet count |
| | Gouty erosions on x-ray | |
| | Presence of tophi | |
| | Inflammatory cells present in fluid aspirated from affected joint | |
| | Elevated acute-phase reactants | |
| | Moderate response to standard gout therapy | |
| | Acute onset | |
| | Typical precipitating event (acute illness, renal failure, trauma, dehydration, diet, or drugs) | |
| | Asymmetric joint involvement | |
| | Fewer than 5 involved joints | |
| | Marked impairment of function | |
| | Absence of CPPD or other crystals in joint fluid (except MSU) | |
| | Resolution within 3–14 days | |
| | Recurrent pattern of attacks | |
| | Resolution untreated within a week | |
| | Increase in pain of a joint that has been affected by gout before | |
| | Similarity to previous flares | |
| | Waking with pain at night | |
| | Pain prevents walking | |

* Items scored from 1 (highly inappropriate) to 9 (highly appropriate). VAS = visual analog scale; MSU = monosodium urate monohydrate; MTP = metatarsophalangeal; CPPD = calcium pyrophosphate dihydrate.

Second Delphi survey.

We obtained e-mail addresses for 160 authors of gout-related publications over the previous 2 years. Of these, 154 were valid e-mail addresses. A response rate of 17% (26 of 154) was observed for the first iteration of 33 items. There were 2 subsequent iterations, which led to 21 items (64%) being rated as necessary (median rating 1–3), 11 (33%) as unnecessary (median rating 5–7), and 1 as undecided (median rating 3–5). The undecided item was “fewer than 5 involved joints.” There remained some disagreement over one item, “inflammatory cells in synovial fluid,” with a disagreement index of 1.02 (Table 2).

Table 2. Results of second Delphi exercise: 33 potential elements for the definition of gout flare, in rank order of median score*
| Definitely necessary (median score 1–3), original element numbering | Definitely unnecessary (median score 5–7) |
| --- | --- |
| 14. Marked tenderness of the affected joint (1) | Typical precipitating event (acute illness, renal failure, trauma, dehydration, diet, or drugs) (5) |
| 11. Swelling of the affected joint (1.5) | Absence of CPPD or other crystals in joint fluid (except MSU) (5) |
| 10. Maximum pain within 4–12 hours (2) | Lower limb joint affected (5) |
| 13. Acute onset (2) | Sepsis excluded by joint aspiration and culture (5) |
| 1. Similarity to previous flares (2) | Leukocytosis (5) |
| 18. Redness of the affected joint (2.5) | Risk factors present such as diuretic therapy (5) |
| 12. Moderate response to standard antiinflammatory gout therapy (2.5) | Raised serum urate (5.5) |
| 2. Recurrent pattern of attacks (2.5) | MSU crystals demonstrated from the joint (6) |
| 19. Current level of pain greater than a specific threshold (3) | Gouty erosions on x-ray (6) |
| 9. A change in pain on a 10-cm VAS of >3 from baseline (3) | Presence of tophi (6) |
| 4. Reduced range of motion of affected joint (3) | Resolution untreated within a week (6) |
| 3. Warmth of skin overlying the affected joint (3) | |
| 17. Joint affected typical of gout (1st MTP, midfoot, ankle, knee) (3) | |
| 20. Inflammatory cells present in fluid aspirated from affected joint (3) | |
| 21. Elevated acute-phase reactants (3) | |
| 15. Asymmetric joint involvement (3) | |
| 6. Marked impairment of physical function (difficulty with dressing, walking, showering, etc.) (3) | |
| 8. Resolution within 3–14 days (3) | |
| 5. Increase in pain of a joint that has been affected by gout before (3) | |
| 7. Waking with pain at night (3) | |
| 16. Pain prevents walking (3) | |

* Median score shown in parentheses. Elements scored from 1 (highly necessary) to 7 (highly unnecessary). See Table 1 for definitions.

Patient perspective.

This survey had 20 respondents. Their median age was 51 years (range 34–76 years), 85% were men, median disease duration was 10 years (range 2–23 years), and 63% had tophaceous gout. All patient-reported items that were reviewed were considered relevant by at least 79% of patients (Table 3). Additional symptoms that patients reported as indicating a flare of gout included: “sometimes feverish,” “pain after eating meat with red wine,” “sharp pain first then burning pain,” “foot needs to hang outside bed at night,” and “joint is shiny.”

Table 3. Endorsement of items by patients, as indicating a gout flare
| Item | Patients, % |
| --- | --- |
| Affected joint is swollen | 84 |
| The affected joint is red | 84 |
| The affected joint is extremely tender to touch | 100 |
| The pain is at its worst very quickly (4–12 hours) | 95 |
| The pain is much worse than usual | 95 |
| The affected joint is the knee, ankle, foot, or toe | 95 |
| The affected joint is warm to touch | 84 |
| The pain gets better with my usual gout treatment | 95 |
| It stops me from doing usual activities | 96 |
| I can't walk during the attack | 96 |
| It gets better within 3–14 days | 79 |
| It was very similar to other attacks of gout | 96 |

Cognitive mapping exercise.

The cognitive map based on the 2-dimensional solution is shown in Figure 2. In contrast with the unacceptable fit observed for the 1-dimensional solution (STRESS = 0.301, RSQ = 0.746), the 2-dimensional solution provided an acceptable fit to the data (STRESS = 0.137, RSQ = 0.905) and, relative to the 3-dimensional solution (STRESS = 0.137, RSQ = 0.905), also afforded a more meaningful interpretation.

Dimension I (i.e., the horizontal axis) is anchored on the extreme left by an aggregation of elements of nonspecific joint inflammation that includes swelling of the affected joint, redness of the affected joint, and warmth of skin overlying the affected joint. Oriented at the opposing end of this dimension is a collection of elements more specific to gout, including current level of pain greater than a specific threshold, change of >3 on a 10-cm pain visual analog scale (VAS), similarity to previous flares, and recurrent pattern of attacks. Situated between the extremes are elements that include asymmetric joint involvement, marked impairment of physical function, and pain preventing walking. In view of the substantive and clinical underpinnings of both of the anchoring elements and the way in which the other elements were arrayed from left to right, the Barcelona panel construed this dimension as a continuum of element specificity.

Dimension II (i.e., the vertical axis) is anchored at the upper boundary by a set of more clinically objective elements that include elevated acute-phase reactants, inflammatory cells present in joint fluid, and joint affected typical of gout. The extreme lower boundary of this dimension is anchored by elements that were viewed as somewhat subjective, including current level of pain greater than a specific threshold, change of >3 on a 10-cm pain VAS, and waking with pain at night. Elements falling between these extremes include acute onset, resolution within 3–14 days, similarity to previous flares, increase in pain of a previously affected joint, and recurrent pattern of attacks. Based on their understanding of the anchoring elements and how individual elements were arrayed from bottom to top, the Barcelona panel interpreted this dimension as a reflection of element objectivity.

The hierarchical cluster analysis revealed 5 distinct clusters to which each of the 21 gout flare elements was assigned exclusive membership. Because no assumptions can be made about the distribution of the data used in this analysis, generally accepted statistical tests for assessing the adequacy of fit are not available (19). Our decision to interpret a 5-cluster solution was informed by examining the pattern in the agglomeration coefficients indicating which elements were joined to form a cluster at different stages in the clustering, and by visually inspecting the dendrogram and icicle plot (not shown). The 5 clusters resulting from the hierarchical cluster analysis are superimposed on the cognitive map (Figure 2). The 5 clusters were interpreted by the Barcelona panel on the basis of the substantive consistency among the elements comprising each cluster.

Figure 2.

Two-dimensional scatter plot for clusters and attributes within clusters: I, gout archetype; II, stereotypical nature; III, pain; IV, symptom severity; and V, joint inflammation. Each data point represents a flare element. See Table 4 for element definitions.

The 5 elements within cluster I (joint affected typical of gout, asymmetric joint involvement, inflammatory cells present in joint fluid, elevated acute-phase reactants, and moderate response to standard gout therapy) were viewed by the Barcelona panel as indicative of the gout archetype. Relative to other clusters, the elements within this cluster were, in the aggregate, rated as least necessary by both the second Delphi respondents (mean ± SD score 3.26 ± 0.99) and the Barcelona panel (mean ± SD score 3.90 ± 1.08) (Table 4).

Table 4. Descriptive statistics (rating scores) for clusters and elements within clusters
| Clusters and elements | Description | Range | Mean ± SD |
| --- | --- | --- | --- |
| Cluster V | Joint inflammation | 1.00–4.25 | 2.09 ± 0.72 |
| 14 | Marked tenderness of the affected joint | 1–5 | 1.54 ± 0.95 |
| 11 | Swelling of the affected joint | 1–3 | 1.62 ± 0.70 |
| 3 | Warmth of skin overlying the affected joint | 1–5 | 2.46 ± 1.10 |
| 18 | Redness of the affected joint | 1–7 | 2.73 ± 1.43 |
| Cluster IV | Severity of symptoms | 1.50–5.25 | 3.00 ± 1.01 |
| 10 | Maximum pain within 4–12 hours | 1–5 | 2.46 ± 1.21 |
| 4 | Reduced range of motion of affected joint | 1–7 | 2.96 ± 1.66 |
| 6 | Marked impairment of physical function | 1–7 | 3.23 ± 1.56 |
| 16 | Pain prevents walking | 2–6 | 3.35 ± 1.26 |
| Cluster II | Stereotypical nature | 1.20–4.00 | 2.62 ± 0.69 |
| 13 | Acute onset | 1–5 | 1.88 ± 1.03 |
| 1 | Similarity to previous flares | 1–6 | 2.38 ± 1.27 |
| 2 | Recurrent pattern of attacks | 1–6 | 2.73 ± 1.15 |
| 8 | Resolution within 3–14 days | 1–6 | 2.96 ± 1.31 |
| 5 | Increase in pain of a previously affected joint | 1–6 | 3.12 ± 1.24 |
| Cluster III | Pain | 1.00–5.33 | 3.22 ± 1.15 |
| 9 | Change of more than 3 on 10-cm pain visual analog scale | 1–6 | 2.92 ± 1.55 |
| 19 | Current level of pain greater than a specific threshold | 1–6 | 3.08 ± 1.38 |
| 7 | Waking with pain at night | 1–7 | 3.65 ± 1.41 |
| Cluster I | Gout archetype | 1.20–5.20 | 3.26 ± 0.99 |
| 12 | Moderate response to standard gout therapy | 2–5 | 2.88 ± 1.11 |
| 15 | Asymmetric joint involvement | 1–7 | 3.23 ± 1.58 |
| 17 | Joint affected typical of gout | 1–7 | 3.27 ± 1.61 |
| 20 | Inflammatory cells present in joint fluid | 1–7 | 3.42 ± 2.04 |
| 21 | Elevated acute-phase reactants | 1–7 | 3.50 ± 1.66 |

The 5 elements belonging to cluster II (acute onset, resolution within 3–14 days, similarity to previous flares, increase in pain of a previously affected joint, and recurrent pattern of attacks) were considered a reflection of the stereotypical expression of gout. As a group, the average ratings for elements in this cluster were ranked second by the Delphi respondents (mean ± SD score 2.62 ± 0.69) and third by the Barcelona panel (mean ± SD score 2.90 ± 0.89) in terms of being necessary for defining gout flare. After their initial formation, clusters I and II were joined to form a higher-order (macro) cluster, with the elements from cluster I generally located in the objective region of dimension II and elements from cluster II situated toward the general end of dimension I.

Cluster III was simply labeled pain based on the thematic consistency of its 3 elements (current level of pain greater than a specific threshold, change of more than 3 on 10-cm pain VAS, and waking with pain at night). Cluster III was ranked fourth by both the Delphi respondents (mean ± SD score 3.22 ± 1.15) and the Barcelona panel (mean ± SD score 3.22 ± 0.85).

Cluster IV includes 4 elements (maximum pain within 4–12 hours, marked impairment of physical function, reduced range of motion of affected joint, and pain prevents walking) and was viewed by the panel as a reflection of flare severity. In terms of being necessary for defining gout flare, the average rating for the group of elements within this cluster was ranked second by the Barcelona panel (mean ± SD score 2.60 ± 0.97) and third by the Delphi respondents (mean ± SD score 3.00 ± 1.01). An examination of the hierarchical dendrogram indicated that clusters III and IV also formed a higher order cluster with elements from cluster III situated toward the general and subjective ends of dimensions I and II, respectively.

The 4 elements of cluster V (swelling of the affected joint, marked tenderness of the affected joint, redness of the affected joint, and warmth of skin overlying the affected joint) were collectively interpreted by the panel as joint inflammation. The mean aggregated rating for elements within this cluster was ranked first by both the Delphi respondents (mean ± SD score 2.09 ± 0.72) and the Barcelona panel (mean ± SD score 2.91 ± 0.57). The clustering sequence revealed by the dendrogram indicated that clusters I, II, III, and IV formed one large cluster prior to fusion with cluster V.
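The Delphi rank order reported for the clusters can be checked with simple arithmetic. The Python sketch below copies the Delphi element means and cluster means from the table and text above. It assumes that a lower mean score corresponds to a higher rank, as the reported rankings imply, and it tests whether each cluster mean is consistent with a plain average of its element means. That averaging rule is an observation about the published numbers, not a method stated by the authors. Cluster V is included only by its reported mean, and the names `ELEMENT_MEANS`, `REPORTED`, `max_discrepancy`, and `rank_clusters` are illustrative.

```python
from statistics import mean

# Delphi element means as printed in the table, grouped by cluster.
ELEMENT_MEANS = {
    "I": [2.88, 3.23, 3.27, 3.42, 3.50],
    "II": [1.88, 2.38, 2.73, 2.96, 3.12],
    "III": [2.92, 3.08, 3.65],
    "IV": [2.46, 2.96, 3.23, 3.35],
}

# Delphi cluster means as reported in the table and text.
REPORTED = {"I": 3.26, "II": 2.62, "III": 3.22, "IV": 3.00, "V": 2.09}


def max_discrepancy():
    """Largest gap between a reported cluster mean and the average of its element means."""
    return max(abs(mean(values) - REPORTED[name]) for name, values in ELEMENT_MEANS.items())


def rank_clusters(cluster_means):
    """Order clusters from lowest to highest mean (assumption: lower score = higher rank)."""
    return sorted(cluster_means, key=cluster_means.get)


print(rank_clusters(REPORTED))   # ['V', 'II', 'IV', 'III', 'I']
print(max_discrepancy() < 0.01)  # True: differences are within rounding
```

The resulting order (V, II, IV, III, I) matches the Delphi rankings given in the text for clusters V, II, IV, and III. The SDs cannot be recovered this way because they depend on respondent-level ratings.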

Using the results of the cognitive mapping as an explicit framework for discussion, the panel formulated a final list of items that could be tested as useful elements of a gout flare (Table 5). The Barcelona panel considered each cluster of elements in turn and, through a process of open discussion and consensus, selected critical elements within each cluster. Elements that could be ascertained by patient report, clinical examination, or laboratory tests were considered. It was not determined which combination of these features was the best representation of gout flare, nor whether a flare could be defined solely by patient self-report. Such decisions were considered best suited to empirical testing in a validation study.

Table 5. Final list of elements for a definition of gout flare*

Element | Method of measurement
Swollen joint(s) | Patient self-report
Tender joint(s) | Patient self-report
Warm joint(s) | Patient self-report
Patient self-report of pain | 10-cm VAS
Patient self-report global assessment | 10-cm VAS
Time to maximum pain level | Uncertain
Time to complete resolution of pain | Uncertain
Functional status | HAQ, single item 10-cm VAS, or Likert scale
Acute-phase marker | C-reactive protein

* VAS = visual analog scale; HAQ = Health Assessment Questionnaire.


Using formal consensus methodology, these studies have identified a short list of potential elements of gout flare that now need to be tested empirically against a gold standard in the context of an observational cohort study. The strengths of our approach are that multiple sources of information were used to identify possible flare elements, and that statistical modeling was used to reduce items into manageable and meaningful units. In addition, the perspective of patients with gout was also obtained and shown to agree very closely with the perspective of the expert health professional.

The extent to which episodes of musculoskeletal pain reported as flare by prior RCTs actually correspond to true flares is not clear. Some nonrandomized or observational studies have also left flares undefined (20–22), but others have used definitions such as joint symptoms that led to emergency room or urgent outpatient clinic evaluation (23), or office/emergency room visit with gout and/or joint pain administration code(s) and at least one typical gout treatment within 7 days of the visit (24).

Intervention studies for acute gout have had to define flare in terms of entry criteria for the study rather than as an outcome. The definitions used in these studies have included meeting the 1977 ACR criteria for acute gouty arthritis (25) with a symptom score of ≥5 (composite of pain, tenderness, swelling) and a pain score of ≥2 (0–4 Likert scale) (1, 8), and meeting the 1977 ACR criteria with an inflammatory score ≥5 (composite of functional impairment, tenderness, swelling of the index joint) and pain rated at least moderate (6). Such definitions therefore have the potential to conflate the presence of a flare with its severity. While this may be appropriate for studies of acute gout, it is preferable to separate flare presence from flare severity when using flare as an indicator of outcome in studies of chronic gout.
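To make the conflation concrete, the first entry definition quoted above can be written as a rule. This is a minimal Python sketch under stated assumptions: the composite symptom score is taken to be a simple sum of pain, tenderness, and swelling, and tenderness and swelling are treated as nonnegative integer scores. Neither detail is specified here, and the function name `meets_entry_criteria` is hypothetical.

```python
def meets_entry_criteria(acr_1977_met, pain, tenderness, swelling):
    """Paraphrase of one trial entry rule: 1977 ACR criteria met,
    composite symptom score >= 5, and pain >= 2 on a 0-4 Likert scale."""
    if not 0 <= pain <= 4:
        raise ValueError("pain must be on the 0-4 Likert scale")
    # Assumption: the composite is a plain sum of the three component scores.
    symptom_score = pain + tenderness + swelling
    return acr_1977_met and symptom_score >= 5 and pain >= 2


# A flare with mild symptoms is not counted, although a flare is present.
print(meets_entry_criteria(True, pain=1, tenderness=1, swelling=1))  # False
# A more severe flare satisfies the rule.
print(meets_entry_criteria(True, pain=3, tenderness=2, swelling=2))  # True
```

Under such a rule a mild flare and no flare give the same answer, which is why a threshold of this kind is less suitable when flare is counted as an outcome.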

There may be problems with operationalizing some of the elements identified by this study for use in clinical trials. In particular, some items reflect evolution of symptoms (e.g., time to maximum pain, time to complete resolution), which would likely require a degree of retrospective assessment or assessment over multiple time points, and may not always be practical in clinical trials. Nonetheless, such features may be very important in distinguishing gout flares from other causes of acute musculoskeletal pain. Other elements may be difficult to ascertain retrospectively in the context of determining the number of previous flares (i.e., acute-phase marker). Even so, these consensus exercises have provided a firm direction toward what should and should not be included, as well as how the information might be collected.

The next phase of this project is to confirm how these elements should be operationalized and to prospectively collect the information in consecutive patients with gout, during acute flare, intercritical periods, and nongout encounters, in order to determine the accuracy of these features and their combinations. This work is currently in progress and will be reported separately.


Dr. Taylor had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study design. Taylor, Shewchuk, Saag, Schumacher, Singh, Grainger, Waltrip, Simon.

Acquisition of data. Taylor, Shewchuk, Saag, Schumacher, Singh, Grainger, Edwards, Waltrip, Burgos-Vargas.

Analysis and interpretation of data. Taylor, Shewchuk, Saag, Schumacher, Singh, Grainger, Edwards, Bardin, Waltrip, Simon.

Manuscript preparation. Taylor, Shewchuk, Saag, Schumacher, Singh, Grainger, Edwards, Waltrip, Simon, Burgos-Vargas.

Statistical analysis. Shewchuk, Grainger, Waltrip.


The participants in the first Delphi exercise were Andre Barkhuizen, Anthony Gear, Ariella Kelman, Arthur Kavanaugh, C. Kent Kwoh, Daniel Ching, Daniel Clegg, Daniel Furst, Dinesh Khanna, Eswar Krishnan, Fernando Perez-Ruiz, Gail Kerr, George Nuki, H. Ralph Schumacher, Jasvinder Singh, John Sundy, Julie Yu, Kenneth Saag, Lan Chen, Larry Edwards, Lee Simon, Lisa Stamp, Marina Sew Hoy, Michael Becker, Michael Weisman, Nancy Joseph-Ridge, Naomi Schlesinger, Nicola Dalbeth, Peter Chapman, Peter Jones, Thomas Bardin, Umbreen Hasan, and Worawit Louthrenoo.

The participants in the second Delphi exercise were Alex So, Annelies Boonen, Ariella Kelman, Burton Abrams, Tim L. Jansen, Eliseo Pascual, Eswar Krishnan, Fernando Perez-Ruiz, Fiona McQueen, Hisashi Yamanaka, Jose Alvarez-Nemegyei, Ken Saag, Lan Chen, Michael Becker, Nicola Dalbeth, Pascal Richette, Pietro Melloni, Siddharth Kumar Das, H. Ralph Schumacher, Rebecca Grainger, Richard Brook, Royce W. Waltrip, Sergio Kowalski, Sjef van der Linden, Tracy Frech, and Tuhina Neogi.