Questionnaire layout and wording influence prevalence and risk estimates of respiratory symptoms in a population cohort

Objective: Results of epidemiological studies are greatly influenced by the chosen methodology. The study aims to investigate how two frequently used questionnaires (Qs), with partly different layout, influence the prevalence of respiratory symptoms.
Study Design and Setting: A booklet containing two Qs, the Global Allergy and Asthma European Network Q and the Obstructive Lung Disease in Northern Sweden Q, was mailed to 30 000 subjects aged 16–75 years in West Sweden; 62% responded. Sixteen questions were included in the analysis: seven identical between the Qs, four different in set-up and five with the same layout but different wording. Comparisons were made using differences in proportions, observed agreement and kappa statistics.
Results: Identical questions yielded similar prevalences with high observed agreement and kappa values. Questions with different set-up or differences in wording resulted in significantly different prevalences with lower observed agreement and kappa values. In general, the use of follow-up questions, excluding subjects answering no to the initial question, resulted in 2.9–6.7 percentage points lower prevalence.
Conclusion: The question set-up has a great influence on epidemiological results; specifically, follow-up questions that are skipped after a no answer to a previous question lead to lower prevalence than detached questions. Therefore, Q layout and the exact wording of questions have to be carefully considered when comparing studies.
Please cite this paper as: Ekerljung L, Rönmark E, Lötvall J, Wennergren G, Torén K and Lundbäck B. Questionnaire layout and wording influence prevalence and risk estimates of respiratory symptoms in a population cohort. Clin Respir J 2013; 7: 53–63.


Introduction
Postal enquiries are among the most efficient tools when assessing prevalence and risk factors of asthma and respiratory symptoms (1,2). The prevalence of asthma has increased over the last 50-60 years and is estimated to be 7%-10% in different parts of the Western world (3-6). Data on incidence vary, partly depending on different definitions of asthma and of the population at risk. Using similar methods, the incidence is approximately 2/1000/year in Northern Europe (7-10). When asthma self-reported in questionnaires (Qs) is validated against clinically relevant asthma, it has a high specificity and a fair, or good, sensitivity in countries with developed health-care systems (1,2). When comparing results from epidemiological studies, it is important to take the methods and definitions used into consideration, as results are influenced by the methodology. Two main models have been used in the validation of epidemiological diagnosis of asthma: a provocation test or a clinical interview (1,2,11), or a combination of both methods (12).
Today, there are few Qs that are widely used. Among adults, the European Community Respiratory Health Survey Q (ECRHS-Q) (13) and the subsequent Global Allergy and Asthma European Network Q (GA 2 LEN-Q) are commonly used. Both fail to cover bronchitis or chronic obstructive pulmonary disease (COPD) in a satisfactory way. However, the Obstructive Lung Disease in Northern Sweden Q (OLIN-Q) (14) covers these aspects and has frequently been used in Sweden and northern European countries.
In 2008, a study focusing mainly on asthma was initiated in West Sweden. The initial step was a postal survey using two respiratory Qs, the GA 2 LEN-Q and the OLIN-Q, with the primary aim of updating the prevalence of asthma, respiratory symptoms and allergy (15). The aim of the present study was to investigate how these two frequently used Qs, with partly different question structure and wording, influence the prevalence of respiratory symptoms and other outcomes.

Study area and population
The study was initiated in 2008 when 30 000 randomly selected subjects aged 16-75 years received a postal Q. The study was performed in the region of West Gothia in Western Sweden, including the city of Gothenburg. The study population was selected using the Swedish Population Register and was stratified by age and sex to mirror the population in West Gothia.
Study design, results of prevalence and effects of late response and nonresponse have previously been published (15,16).

Qs
The study consisted of a booklet containing the OLIN-Q followed by the GA 2 LEN-Q. The OLIN-Q has been used in many studies in the Nordic and the Baltic countries, prominently the FinEsS (Finland, Estonia, Sweden) studies, comparative studies of airway diseases (6,17,18). It was developed from the British Medical Research Council Q (BMRC-Q). The OLIN-Q contains questions on asthma, rhinitis, chronic bronchitis/COPD/emphysema, respiratory symptoms, use of asthma medication and possible determinants of disease, such as smoking habit, occupation and family history of disease. The OLIN-Q and variants of it (19) have been validated against physiological variables including bronchial hyperresponsiveness (12,20). To this Q, detailed questions about occupation, occupational exposure, socio-economic conditions and health status were added. The Swedish version of the GA 2 LEN-Q is a variant of the ECRHS-Q (13,21) with additional questions concerning mainly rhinitis, chronic sinusitis and eczema. Questions on rhinitis and sinusitis in the GA 2 LEN-Q originate from the Allergic Rhinitis and its Impact on Asthma initiative (21).

Definitions
In this comparative study, 16 questions from the two Qs were analyzed. The questions were categorized into three groups based on similarity between the Qs: Group I, identical between the Qs; Group II, same question layout but not identical symptom or condition asked for; and Group III, similar wording but different layout. The questions and differences between the Qs are summarized in Table 1. The questions belonging to Group III were follow-up questions in one of the Qs, excluding subjects who answered no to a qualifying question, but single questions in the other. Use of asthma medication and attacks of shortness of breath were follow-up questions in the GA 2 LEN-Q, while productive cough was a follow-up question in the OLIN-Q. The qualifying question for use of asthma medication and attacks of shortness of breath was 'Have you ever had asthma'. For productive cough the qualifying question was 'Do you usually have phlegm when coughing, or do you have phlegm in your chest, which is difficult to bring up'. Smoking was a combination of two questions in the OLIN-Q but consisted of only one question in the GA 2 LEN-Q.

Table 1 (Group II and Group III questions)

Group II: same question set-up but not identical symptom or condition asked for

Rhinitis
  OLIN-Q: Have you now or have you ever had allergic rhinitis (hay fever) or allergic eye catarrh?
  GA 2 LEN-Q: Do you have any nasal allergies including hay fever?

Physician-diagnosed COPD
  OLIN-Q: Have you been diagnosed as having chronic bronchitis, COPD or emphysema by a doctor?
  GA 2 LEN-Q: Have you been diagnosed as having chronic obstructive pulmonary disease (COPD) by a doctor?

Nasal blockage
  OLIN-Q: Do you have blocking of your nose more or less permanently?
  GA 2 LEN-Q: Has your nose been blocked for more than 12 weeks during the last 12 months?

Rhinorrhea
  OLIN-Q: Do you have a runny nose more or less permanently?
  GA 2 LEN-Q: Have you had discoloured nasal discharge (snot) or discoloured mucus in the throat for more than 12 weeks during the last 12 months?

Exposed at work
  OLIN-Q: Have you been heavily exposed to dust, gases or fumes at your work?
  GA 2 LEN-Q: Have you ever held a job where you were exposed to gases, fumes or dust?

Group III: similar wording but different set-up

Productive cough
  OLIN-Q: Follow-up question: Do you bring up phlegm on most days during periods of at least three months?
  GA 2 LEN-Q: Do you bring up phlegm from your chest on most days for as much as three months each year?

Ever smoker
  OLIN-Q: Do you smoke? or Have you previously smoked?
  GA 2 LEN-Q: Have you ever smoked for as long as a year?

Asthma medication
  OLIN-Q: Do you currently use asthma medication (permanently or as needed)?
  GA 2 LEN-Q: Follow-up question: Are you currently taking any asthma medication (including inhalers, aerosols or tablets) for asthma?

Attacks of shortness of breath
  OLIN-Q: Have you now or have you had asthma symptoms during the last 12 months (intermittent breathlessness or attacks of shortness of breath, the symptoms may exist simultaneously with or without cough or wheezing)?
  GA 2 LEN-Q: Follow-up question: Have you had an attack of asthma in the last 12 months?

Differences have been highlighted using bold text.

Statistical analyses
Statistical analyses were performed using the Statistical Package for the Social Sciences 16.0 (SPSS, Inc., Chicago, IL, USA). Comparisons were made using differences in proportions, observed agreement (OA) and the kappa coefficient. The kappa coefficient compares the level of agreement between different groups of data (22) and was interpreted using the following definitions: below 0.2, slight or poor agreement; 0.21-0.4, fair agreement; 0.41-0.6, moderate agreement; 0.61-0.8, substantial agreement; and 0.81-1, almost-perfect agreement (23). OA measures the proportion of identical answers from the two Qs. The significance of the kappa coefficient and of differences in proportions was determined by the 95% confidence interval (95% CI). Exposure to gas, dust or fumes at work, and smoking, as reported in each of the Qs, were used as independent variables in logistic regression analyses of questions from their respective Q to obtain relative risk estimates.
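To make the two agreement measures concrete, the following is a minimal Python sketch of how OA and Cohen's kappa can be computed from the paired yes/no answers to one question asked in both Qs. It is not the SPSS procedure used in the study. The function names and the counts are hypothetical and chosen only for illustration; the interpretation bands are those listed above.

```python
def observed_agreement_and_kappa(both_yes, only_first, only_second, both_no):
    """Return (OA, kappa) for one question asked in two questionnaires.

    The arguments are the four cells of the paired 2x2 table:
    yes/yes, yes in the first Q only, yes in the second Q only, no/no.
    """
    n = both_yes + only_first + only_second + both_no
    if n == 0:
        raise ValueError("empty table")
    # OA: proportion of subjects giving the same answer in both Qs.
    observed = (both_yes + both_no) / n
    # Agreement expected by chance, from the marginal 'yes' proportions.
    p_first = (both_yes + only_first) / n
    p_second = (both_yes + only_second) / n
    expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


def interpret_kappa(kappa):
    """Map a kappa value to the agreement bands quoted in the text."""
    if kappa <= 0.2:
        return "slight or poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "almost-perfect"


# Hypothetical counts for 1000 paired answers (not data from the study).
oa, kappa = observed_agreement_and_kappa(80, 20, 10, 890)
print(f"OA = {oa:.2f}, kappa = {kappa:.2f} ({interpret_kappa(kappa)})")
```

With a low prevalence, OA is dominated by the large no/no cell, which is why kappa, which corrects for chance agreement, is reported alongside it.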

Results
With the exception of 'ever smoking', all questions with a different layout between the Qs (Group III) yielded significantly different results (Fig. 1C). A follow-up layout, in which subjects who had answered no to a qualifying question were excluded, yielded lower prevalence than a single question. The prevalence of 'use of asthma medication' was 8.7% (95% CI 8.3-9.1) according to the OLIN-Q vs 5.5% (95% CI 5.2-5.9) in the GA 2 LEN-Q, and that of attacks of shortness of breath was 9.6% (95% CI 9.2-10.0) vs 3.0% (95% CI 2.7-3.2).
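As an illustration of how such intervals arise, the sketch below computes a normal-approximation (Wald) 95% CI for a single proportion. This is an assumption: the text does not state which interval method was used. The denominator of 18 087 is the number of participants given in the Discussion; the per-question denominator was somewhat lower because of item nonresponse, so the reported limits need not be reproduced exactly. The function name is hypothetical.

```python
from math import sqrt


def wald_ci(proportion, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    se = sqrt(proportion * (1 - proportion) / n)
    return proportion - z * se, proportion + z * se


# Prevalence of 'use of asthma medication' in the two Qs, as reported.
n_participants = 18087  # assumed denominator; item nonresponse is ignored
for label, p in [("OLIN-Q", 0.087), ("GA 2 LEN-Q", 0.055)]:
    low, high = wald_ci(p, n_participants)
    print(f"{label}: {100 * p:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f})")

# Difference in percentage points between the two layouts.
print(f"difference: {100 * (0.087 - 0.055):.1f} percentage points")
```

Because the two intervals do not overlap, the difference between the layouts is unlikely to be due to chance, although a formal comparison would need to account for the same subjects having answered both Qs.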
The differences in prevalence were similar in both men and women irrespective of wording and layout (Table 2). The questions had response rates ranging from 88.2% to 99.5%; all but one had response rates above 95%.
Two of the investigated questions concerned exposure to potential risk factors. One of these, 'ever smoking', had a prevalence of 40.1% in the OLIN-Q vs 42.0% in the GA 2 LEN-Q (Fig. 1C), while 'exposed to gas, dust or fumes at work' was reported by 22.2% vs 36.4% (Fig. 1B).
The proportion of identical answers from corresponding questions in the two Qs was in general very high, with OA above 0.92 (Fig. 2B). Only 'rhinorrhea' (OA 0.85) and 'exposed to gas, dust or fumes at work' (OA 0.82) had a somewhat lower proportion of identical answers. There were no differences in reliability between different subpopulations, such as high vs low education, non-smoking vs smoking and men vs women.
A risk-factor analysis revealed no differences between the two Qs in relative risk estimates for 'any wheeze last 12 months' for either of the investigated independent variables, 'exposed to gas, dust or fumes at work' (Fig. 3A) or 'smoking' (Fig. 3B). Both 'exposed to gas, dust or fumes at work' and 'smoking' were stronger risk factors for 'attacks of shortness of breath' in the OLIN-Q, while 'exposed to gas, dust or fumes at work' was also a stronger risk factor for 'productive cough' in the OLIN-Q. There were no differences between the two Qs in odds ratios for 'smoking' as a risk factor for 'productive cough'.
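The risk estimates discussed here are odds ratios. As a simplified illustration, the sketch below computes an unadjusted odds ratio with a 95% CI (Woolf's logit method) from a 2x2 table of exposure against symptom; this equals the estimate from a logistic regression with the exposure as the only covariate, and it is an assumption that the study's models were comparable. The function name and all counts are hypothetical and are not data from the study.

```python
from math import exp, log, sqrt


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Woolf (logit) confidence interval."""
    cells = (exposed_cases, exposed_noncases,
             unexposed_cases, unexposed_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = sqrt(sum(1 / c for c in cells))
    low = exp(log(odds_ratio) - z * se_log)
    high = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


# Hypothetical counts: symptom by occupational exposure (illustration only).
or_, low, high = odds_ratio_ci(120, 880, 200, 2800)
print(f"OR = {or_:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Changing how either the exposure or the symptom is worded moves subjects between the four cells, which is how question wording can alter the odds ratio even when the underlying population is the same.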

Discussion
This study compares the GA 2 LEN-Q, which can be regarded as a variant of the ECRHS-Q (13), with the OLIN-Q, a Q used mainly in Northern European countries (15,18,20,24). Similar estimates of prevalence of symptoms and diseases were found with a few important exceptions. OA was above 0.9, and kappa values indicated substantial or almost-perfect agreement in most cases. In general, a slightly lower prevalence was found in the GA 2 LEN-Q compared with the OLIN-Q. Risk estimates were dependent on the prevalence of the independent variables and generally higher with the OLIN-Q as a result of wording and the design of questions about exposure. Questions about nasal symptoms were more detailed in the GA 2 LEN-Q, while questions about bronchitis were more detailed in the OLIN-Q. Questions regarding symptoms common in asthma were similar or identical in the two Qs.
Nonresponse is an issue for all epidemiological studies as it might introduce bias. A nonresponse study has been performed on the study sample, and nonresponders were more likely to be male, younger, living in Gothenburg and smokers. However, this did not influence the prevalence or risk estimates (16). The large study sample, representative of the general population in the study area, ensures the validity of the study.
Results from Qs are dependent on several factors. Self-administered Qs result in higher prevalences than structured interviews (25-27); translations create variability (28,29), particularly the translation of 'wheeze'; and responses to self-administered Qs before and after a physical demonstration of asthma symptoms diverge, with kappa statistics below 0.4 (30). The agreement may also vary with smoking habits and educational level (28). In two studies, the kappa statistic decreased with increasing educational (28) and socio-economic (30) status. However, in the Norwegian study (28), the agreement increased with increasing educational level. An overview of Q comparisons can be found in Table 3.
The two Qs compared in this study have slightly different foci, which influence how the data can be analyzed, and have an impact on prevalence and relative risk estimates. The GA 2 LEN-Q provides more information about rhinitis and eczema compared with the OLIN-Q. The OLIN-Q provides a more thorough description of bronchitis symptoms. It also detects asthma-like symptoms not only among subjects with asthma but in the general population. Because it includes qualifying questions, the GA 2 LEN-Q excludes all nonasthmatics from some questions. It therefore reports the prevalence of use of asthma medication and attacks of shortness of breath only among asthmatics and cannot give an estimate of prevalence in the general population. The slight difference in target population for some questions makes comparison of prevalence estimates more difficult. To detect accurate prevalence of symptoms common in asthma, we suggest that all questions should be answered by all participants.
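The effect of a qualifying question on a prevalence estimate can be sketched as follows. The Python example below scores the same hypothetical respondents in two ways: once as a single question answered by everyone, and once as a follow-up question that only subjects answering yes to 'ever had asthma' reach. The field names, function names and counts are assumptions made for illustration and are not data from the study.

```python
def prevalence_single(records, item):
    """Prevalence when every respondent answers the question."""
    return sum(1 for r in records if r[item]) / len(records)


def prevalence_follow_up(records, gate, item):
    """Prevalence when only respondents answering yes to the qualifying
    question (gate) reach the item; all others count as 'no'."""
    return sum(1 for r in records if r[gate] and r[item]) / len(records)


def make(n, ever_asthma, uses_medication):
    """Build n identical hypothetical respondents."""
    return [{"ever_asthma": ever_asthma, "uses_medication": uses_medication}
            for _ in range(n)]


# Hypothetical sample of 1000 respondents (illustration only).
respondents = (
    make(60, True, True)       # report asthma and use medication
    + make(40, True, False)    # report asthma, no medication
    + make(30, False, True)    # use medication but answer no to 'ever asthma'
    + make(870, False, False)  # neither
)

single = prevalence_single(respondents, "uses_medication")
gated = prevalence_follow_up(respondents, "ever_asthma", "uses_medication")
print(f"single question:    {100 * single:.1f}%")
print(f"follow-up question: {100 * gated:.1f}%")
```

The gap between the two estimates consists entirely of subjects who would report the symptom or medication but answer no to the qualifying question, which is the group a follow-up layout cannot capture.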
In line with previous comparisons of Qs regarding prevalence (25,31,32), identical questions and wordings yielded similar estimates of prevalence in our study. Because of the high power of the study, which comprised 18 087 participants, several questions resulted in statistically significant differences despite small differences in prevalence, differences that are likely to be of limited clinical relevance. For questions with a similar layout but containing different conditions, the prevalences differed to a higher degree between Qs. Despite covering the same symptom category, significant differences in prevalence have been previously observed even for questions that appear similarly worded (28).
Despite differences in prevalence outcomes, the OA for all questions was very high, including questions with different wording and different layout. The OA in our study was greater than in many previous validation studies, including the National Heart and Lung Institute Q vs the BMRC-Q, and the International Union Against Tuberculosis and Lung Diseases Q (IUATLD-Q) against the BMRC-Q (31).
The kappa values in our study for identical questions all showed almost-perfect (>0.8) or substantial (0.6-0.8) agreement. Even though these kappa values varied from 0.64 to 0.88, we had anticipated even closer levels of agreement. However, the kappa values in the current survey are similar or better compared with studies where the same Q had been distributed twice to the same subjects a few months apart (2), and also when compared with repeatability of the IUATLD-Q (29) and the comparison of the IUATLD-Q vs the BMRC-Q (31).
In order to illustrate how the way a question is asked influences the relative risk estimates in an epidemiological study, risk-factor analyses were performed using questions from different groups. The calculations of relative risk estimates tended to yield higher odds ratios when using the OLIN-Q. If the prevalence is high, and the kappa and OA are satisfactory for the symptom in question, the risk-factor patterns will be similar. However, if the symptoms have been defined differently and hence have different prevalence outcomes, the risk-factor patterns are more divergent. It is known that self-reports are influenced by wording, format and context (33). The precise wording and tense of a question can thus increase or decrease the probability of a positive response, with more precise questions rendering lower prevalence and questions including words such as 'have you ever' rendering higher prevalence than questions using 'have you now' wording. This phenomenon can be seen in the 'rhinorrhea' and 'nasal blockage' questions, which are more precise in the GA 2 LEN-Q.
Differences may also be explained by factors other than wording and layout of the questions. The two Qs together amounted to 74 questions. Although the subjects were not asked to complete the Qs in a specific order, it can be assumed that a vast majority answered the OLIN-Q first as it was placed first in the booklet that contained the Qs. This could have an influence on answers to questions placed further into the booklet. Furthermore, answering questions about symptoms prior to questions about reactions to environmental conditions might also make the subject more aware of their disease and therefore more prone toward a positive response.

Conclusions
Identical questions yielded close to identical results regarding prevalence and had high OA and kappa values. Both Qs result in similar prevalence, primarily of lower respiratory symptoms. Different wording and different layout had a substantial influence on the estimated prevalence and risk-factor patterns and must be taken into account when comparisons between different studies are performed. When epidemiological methods to quantify the prevalence of asthma and symptoms common in asthma are evaluated, it is important to remember that we lack an exact definition of the disease and cannot be certain which method correctly mirrors the truth. The importance of presenting, or referencing, the exact questions in any Q-based survey, not only in the respiratory field, cannot be emphasized strongly enough.