Reference values of spirometry for Finnish adults

Diagnostic assessment of lung function necessitates up‐to‐date reference values. The aim of this study was to estimate reference values for spirometry for the Finnish population between 18 and 80 years and to compare them with the existing Finnish, European and the recently published global GLI2012 reference values.


Introduction
Spirometry is the most commonly used method for assessment of lung function. Up-to-date reference values, which reflect the target population, are necessary. In an optimal situation, reference values for diagnostic purposes are estimated from randomly selected healthy subjects without any exposure to inhaled substances known to potentially affect lung function. Lung function depends on ethnicity due to differences in body composition and the proportional size of thoracic cavity in relation to height.
In Finland, the presently used reference values were derived in the late 1970s from a selected occupational cohort of 296 males and 257 females with a relatively narrow age range of 18 to 65 years; in the age group from 60 to 65 years, only 9 men and 13 women were included (Viljanen et al., 1982). Thus, these reference values are not appropriate for elderly subjects and may not reflect the current healthy population, given changes in health-related factors in Finland. Since the 1970s, the average height of the Finnish population has increased by almost 5 cm in both men and women, a change reflecting a generally improved level of health and nutrition during childhood and adolescence (Silventoinen et al., 1999). In addition, the reference values of Viljanen (Viljanen et al., 1982) were compiled using a rolling-seal spirometer coupled to an analog x-y-writer. This system has not been used in clinical work for over two decades. The performance of modern flow transducer devices with digital data processing differs from that of the system used by Viljanen in terms of internal resistance, friction, time constants and accuracy.
In 2012, Quanjer et al. (2012) presented new Global Lungs Initiative (GLI2012) reference equations using the LMS method. These all-age reference values provide a statistically more valid description of the evolution of lung function from childhood to adulthood (Cole & Green, 1992; Stanojevic et al., 2008; Cole et al., 2009). Different racial groups were included, but the majority of the included studies were grouped as 'Caucasian' despite a very large geographical variation. No data from Finnish people were included in that material. Predicted values for lung volumes according to the ECSC are up to 10% smaller than the current Finnish reference values (Viljanen et al., 1982; Quanjer et al., 1993). In Australia and Poland, the GLI2012 values have recently been evaluated in hospital tertiary care patients undergoing routine spirometry. The GLI2012 prediction models were found to produce significantly larger lung volumes than the old ECSC reference values (Quanjer et al., 2013).
In healthy subjects from across Australia and New Zealand, the GLI2012 reference values for Caucasians have been found to reflect current values adequately, although some statistically significant differences were found (Hall et al., 2012). The reported z scores of around +0.3 (SD 1.0) reflect slight underestimation of lung volumes with GLI2012. In Tunisia, a North African country which is also categorized into the same Caucasian ethnic group as Finland, the GLI2012 reference values have recently been found to significantly overestimate lung volumes in healthy non-smokers, with z scores in the magnitude of −0.6 (SD 0.9) (Ben Saad et al., 2013).
The aim of this study was to produce for clinical use new reference values of flow-volume spirometry for native Finns from healthy non-smoking adults with a wide age range recorded with modern flow transducers and compiled with the most recent statistical modelling. We also aimed to assess differences between the new reference data, existing Finnish reference values, other European reference values and the new global GLI2012 reference values.

Methods
Study subjects were recruited from four locations (Helsinki, Kuopio, Tampere and Kemi) in Finland representing geographically diverse populations. Thus, the subjects covered people from southern, eastern, middle and northern Finland. All study participants were native Finns. Uniform inclusion and exclusion criteria were applied to include healthy and non-smoking subjects with no morbid obesity, no occupational exposure to vapours, gases, dusts or fumes, and accepting only short previous exposure to tobacco smoke. Approvals of the study protocols were obtained from Helsinki University Central Hospital Coordinating Ethics Committee (for Helsinki), Research Ethics Committee for the Hospital District of Northern Savo (for Kuopio and Tampere) and Länsi-Pohja Central Hospital Ethics Committee (for Kemi). Written informed consent was obtained from all participants.
In Tampere and Kuopio, healthy non-smokers were specifically recruited for this study between 2005 and 2006 using newspaper announcements. To also recruit sufficient numbers of elderly subjects, announcements in Kuopio were targeted at newspapers read more frequently by senior citizens. Subjects were interviewed to determine eligibility for this study using predefined criteria, which were selected to correspond to the interview and anthropometric criteria applied to the population samples. From the FinEsS population study (Helsinki and Kemi), healthy non-smoking subjects were selected based on the FinEsS interview questionnaire. The study protocol for the FinEsS studies has been published previously (Kotaniemi et al., 2005; Pallasaho et al., 2006; Kainu et al., 2008). Of the 643 participants of the FinEsS-Helsinki population study undertaken between 2001 and 2003, 212 subjects were identified as healthy and non-smoking on predefined criteria. In Helsinki, a separate repeatability study was conducted with 21 healthy non-smoking volunteers. In the FinEsS-Kemi population study undertaken between 1996 and 1999, 695 participants yielded 233 healthy non-smoking subjects using an identical screening protocol. The formation of the study sample from the different study centres is shown in Fig. 1.
Subjects were required to have no diagnosed acute or chronic pulmonary disease, cardiac disease or neurological disability, no morbid obesity, no prior chest surgery, radiotherapy or anomalies of the thorax, no systemic diseases known to affect respiratory function (e.g. connective tissue disease or muscular dystrophy), and no prior use of lung medication (with the exception of temporary use of cough medication) or of heart medication potentially affecting respiratory function. Prescribed use of medication was considered an indication of diagnosed disease and was thus included in the predefined exclusion criteria. Reported use of medication for systemic hypertension was the only allowed exception, in otherwise healthy and asymptomatic eligible subjects. Subjects with abdominal surgery during the previous 6 months, pregnancy over 20 weeks of gestation or childbirth <3 months prior to the study visit were excluded from the study sample. A smoking history of <10 pack years was allowed if smoking cessation had taken place over 10 years previously, in order to accommodate older age cohorts, in which prior smoking in adolescence was known to be more prevalent. Height was measured without shoes using a calibrated stadiometer to the nearest full cm, and weight was measured in light clothing in kilograms to one decimal. Age was calculated as the difference between the study date and the subject's date of birth, truncated to full years.
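The age calculation and the smoking-history criterion described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the study protocol; the strict "more than 10 years since cessation" comparison follows the wording of this paragraph and is an assumption about how the boundary case was handled.

```python
from datetime import date


def age_in_full_years(date_of_birth: date, study_date: date) -> int:
    """Age truncated to completed years on the study date."""
    years = study_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the study year.
    if (study_date.month, study_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def smoking_history_allowed(pack_years: float, years_since_quitting: float) -> bool:
    """Never-smokers pass; ex-smokers pass with <10 pack years and
    cessation more than 10 years ago (boundary handling is an assumption)."""
    if pack_years == 0:
        return True
    return pack_years < 10 and years_since_quitting > 10
```

For example, a subject born on 15 June 1960 and examined on 14 June 2005 would enter the calculations as 44 years old, not 45.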
Identical Vmax 22D spirometers (heated wire flow transducer spirometers; Sensor Medics Corporation, Yorba Linda, California, USA) were used in each study centre. Spirometers were calibrated every morning with a certified 3-L syringe for volume and flow; three different flow levels between 2 and 12 l s⁻¹ were used. All measurements were originally taken following the 1994 ATS Standard (American Thoracic Society, 1995). A minimum of three successive acceptable spirograms were recorded. All data included are taken from measurements before inhalation of any bronchodilator. During the study, the updated ATS/ERS standards were issued, which slightly changed, for example, the repeatability criteria (Miller et al., 2005). In the ATS 1994 standard, both forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1) were acceptable if the two best values differed by no more than 200 ml (American Thoracic Society, 1995). The ATS/ERS criteria reduced this allowed variation to 150 ml in both volumes (Miller et al., 2005). The spirometries of all 1127 subjects eligible for the study sample based on questionnaire screening were reviewed for technical acceptability. Subjects not fulfilling the tighter repeatability criteria were excluded from the study sample. The final sample was 1000 subjects, 613 women and 387 men. Table 1 shows descriptive characteristics of anthropometric measurements of the study sample. The age and height distribution of the study participants is shown in Table 2. A total of 149 subjects (14.9%) were classified as ever-smokers, having smoked between 0.1 and 9.5 pack years with an average of 3.6 (SD 3.3) pack years and 28.4 (SD 10.3) years from quitting smoking.
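The repeatability rule described above (the two best values differing by no more than 200 ml under the 1994 ATS standard, or 150 ml under the ATS/ERS criteria) can be sketched as follows. This is an illustrative check only, with hypothetical function names; volumes are given in whole millilitres to avoid floating-point rounding at the threshold.

```python
def is_repeatable(values_ml, max_diff_ml=150):
    """True if the two largest values differ by no more than max_diff_ml."""
    if len(values_ml) < 2:
        return False
    best, second = sorted(values_ml, reverse=True)[:2]
    return best - second <= max_diff_ml


def session_repeatable(fvc_ml, fev1_ml, max_diff_ml=150):
    """Both FVC and FEV1 must meet the repeatability limit."""
    return is_repeatable(fvc_ml, max_diff_ml) and is_repeatable(fev1_ml, max_diff_ml)


# Hypothetical session: FVC is repeatable under the 200 ml limit
# but not under the tighter 150 ml limit.
fvc = [4500, 4320, 4200]
fev1 = [3600, 3550, 3500]
print(session_repeatable(fvc, fev1, max_diff_ml=200))  # True
print(session_repeatable(fvc, fev1, max_diff_ml=150))  # False
```

A session such as the one above would have been acceptable when recorded but excluded on review against the tighter limit.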
Flow-volume spirometry data were extracted electronically and back verified to the printed copies of spirometry reports.
The largest values of FVC, forced expiratory volume in 6 s (FEV6) and FEV1 were taken, and their ratios (FEV1/FVC and FEV1/FEV6) were calculated using these largest values. All other flow values were taken from the curve with largest sum of FEV1 + FVC. Data on FEV6 were only available for measurements taken in Helsinki and Kuopio. The forced expiratory flow-volume variables analysed include FVC, FEV6, FEV1, FEV1/FVC, FEV1/FEV6, maximum mid-expiratory flow (MMEF), maximum expiratory flow at 75%, 50% and 25% of FVC remaining (MEF75, MEF50, MEF25) and peak expiratory flow (PEF).
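The selection rule above (largest FVC and FEV1 across manoeuvres, ratios from those largest values, other flows from the curve with the largest FEV1 + FVC sum) can be sketched as below. The data layout, a list of dictionaries with hypothetical keys, is an assumption made for illustration and does not reflect the spirometer's export format.

```python
def summarize_session(manoeuvres):
    """Summarize acceptable manoeuvres.

    manoeuvres: list of dicts with keys 'fvc' and 'fev1' (litres)
    plus any flow variables such as 'pef' or 'mmef'.
    """
    best_fvc = max(m["fvc"] for m in manoeuvres)
    best_fev1 = max(m["fev1"] for m in manoeuvres)
    # Flow values come from the single curve with the largest FEV1 + FVC.
    best_curve = max(manoeuvres, key=lambda m: m["fev1"] + m["fvc"])

    summary = {
        "fvc": best_fvc,
        "fev1": best_fev1,
        "fev1_fvc": best_fev1 / best_fvc,
    }
    for key, value in best_curve.items():
        if key not in ("fvc", "fev1"):
            summary[key] = value
    return summary
```

Note that the largest FVC and the largest FEV1 may come from different manoeuvres, so the reported ratio need not equal the ratio observed on any single curve.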
A separate repeatability study was conducted in Helsinki with 21 healthy volunteers, 6 men and 15 women. The participants were on average 40.3 (SD 9.9) years old. Spirometry was repeated on two consecutive days at the same time of day (±1 h) and undertaken by the same study nurse. Change in best FVC was on average 0.007 (SD 0.187) litres or 0.1% (SD 3.6%) from baseline. Change in best FEV1 was on average 0.013 (SD 0.098) litres or 0.5% (SD 3.9%) from baseline. FEV1/FVC changed on average by 0.003 (SD 0.017) or 0.4% (SD 2.1%) relative to baseline. Only data from the spirometry completed on the first study day were included in the reference values study.

Statistical methods
All statistical analyses were performed with the R program (version 2.15.1, http://cran.r-project.org) using the gamlss package (Rigby & Stasinopoulos, 2005; R Core Team, 2012). We applied the generalized additive model for location, scale and shape (GAMLSS), which was also used in the modelling of the Global Lungs Initiative (GLI2012) (Quanjer et al., 2012). The original LMS method summarizes the changing distribution by three curves representing the median μ (M), the coefficient of variation σ (S) and the skewness, the latter expressed as a Box-Cox power λ (L) transformation. The LMS method was developed by Cole and originally described by Cole & Green (1992) (Cole et al., 2009). In our study material, the normal distribution (NO()) resulted in smaller prediction errors and a more parsimonious model in terms of the Schwarz Bayesian Criterion (SBC) than the Box-Cox Cole and Green (BCCG) distribution. Adding a moment for skewness (L) did not significantly improve the fit of the model for any of the variables. Age and height were found to be the main determinants of all evaluated lung function parameters. To allow a flexible model for the relationship between age and the different spirometric variables, we used penalized B-splines for age in both the mean and the standard deviation model. With the normal distribution gamlss model (NO()), the parameter that takes the place of the coefficient of variation is the standard deviation (Stasinopoulos et al., 2012). Weight and BMI were assessed separately, but neither provided any additional degree of explanation to the models.

We used the following prediction equations for the mean (M) and standard deviation (S) of the lung function variables:

$$M = \exp\left(a_0 + a_1 \ln(\text{height}) + a_2 \ln(\text{age}) + \text{MSpline}_i\right)$$

$$S = \exp\left(b_0 + b_1 \ln(\text{age}) + \text{SSpline}_i\right)$$

where $a_0$, $a_1$, $a_2$ and $b_0$, $b_1$ are the regression coefficients estimated from the sample. The values for the spline functions $\text{MSpline}_i$ and $\text{SSpline}_i$, for each predicted value, are presented in the lookup table in Annex S1. The predicted mean value ($M_{\text{pred}}$) of each lung function variable can also be expressed as:

$$M_{\text{pred}} = e^{a_0} \cdot \text{height}^{a_1} \cdot \text{age}^{a_2} \cdot e^{\text{MSpline}_i}$$

The lower limits of normal (LLN) (5th percentile) were obtained from the formula:

$$\text{LLN} = M_{\text{pred}} - 1.645 \cdot S$$

The z score for each individual measurement ($z_i$) can be calculated as:

$$z_i = \frac{\text{measured}_i - M_{\text{pred}}}{S}$$

Table columns: Men (n = 387); Women (n = 613). BMI, body mass index; FVC, forced vital capacity; FEVt, forced expiratory volume in t seconds; PEF, peak expiratory flow; MMEF, maximum mid-expiratory flow; MEFx, maximum expiratory flow at x% of FVC remaining.

Differences between study locations were assessed using a dummy variable for study location and assessing differences between predictions from different locations. Similarly, a sensitivity analysis was conducted using dichotomous smoking (never and ever) and obesity (BMI > 30) as dummy variables in a full gamlss model. No statistically significant differences were found in the predicted values in terms of smoking or obesity when sex, age and height of the subjects were taken into consideration and when using a dummy variable for study location. Obese subjects were excluded from final models. Bland-Altman plots for the difference between predicted FVC and FEV1 from this study and those predicted from GLI2012 reference values were plotted as described by Bland and Altman (Bland & Altman, 1986).
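As an illustration of how the prediction equations for the mean (M) and standard deviation (S) translate into a predicted value, a lower limit of normal and a z score, the following Python sketch may be helpful. It assumes the form given in the Table 3 caption, M = exp(a0 + a1·ln(height) + a2·ln(age) + MSpline) and S = exp(b0 + b1·ln(age) + SSpline), and it assumes a normal distribution so that the 5th percentile lies 1.645 standard deviations below the mean. The coefficient values used in the example are hypothetical placeholders, not values from Table 3, and the unit of height is likewise an assumption.

```python
import math

Z_5TH_PERCENTILE = -1.645  # 5th percentile of the standard normal distribution


def predicted_mean(a0, a1, a2, height, age, m_spline=0.0):
    """M = exp(a0 + a1*ln(height) + a2*ln(age) + MSpline)."""
    return math.exp(a0 + a1 * math.log(height) + a2 * math.log(age) + m_spline)


def predicted_sd(b0, b1, age, s_spline=0.0):
    """S = exp(b0 + b1*ln(age) + SSpline)."""
    return math.exp(b0 + b1 * math.log(age) + s_spline)


def lower_limit_of_normal(mean, sd):
    """5th percentile under the normal-distribution assumption."""
    return mean + Z_5TH_PERCENTILE * sd


def z_score(measured, mean, sd):
    """Standardized distance of a measurement from the predicted mean."""
    return (measured - mean) / sd


# Hypothetical coefficients and spline contributions, for illustration only.
m = predicted_mean(a0=-10.0, a1=2.2, a2=0.05, height=177, age=40, m_spline=0.01)
s = predicted_sd(b0=-0.9, b1=0.05, age=40, s_spline=0.0)
print(round(m, 3), round(lower_limit_of_normal(m, s), 3), round(z_score(4.0, m, s), 2))
```

In practice the spline contributions would be read from the lookup table for the subject's age, and the coefficients from the row of Table 3 for the relevant variable and sex.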

Results
The estimated coefficients for the regression equations using the LMS method are listed separately for each lung function variable in Table 3. The mean predicted values and their respective LLN for FEV1, FEV6, FVC and FEV1/FVC and FEV1/FEV6 ratios are plotted in Fig. 2 for men and women using arithmetic mean of height of the sample. Bland-Altman plots for difference between predicted FVC and FEV1 from this data set and the Global Lungs Initiative 2012 (GLI2012) reference values in Fig. 3 show that the absolute difference in predicted values is greater in smaller predicted lung volumes and decreases with increasing predicted lung volume.
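For readers who wish to reproduce a Bland-Altman comparison of two sets of predicted values, a minimal Python sketch follows. It computes the quantities conventionally shown in such a plot (pairwise means, pairwise differences, the mean difference and the 95% limits of agreement as the mean difference ± 1.96 SD). The function name and the example numbers are hypothetical; the plotting step itself is omitted.

```python
from statistics import mean, stdev


def bland_altman(pred_a, pred_b):
    """Return the quantities plotted in a Bland-Altman comparison."""
    if len(pred_a) != len(pred_b):
        raise ValueError("inputs must have the same length")
    differences = [a - b for a, b in zip(pred_a, pred_b)]
    averages = [(a + b) / 2 for a, b in zip(pred_a, pred_b)]
    bias = mean(differences)
    sd = stdev(differences)  # sample standard deviation
    return {
        "averages": averages,        # x-axis values
        "differences": differences,  # y-axis values
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
    }


# Hypothetical predicted FVC values (litres) from two reference models.
result = bland_altman([3.0, 4.0, 5.0], [2.8, 3.9, 5.0])
print(round(result["bias"], 3))
```

A difference that shrinks as the pairwise mean grows, as described for Fig. 3, would appear in such a plot as points converging towards zero on the right-hand side.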
Model predicted mean values from this study compared to Viljanen reference values (Viljanen et al., 1982) for MEF50 and compared to the GLI2012 reference values (Quanjer et al., 2012) for MMEF, with the respective LLN values, are shown in Fig. 4. The difference between measured values and values predicted from this data set was on average 0.064 (SD 0.101) litres for FEV1, −0.014 (0.546) litres for FVC and 0.11% (5.24%) for FEV1/FVC. With GLI2012 reference values, the differences were 0.099 (0.442) litres, 0.205 (0.557) litres and −1.80% (5.34%), respectively. The difference in predicted mean and LLN values between the present study and GLI2012 reference values in Fig. 5 demonstrates that the GLI2012 reference values slightly underestimate both mean FEV1 and FVC in men and women, with an age-dependent increasing trend. However, the GLI2012 reference values give better agreement in lung volumes than the ECSC reference values, which underestimate FVC and FEV1 by around 0.4 l (Table 4). Particularly in women, the ECSC reference values underestimate FVC in this data set on average by 527 ml or 17%. Viljanen reference values on average performed better than the GLI2012 in both FVC and FEV1 in the applicable age range of 18-65 years. Of other comparable reference values, the prediction equations from Switzerland (Brändli et al., 1996) were closest to the observed distribution, whereas values from … (…, 1985, 1986) were very close to the new Finnish values in their age range (Table 4). The mean z score of FVC in men was 0.37 (SD 1.00) when calculated according to the GLI2012 predictions and −0.03 (SD 1.00) according to the predictions of the present study. For FEV1/FVC, the values were −0.23 (0.80) and 0.02 (1.00), respectively.

Table 3. Coefficients for predictive equations. Formula for mean (M) = exp(a0 + a1 × ln(height) + a2 × ln(age) + MSpline) and standard deviation (S) = exp(b0 + b1 × ln(age) + SSpline). For spline contributions to each model, see the respective lookup tables.
The model from the present study is compared with the GLI2012 reference values (Quanjer et al., 2012) for FVC, FEV1 and FEV1/FVC in different age categories in Fig. 6. Although the mean difference between measured values and those predicted by GLI2012 was small, the figure shows a slight trend of increasing underestimation of FEV1 with age when using GLI2012 predictions. Data calculated according to our new model compared to those calculated by the GLI2012 reference model in different height categories are shown in Fig. 7; the underestimation of lung volumes is greater in shorter subjects and decreases with increasing height.

Discussion
The need for locally representative reference values is of utmost importance. The old Finnish reference values by Viljanen (Viljanen et al., 1982) are limited by the relatively narrow age range of study subjects, a potential cohort effect, substantial change in the technical equipment and improvement in statistical methodologies over the past 40 years. In this study, we evaluated 1000 healthy non-smoking native Finnish subjects for lung function and found that the recently published GLI2012 reference values on average provided a reasonably good fit, but underestimated lung volumes, especially FVC, with the degree of underestimation gradually increasing with advancing age.

Figure 4. Predicted mean and lower limit of normal (LLN) of (a and b) maximum expiratory flow at 50% of forced vital capacity remaining (MEF50) from this study and from the Viljanen reference model (Viljanen et al., 1982), and (c and d) maximum mid-expiratory flow (MMEF) from this study and from the GLI2012 model (Quanjer et al., 2012). Graphs a and c present values for men of average height (177 cm), and b and d for women of average height (164 cm).
GLI2012 reference values are clearly more representative of the current target population than the ECSC reference values in lung volumes, but the FEV1/FVC ratio was better predicted using the ECSC because the magnitude of underestimation of FEV1 and FVC was similar in the ECSC predictions. An underestimation of FEV1 by 3% with GLI2012 can be regarded as significant given the role of FEV1 in the interpretation of spirometry in general. FEV1 is the main variable used for clinical decision-making and also for grading of ventilatory defects in spirometry according to the ATS/ERS standard (Pellegrino et al., 2005). Even more importantly, when the two prediction models are compared on actual reference subjects, the GLI2012 reference model shows an age-dependent trend of increasing underestimation of FEV1 and particularly FVC, especially in men. In FEV1/FVC this results in an overestimation that increases with age, which is unfortunate given the important role of the FEV1/FVC reference value in the diagnosis of airflow limitation. The SAPALDIA study is one of the studies included in the GLI models; it showed one of the best concordances, but is also limited to the age range 18-60 years. The number of Nordic subjects was very small in the GLI data set. The observed difference between native Finns and the GLI2012 reference values poses several questions and avenues for further study. The GLI2012 model provided better agreement in younger and taller adults, but predicted smaller lung volumes with an age-dependent trend. The older adults, women in particular, were significantly shorter, and their height also varied less within the age category; that is, the older subjects were more homogeneous with respect to size.
The proportion of male subjects in the study was relatively small (39%). This was partially due to relatively more frequent disqualification due to reported smoking exposure and respiratory symptoms. Also, younger men were less likely to participate in the study (Kotaniemi et al., 2001). The absolute numbers and age distribution of the male subjects were judged to be adequate for the current analysis. In addition, to exclude the potential selection bias introduced by selection criteria, each spirometry variable was separately assessed for differences between study locations using a dummy variable for study location in the GAMLSS model. No statistically significant differences were found when gender, age and height of the subjects were taken into consideration. As the study centre in Kuopio recruited more older subjects than the other centres, all subjects over 60 years of age were excluded in a further effort to rule out selection bias affecting the observed age-related increase in model discordance with the GLI2012 prediction model. The sampling differences between the study centres did not seem to significantly affect the prediction models.

Figure 6. Comparison of z scores of (a) forced vital capacity (FVC), (b) forced expiratory volume in 1 s (FEV1) and (c) their ratio FEV1/FVC predicted from the present study and the GLI2012 (Quanjer et al., 2012) reference models in different age categories for men and women in the reference data set.
To get more elderly subjects for the study, the newspaper announcements to recruit volunteers were targeted in Kuopio to papers frequently read by senior citizens. In the older age cohorts, it was recognized that short smoking exposure during, for example, war years by older men could make the recruitment of healthy older subjects unnecessarily difficult. Thus, it was decided a priori that <10 pack years of smoking was allowed, if smoking cessation took place at least 10 years previously. The smoking criteria were implemented stringently, thus also excluding a larger number of young men that had more often casual or short-term smoking, but with smoking cessation invariably <10 years previously.
The effect of including some ever-smokers in the model was assessed in a separate sensitivity analysis, which showed that the results obtained from ever-smokers (n = 149) in this study did not differ from those of never-smokers, with the exception of ever-smokers being significantly older, given the required 10-year interval from smoking cessation. The majority of ever-smokers had smoked in their teenage years. The mean age of quitting smoking was 19 years, with an average time of 28.4 years from quitting smoking. This was the result of predefined criteria and not actually anticipated, but the exclusion of these subjects would have resulted in an additional selection bias. Careful sensitivity analyses were conducted to rule out selection bias on grounds of smoking exposure, obesity and the different selection procedures in the four locations. The selection criteria from the population studies were predefined, and the subjects from Kuopio and Tampere were recruited using similar criteria. During the study, new simplified criteria were suggested in the literature for selection of subjects for lung function reference values (Johannessen et al., 2007). As our criteria had been predefined for subject selection in the targeted recruitment, these new criteria could not be applied in this study. However, we find these criteria fairly similar.
Technical differences between the rolling-seal spirometer used by Viljanen and the pneumotachograph sensors used in our measurements, together with the small number of subjects over 60 years of age in the Viljanen study, offer an explanation for the observed overestimation of MEF50 in the old Finnish reference values (Viljanen et al., 1982). For MMEF, the new GLI2012 values were very similar to our new values despite the larger inherent variability of the instantaneous flow values. The instantaneous flow values provide additional information to complement the main indexes and, if used, should always be derived from the same data set and manoeuvres as the main variables.
GLI2012 reference values present one set of reference values for all Caucasians, broadly defined as all people of European descent. Significant genetic differences exist between European populations, and twin studies have shown additive genetic effects of 61% for FEV1 (Jakkula et al., 2008; Ingebrigtsen et al., 2011). Comparative studies on healthy subjects have not been undertaken, but it has been hypothesized that the northern populations would have slightly larger lung volumes. In Tunisia, which is categorized into the same 'Caucasian' ethnic group as Finland, the GLI2012 values have been shown to overestimate lung volumes, with mean z scores of −0.62 (SD 0.86) for FVC and −0.55 (SD 0.87) for FEV1 (Ben Saad et al., 2013). In our study, the same GLI2012 prediction model underestimated FVC with a mean z score of +0.37 (SD 1.00) and FEV1 with a mean z score of +0.25 (SD 1.04). The difference from the GLI2012 prediction was largest in the older age groups and in subjects with short stature. The new Finnish reference values present the current level of lung function for Finnish adults 18-80 years of age.

Conclusions
The present new reference values offer a closer prediction of spirometric values for native Finns than any other published predictions, including the GLI2012 reference values, which underestimated lung volumes with an age-dependent trend. Therefore, we recommend these new Finnish reference values of flow-volume spirometry for clinical use among native Finns.