Introduction

Chronic obstructive pulmonary disease (COPD) is a preventable and treatable disease characterised by persistent airflow limitation that is usually progressive and associated with an enhanced chronic inflammatory response in the airways and lungs to noxious particles or gases.1 In developed countries cigarette smoking is the main risk factor. COPD is predominantly diagnosed in adults aged well over 40 years.2 It is the fourth leading cause of death, and recent estimates from a global prevalence study indicate that the worldwide population prevalence of COPD ranges from 11% to 26%.3

In most current COPD guidelines the diagnosis of airflow obstruction is based on a fixed 0.70 cut-off point for the postbronchodilator forced expiratory volume in one second (FEV1) over forced vital capacity (FVC) ratio (FEV1/FVC).1,4,5 Once airflow obstruction has been established, its severity is graded by comparing the measured FEV1 with FEV1 values from a reference population, usually expressed as a percentage of predicted FEV1 (FEV1% predicted). With regard to obstruction, the guidelines recommend a classification of severity into four stages ranging from stage 1 (mild obstruction) to stage 4 (very severe obstruction).1,4,5 The degree of obstruction is an important marker to characterise disease severity, together with other markers such as exacerbation frequency and level of dyspnoea.1

In order to appropriately categorise patients according to the severity of their obstruction, FEV1 reference values are calculated with equations derived from measurements in a representative sample of healthy subjects from the general population.6 Many different reference equations for calculating predicted FEV1 values are used worldwide. Differences in the evaluation of obstruction and its severity when using different sets of reference equations are well documented.7–12 These studies show that using ‘outdated’ reference equations may lead to classification of subjects into more severe disease stages than appropriate. Although experts recommend updating reference equations for lung function parameters regularly,6,13 this advice is not always followed and outdated reference values are widely used in primary and secondary care.

General practitioners (GPs) play a key role in diagnosing, assessing severity, and managing COPD because they see patients during the earlier stages of their disease and because of the continuity of care they offer throughout their patients' lives. The degree of airflow limitation contributes to the classification of disease severity and thus codirects GPs' choice of pharmacological (e.g. short- or long-acting bronchodilator) and non-pharmacological (e.g. pulmonary rehabilitation) treatment options in patients with COPD as recommended in guidelines.1,4,5 Using the most appropriate set of reference equations for FEV1 could diminish misclassification of the severity of airflow obstruction and, consequently, could influence the individual, economic, and societal burden of COPD by giving treatment appropriate to the correct disease severity stage.

To the best of our knowledge, no previous studies have looked at the use of different reference equations to assess severity of obstruction in patients with COPD in primary care settings. The aim of this study was to investigate how switching to more contemporary reference equations would affect the interpretation of spirometry test results when staging the severity of airflow obstruction in the Dutch primary care COPD patient population.

Methods

Study setting and cohort

The study was based on all available spirometry tests from October 2001 to March 2010 from three regional primary care diagnostic centres in the Netherlands (the General Practice Laboratory Foundation Etten-Leur/Breda (SHL-Group), the Diagnostic Centre Eindhoven (D4U), and the General Practice Laboratory East (SHO)). These diagnostic centres offer a range of diagnostic tests (including spirometry) and other healthcare services to hundreds of GPs in the south-western and south-eastern parts of the country. If a suspicion of COPD exists for a particular patient, the GP can refer the patient to the diagnostic centre for spirometry testing. Approximately half of all tests are done for diagnostic purposes; the other half are carried out as part of the regular monitoring of patients with COPD or asthma. Spirometry test results are combined with demographic (gender, age), anthropometric (height, weight), and medical history information (respiratory symptoms, self-reported smoking status and history, medication). All data are recorded using a standardised electronic format and are sent to a respiratory consultant whose assessment of the spirometry test, the diagnostic interpretation of all data and — if applicable — diagnostic advice are sent to the GP together with the actual test results.

Only certified lung function technicians perform the spirometry tests. Each of them performs a minimum of 200 spirometric tests annually and they are regularly supervised in central meetings. Personal computer-based digital volume sensor spirometers (SpiroPerfect®; WelchAllyn, Delft, The Netherlands) are used at all locations and these spirometers satisfy American Thoracic Society (ATS) standards.14 The lung function technicians always use the same spirometer, and all follow a standard operating procedure for calibration of the spirometer on a daily basis. Within-test volume deviations of <3% are considered acceptable. Air temperature and ambient pressure are measured and entered into the spirometric software in order to correct measured volumes to body temperature and pressure, saturated (BTPS) conditions.14 Patient instruction, assessment of acceptability of forced expiratory manoeuvres, and criteria for test reproducibility are based on ATS recommendations.14 Pre- and postbronchodilator measurements are performed with subjects seated at rest before and 15 minutes after administration of four doses of 100 μg aerosolised salbutamol by Volumatic® spacer (GlaxoSmithKline, Brentford, UK).

Since only routine lung function and respiratory medical history data were used for our analyses and the investigators had no access to the patients' medical records or information on patients' identity, written informed consent was not required.

Subject selection and definition of airflow obstruction

From the initial sample (n=14,056 respiratory symptomatic subjects referred for spirometry by GPs),15 we excluded all subjects aged <40 years (Step 1, n=10,937 remaining). In order to select patients with airflow obstruction that is compatible with COPD, we used the 2013 Global Initiative for Chronic Obstructive Lung Disease (GOLD) guideline criteria,1 which are identical to the criteria currently used in the COPD guidelines for Dutch GPs.5 Therefore, postbronchodilator FEV1/FVC values <0.70 were used to determine whether airflow obstruction was present in the study subjects, and non-obstructed subjects were excluded (Step 2, n=3,370 remaining). We only used the first test available for each subject in the database. In order to subdivide subjects into severity groups, we calculated FEV1% predicted values and classified patients according to the GOLD guidelines: mild (stage 1: FEV1 ≥80% predicted), moderate (stage 2: 50% ≤ FEV1 <80% predicted), severe (stage 3: 30% ≤ FEV1 <50% predicted), and very severe airflow obstruction (stage 4: FEV1 <30% predicted).1,4,5
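The selection and staging rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the function name `gold_stage` and its parameters are hypothetical, and it assumes that a value lying exactly on a stage boundary is assigned to the milder stage.

```python
from typing import Optional


def gold_stage(fev1_fvc_ratio: float, fev1_pct_predicted: float) -> Optional[int]:
    """Return the GOLD airflow obstruction stage (1-4), or None if not obstructed.

    fev1_fvc_ratio: postbronchodilator FEV1/FVC as a fraction (e.g. 0.65).
    fev1_pct_predicted: postbronchodilator FEV1 as a percentage of predicted.
    Assumption: boundary values (80, 50, 30) fall into the milder stage.
    """
    if fev1_fvc_ratio >= 0.70:
        return None  # no airflow obstruction under the fixed 0.70 cut-off
    if fev1_pct_predicted >= 80:
        return 1  # mild
    if fev1_pct_predicted >= 50:
        return 2  # moderate
    if fev1_pct_predicted >= 30:
        return 3  # severe
    return 4  # very severe
```

For example, `gold_stage(0.65, 62.0)` yields stage 2, whereas `gold_stage(0.72, 62.0)` yields `None` because the ratio is not below the 0.70 cut-off.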

Reference equation selection for predicted FEV1 values

Three different sets of FEV1 reference equations were used: (1) those that are commonly used in Europe and currently recommended in the COPD guidelines for Dutch GPs5 (i.e. the European Community of Steel and Coal (ECSC) reference equations16); (2) reference equations published by Swanney et al.17 (which were derived from a Dutch general population cohort); and (3) the recently published Global Lung Initiative (GLI) reference equations (which cover ages from childhood to the elderly and take ethnicity into account).18 Table 1 shows the sets of reference equations selected to calculate FEV1 predicted values in this study. The European Respiratory Society (ERS) does not recommend a particular set of reference equations but, in the Netherlands, the ECSC equations are commonly used for interpreting spirometry results.16 The current COPD guideline for Dutch GPs5 recommends a correction factor of 1.08 to adjust the ECSC equations for secular trend,19 which we applied in our study (see Table 1).

Table 1 FEV1 reference equations applied in the study

Data analysis

All analyses were performed using SPSS statistical software Version 20. Postbronchodilator FEV1 values, expressed in litres, were used to determine FEV1% predicted ((FEV1 measured/FEV1 predicted) × 100). The number of subjects in each severity stage (according to GOLD) was calculated for all three sets of equations. In all analyses we considered the 1.08 corrected ECSC equations (‘ECSC-corrected’) as the main reference equations to which the more contemporary equations (Swanney, GLI) were compared. Contingency tables were made to compare the number of subjects in each of the four severity stages for ECSC-corrected versus Swanney and ECSC-corrected versus GLI. Bland–Altman plots were created to plot the difference between FEV1 predicted values against the mean FEV1% predicted for ECSC-corrected and Swanney and for ECSC-corrected and GLI equations, respectively. Agreement on the severity of obstruction between different reference equations was assessed using Cohen's kappa (κ) statistic. κ>0.7 was considered moderate to strong agreement.
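The calculations in this section are simple enough to illustrate with a short Python sketch. It is not the SPSS procedure used in the study. The function names are hypothetical, the 1.08 factor is treated as a multiplier applied to the predicted value, and the kappa shown is the unweighted form; the text does not state whether a weighted kappa was used, so that choice is an assumption.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def fev1_percent_predicted(measured_l: float, predicted_l: float,
                           correction: float = 1.0) -> float:
    """FEV1% predicted = measured / (predicted * correction) * 100."""
    return measured_l / (predicted_l * correction) * 100.0


def contingency_table(stages_a: Sequence[int],
                      stages_b: Sequence[int]) -> Dict[Tuple[int, int], int]:
    """Count subjects for every (stage under A, stage under B) pair."""
    return dict(Counter(zip(stages_a, stages_b)))


def cohen_kappa(stages_a: Sequence[int], stages_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for two stagings of the same subjects."""
    if len(stages_a) != len(stages_b) or len(stages_a) == 0:
        raise ValueError("both stagings must cover the same, non-empty sample")
    n = len(stages_a)
    # Observed agreement: share of subjects given the same stage twice.
    observed = sum(a == b for a, b in zip(stages_a, stages_b)) / n
    # Chance agreement: product of the marginal proportions per stage.
    count_a, count_b = Counter(stages_a), Counter(stages_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one stage occurs
    return (observed - expected) / (1.0 - expected)
```

With a hypothetical measured FEV1 of 2.0 L and an uncorrected predicted value of 4.0 L, `fev1_percent_predicted(2.0, 4.0)` gives 50%, whereas applying `correction=1.08` raises the predicted value and lowers the result to about 46%.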

Results

Study population

After applying our selection criteria, the final study population consisted of 3,370 subjects aged ≥40 years who had been referred for spirometry by their GP and were found to have airflow obstruction based on postbronchodilator FEV1/FVC <0.70. Table 2 shows the characteristics of the study population. The sample consisted of 38.5% (n=1,297) females and 61.5% (n=2,073) males. The overall mean (SD) postbronchodilator FEV1 values were 1.91 (0.71) L, 2.13 (0.72) L, and 1.54 (0.51) L for the whole study population, males, and females, respectively.

Table 2 Characteristics of the study population

Severity staging according to the different reference equations

Table 3 shows the average measured postbronchodilator FEV1 and the average calculated FEV1 predicted values according to the different sets of reference equations. The mean FEV1 predicted values derived from the Swanney equations showed the lowest values for both females and males, followed by the GLI equations. Compared with males, females showed less difference between ECSC-corrected and Swanney equations and between ECSC-corrected and GLI equations (2.6% and 0.2%, respectively).

Table 3 Mean (SD) FEV1 predicted values according to the respective reference equations and the difference between the currently recommended equation (ECSC (corrected)) and Swanney and GLI equations

The Bland–Altman plots (Figure 1) show that the larger discordance between ECSC-corrected and Swanney equations compared with ECSC-corrected and GLI equations is determined by age and gender.

Figure 1 Bland–Altman plots showing the difference between forced expiratory volume in one second (FEV1) predicted (in L) against mean FEV1% predicted by European Community of Steel and Coal (ECSC)-corrected and Swanney reference equations (blue) and ECSC-corrected and Global Lung Initiative (GLI) reference equations (green)
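As a rough illustration of the quantities behind a Bland–Altman comparison, the following Python sketch computes per-subject differences and means for two sets of predicted values. It is a generic sketch, not the procedure used to produce Figure 1: it uses the conventional form with the same quantity on both axes (whereas the figure plots the difference in predicted FEV1 against mean FEV1% predicted), and the bias and 95% limits of agreement are standard additions that are not reported in the text. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def bland_altman(values_a: Sequence[float], values_b: Sequence[float]
                 ) -> Tuple[List[float], List[float], float, Tuple[float, float]]:
    """Return (means, differences, bias, limits of agreement) for paired values."""
    if len(values_a) != len(values_b) or len(values_a) < 2:
        raise ValueError("need at least two paired values")
    # y-axis of the plot: difference between the two predictions per subject.
    diffs = [a - b for a, b in zip(values_a, values_b)]
    # x-axis of the plot: mean of the two predictions per subject.
    means = [(a + b) / 2.0 for a, b in zip(values_a, values_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation
    return means, diffs, bias, (bias - spread, bias + spread)


# Hypothetical predicted FEV1 values (L) from two reference equations.
predicted_a = [3.0, 3.2, 2.8, 3.5]
predicted_b = [2.8, 3.1, 2.7, 3.1]
means, diffs, bias, limits = bland_altman(predicted_a, predicted_b)
```

A positive `bias` here would mean that the first set of equations predicts higher FEV1 values on average than the second.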

In the case of the Swanney equations, the divergence from the ECSC-corrected equations was higher in males than in females (overall 17.2% in males, 8.9% in females), which became more obvious as age increased. In female subjects the two equations only started to diverge significantly from each other at higher ages.

Table 4 shows the proportion of subjects categorised into the four severity stages when the three sets of reference equations are used. According to the current set of equations (ECSC-corrected), the majority (62.3%) of our study subjects were in GOLD stage 2 and only 3.1% were in GOLD stage 4, as might be expected in a primary care population.

Table 4 Number of subjects classified into airflow obstruction severity stages when using different sets of reference equations

Table 5 shows how the use of different reference equations changes the obstruction severity staging in the study population. Switching is more prominent when comparing the Swanney equations with the current ECSC-corrected equations. Overall, the use of the Swanney equations reclassified 14.0% of the study population, all of them into less severe stages. The Swanney equations reclassified 23.8% from the very severe to severe stage (from GOLD stage 4 to stage 3) and 23.3% of subjects from the severe to the moderate stage (stage 3 to stage 2). Fewer subjects (12.8%) shifted from moderate to mild (stage 2 to stage 1). The agreement between ECSC-corrected and Swanney according to Cohen's kappa (κ) was 0.75 (95% CI 0.54 to 0.97).

Table 5 Number of subjects in each stage and switches between stages of airflow obstruction severity according to (a) ECSC-corrected and Swanney and (b) ECSC-corrected and GLI reference equations

Comparison between ECSC-corrected and GLI equations showed less reclassification. GLI reference equations reclassified 6.3% of all study subjects, predominantly into less severe stages, but there were 31 subjects (0.9% of the total study population) who shifted to more severe stages. Seven subjects (6.7%) were reclassified from very severe to severe obstruction (from stage 4 to stage 3), 8.5% from severe to moderate obstruction (stage 3 to stage 2), and 5.1% from moderate to mild obstruction (stage 2 to stage 1). Agreement between ECSC-corrected and GLI was κ=0.89 (95% CI 0.87 to 0.90).

Discussion

Main findings

This study was a cross-sectional analysis of differences in staging severity of airflow obstruction when using different sets of reference equations in patients with obstruction according to current COPD guidelines. We aimed to establish the consequences of switching to more contemporary reference equations than those currently recommended in Dutch primary care (i.e. ECSC with a 1.08 correction factor) when interpreting spirometric results to stage COPD severity.

In order to determine FEV1% predicted we selected three equations: corrected ECSC as the current ‘standard’ in Dutch primary care, and Swanney and GLI as its potential ‘successors’ (at least in forthcoming Dutch guideline revisions). The original ECSC reference equations were derived from data from different study populations and several datasets in the years 1954–80.20 New spirometric equipment and changes in lung function due to secular trend over consecutive generations are among the reasons why the appropriateness of the original ECSC equations for interpreting spirometry results in the present time has been questioned. Because anthropometric characteristics and environmental factors change over time, it is recommended to update reference values for lung function regularly.6,13 Previous studies have shown that the ECSC equations underestimate FVC and FEV1.10,21–23 Because the current (2007) Dutch general practice guidelines for COPD recommend a 1.08 correction of the ECSC reference equations for FEV1,5 we decided to use these corrected equations instead of the original ECSC equations as the ‘standard’ equations with which to compare the other two equations. However, in reality, this correction will probably not be applied in all practices and patients. The equations published by Swanney et al. were derived from a Dutch general population cohort and, because of this, we selected them as potentially suitable replacements for the ECSC equations. However, the data from which these equations were derived were obtained in field surveys conducted at three-year intervals between 1965 and 1990,17 so they are also rather outdated. The most recently published reference equations, the GLI equations, are the first globally applicable equations; they were derived from a large population consisting of 31,856 males and 42,331 females with an overall age range of 2.5–95 years for ethnic and geographic groups from 26 countries using state-of-the-art statistical modelling.18

In our study, when the Swanney equations were used, all reclassified subjects (14%) were categorised into less severe stages. With the GLI equations, most reclassified subjects also moved into less severe stages, but a small number (n=31, 0.9%) shifted to a more severe stage. Among these subjects were more females but, because the overall number is so small, we do not believe the gender difference is clinically meaningful when considering the total patient population. We observed a higher frequency of ‘understaging’ when comparing Swanney with ECSC-corrected equations than when comparing GLI with ECSC-corrected equations. Among the factors determining discordance, age seemed to be the most important, which was particularly evident in the ECSC-corrected versus Swanney comparison. Both ECSC-corrected and Swanney reference equations have been extrapolated to older subjects. Differences in the regression analysis techniques used to derive these two equations are likely to contribute to some of the deviations in the elderly that are seen in our study. The GLI equations, on the other hand, have been derived from a population with a wide age range, and the deviations between ECSC-corrected and GLI equations do not seem to change significantly with increasing age. Although the combined populations from which the GLI equations were derived included extensive data from Caucasians aged between 3 and 75 years, the authors report in their paper the need to obtain more data for older subjects (i.e. those aged >75 years).18 Gender was also found to be a source of discordance. The reasons behind this probably also lie in the way the reference equations were derived.

The limited shift in severity classification when the GLI equations were used instead of the ‘outdated’ ECSC equations is partly due to the 1.08 correction factor on the ECSC prediction equations for FEV1 that is recommended in the Dutch GP guidelines,19 which we also applied in our study.

Strengths and limitations of this study

Because our data are from primary care diagnostic centres, the subjects included in our analysis are a representative sample of the primary care population who present with respiratory symptoms to a GP. In contrast to the general population studies on this issue, our study population has a much higher pre-test probability of a chronic lung disease and is more representative of subjects who are usually referred for spirometry testing. To our knowledge, no previous study has been reported about the differences in staging in a primary care setting with such a large study population. On the other hand, because the primary care diagnostic centres do not have access to the patients' medical records in the general practices, no formal diagnostic labels of COPD could be established. Although respiratory consultants assess all spirometry results and respiratory medical history, this is one of the limitations of our study.

Another point that could be considered a limitation by some is the fact that we used the fixed 0.70 FEV1/FVC cut-off point to select subjects from the databases of the primary care diagnostic centres and thus did not take age and sex into account when defining airflow obstruction in our study subjects. Although it is likely that guideline recommendations on how to define obstruction when diagnosing COPD will change in years to come — for instance, by shifting to a lower limit of normal approach for FEV1/FVC and/or FEV1 — we decided to take the current guideline5 as the point of departure for this study. Had we applied the lower limit of normal approach instead of the fixed FEV1/FVC cut-off in this study, our study population would have been reduced to approximately 74% of its current size.15 However, we chose to use the fixed 0.70 cut-off point that is currently recommended for our study because we wanted to estimate shifts in GOLD stages as they would occur in present-day practice.

Our study was based on postbronchodilator values, which is in concordance with the guideline recommendations.1,4,5 As reported in previous population studies, the use of prebronchodilator instead of postbronchodilator spirometry leads to overestimation of airflow obstruction15,24 and to FEV1% predicted values that are too low.25

The GOLD report published in 2013 recommends a new system of classification in which severity stage and choice of treatment options are no longer based solely on the degree of airflow limitation.1 The severity staging now also incorporates measurement of respiratory symptoms and exacerbation history. In our current study the GOLD severity stages according to which we categorised our study population were limited to the degree of airflow obstruction only and did not include the severity of the other aforementioned markers.

Interpretation of findings in relation to previously published work

Our findings support previous studies which have shown that continued use of ‘outdated’ reference equations for FEV1 leads to a higher incidence of airflow obstruction diagnoses and ‘overstaging’ of COPD severity.7–12 In our study we only focused on differences in staging when applying more contemporary equations (i.e. Swanney and GLI) in comparison with ECSC-corrected equations. The newer reference equations mostly predicted lower FEV1 values, resulting in higher FEV1% predicted values and in reclassification of 6.3% (for GLI) to 14.0% (for Swanney) of subjects into less severe stages.
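The mechanism can be made concrete with a small Python sketch: when the predicted FEV1 drops, there is a band of measured FEV1 values that lie below a stage boundary under the old prediction but at or above it under the new one. The function name and the numbers are hypothetical and chosen only for illustration; they are not taken from the study data.

```python
from typing import Tuple


def reclassification_window(predicted_old_l: float, predicted_new_l: float,
                            boundary_pct: float) -> Tuple[float, float]:
    """Measured FEV1 range [lower, upper) in litres that crosses a boundary.

    A measured value m in this range satisfies
      m / predicted_old_l * 100 <  boundary_pct   (more severe stage before)
      m / predicted_new_l * 100 >= boundary_pct   (milder stage after)
    """
    if predicted_new_l >= predicted_old_l:
        raise ValueError("this sketch assumes the new predicted value is lower")
    lower = boundary_pct / 100.0 * predicted_new_l
    upper = boundary_pct / 100.0 * predicted_old_l
    return lower, upper


# Hypothetical patient: predicted FEV1 of 3.10 L (old) versus 2.95 L (new).
lower, upper = reclassification_window(3.10, 2.95, boundary_pct=50.0)
```

For this hypothetical patient, a measured FEV1 of 1.50 L corresponds to about 48.4% predicted under the old value and about 50.8% under the new value, so the subject would move from the severe to the moderate stage.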

Implications for future research, policy and practice

Spirometry is often the only diagnostic procedure available to assess lung function in primary care. The interpretation of spirometry results affects GPs' decisions on the diagnosis and staging of COPD and, consequently, may affect patients' treatment. Using reference equations that are outdated or not representative of the population can misinform decision-making by GPs, especially in patients whose FEV1% predicted is close to the upper or lower boundary of their current severity stage. As shown in several previous studies, treatment costs increase with obstruction severity stage.26–28

It was not the objective of this study to address potential clinical implications of misclassification due to different reference equations. In the case of spirometric reference values, there is no particular set of equations that can formally be considered the ‘gold standard’ when defining normal lung function, although the new GLI equations are considered as such by leading respiratory experts.29 Relating different sets of suitable equations to clinically relevant outcomes (e.g. survival, exacerbation rate) would be required to establish which equations best predict patients' prognosis and thus the clinical implications (if any) of switching between equations. This is a challenging topic for further research.

The accuracy of the COPD severity stage may have an impact on the management of the disease as recommended in guidelines. If, for instance, after switching to the GLI reference equations a patient is no longer considered to have severe obstruction but to have moderate obstruction instead (as 8.5% of all patients with severe obstruction in our sample did), does this change the GP's idea about the necessity of referring this patient for pulmonary rehabilitation or not?1 A similar choice exists for bronchodilator treatment (long-acting or short-acting) in patients who switch from moderate to mild obstruction (5.1%).1 It is therefore important to have proper reference equations that allow appropriate interpretation of spirometric results, apposite severity staging, and suitable treatment.

Conclusions

This study shows that, compared with the (corrected) ECSC equations that are currently recommended for use in Dutch primary care, switching to more contemporary reference equations would result in lower FEV1 predicted values. As a result, this could affect GPs' interpretation of spirometry results by reclassifying approximately 6% (for GLI) or about 14% (for Swanney) of the COPD patient population into different (mostly milder) severity stages. If and how this will affect GPs' treatment choices and clinical outcomes in individual patients with COPD requires further investigation.