Introduction

Chronic obstructive pulmonary disease (COPD) is a prevalent chronic respiratory condition that is usually progressive and associated with an enhanced chronic inflammatory response in the airways and lungs to inhaled particles or gases.1 Current international clinical COPD guidelines state that spirometry is required to make the diagnosis.1–4 These guidelines state that a post-bronchodilator ratio of forced expiratory volume in 1 s (FEV1) to forced vital capacity (FVC) below 0.70 confirms airway obstruction in subjects who are suspected of having COPD because of cumulative exposure to risk factors (e.g., tobacco smoke, occupational or indoor air pollution) and respiratory symptoms that are compatible with COPD.1–4 There is increasing evidence that an age- and sex-specific lower limit of normal (LLN) value for FEV1/FVC would be a more appropriate criterion for defining airway obstruction.5,6

In routine clinical practice, a COPD diagnosis is often based on a single initial spirometry test, and none of the above COPD guidelines recommend repeat spirometry testing when making the diagnosis of COPD, or indicate that a different result may be found if spirometry is repeated.1–4 Interestingly, although long-term variability has been reported for FEV1 and FVC in healthy subjects7–9 and for patients with COPD,10,11 and their year-to-year variability is reported in ERS/ATS lung function guidelines to be ±15%,12 similar evidence for variability of FEV1/FVC is lacking.

The aim of this study was, therefore, in a sample of subjects at risk for COPD, to investigate the long-term stability of a diagnosis on the basis of a once-only measurement of post-bronchodilator FEV1/FVC ratio in primary care. We examined shifts in diagnostic category (i.e., shifts between ‘obstructed’ and ‘non-obstructed’ and vice versa) after 1 year and 2 years. We used routine spirometry data from subjects who had entered a respiratory health diagnostic and annual monitoring service offered by primary care diagnostic centres in the Netherlands.

Results

Study subjects

Of 22,187 subjects in the databases, 2,352 fulfilled the inclusion criteria (Figure 1). At baseline, airway obstruction was identified in 758 (32.2%) subjects by the LLN definition and in 1,097 (46.6%) subjects by the fixed FEV1/FVC definition. Table 1 shows baseline characteristics for obstructed and non-obstructed subjects according to the two definitions.

Figure 1

Selection of study subjects from the initial primary care diagnostic centres’ spirometry databases.

Table 1 Baseline (T0) characteristics of the study sample (n=2,352) for the two definitions of airway obstruction

The average time between the initial diagnostic spirometry and the first follow-up measurement was 1.26 (s.d. 0.56) years, and 1.13 (s.d. 0.46) years between the first and second follow-up measurement.

Agreement of obstruction status and FEV1/FVC values

Figure 2 shows the differences in post-BD FEV1/FVC between T0 and T1, plotted against the baseline (T0) value. The coefficient of repeatability of the within-subject difference between two consecutive FEV1/FVC measurements was 0.163 for the T0–T1 time interval, 0.157 for the T1–T2 time interval and 0.176 for the T0–T2 time interval.

Figure 2

Difference between FEV1/FVC values measured at baseline (T0) and after 1 year (T1), plotted against T0. The coefficient of repeatability of the difference between the FEV1/FVC measurements at T0 and T1 was 0.115. BD, bronchodilator; FEV1, forced expiratory volume in 1 s; FVC, forced vital capacity; LLN, lower limit of normal.

Figure 3 shows the shifts between diagnostic categories by LLN (Figure 3a) and fixed ratio criterion (Figure 3b). According to the LLN criterion, 77.8% of subjects categorised as obstructed at baseline were still categorised as having airway obstruction after 1 year, and after 2 years only 67.8% had obstruction (Figure 3a). Figure 3a also shows mean (s.d.) changes in FEV1, FVC and FEV1/FVC between T0 and T1 and between T1 and T2 in the respective categories. Of the subjects without baseline airway obstruction (n=1,594), 90.1% remained unobstructed after 1 year and 85.1% after 2 years. Agreement between obstruction/non-obstruction status was ‘substantial’ when comparing T0 with T1 (Kappa=0.691, 95% confidence interval (95% CI)=0.660–0.722) and T0 with T2 (Kappa=0.671, 95% CI=0.640–0.702) classifications.

Figure 3

Change in obstruction status between baseline, year 1 and year 2 in respiratory symptomatic smokers and ex-smokers aged 40+ years. (a) Based on post-bronchodilator FEV1/FVC < or ≥ LLN. *Denominator for all proportions in the downstream cells. Indicates 12±2 months after previous test. BD, bronchodilator; FEV1, forced expiratory volume in 1 s (litres); FVC, forced vital capacity (litres); LLN, lower limit of normal. ∆FEV1, ∆FVC and ∆FEV1/FVC calculated as T1 minus T0, and T2 minus T1, respectively, and reported as mean (s.d.). (b) Based on post-bronchodilator FEV1/FVC < or ≥ 0.70.

According to the fixed FEV1/FVC definition, 83.0% of initially obstructed subjects were still categorised as having airway obstruction after 1 year, and after 2 years only 76.2% had obstruction (Figure 3b). Of the subjects without baseline airway obstruction (n=1,255), 87.5% remained unobstructed after 1 year and 81.3% after 2 years. Again, agreement between obstruction/non-obstruction status was ‘substantial’ when comparing T0 with T1 (Kappa=0.707, 95% CI=0.678–0.736) and T0 with T2 (Kappa=0.695, 95% CI=0.666–0.724) classifications.

Figure 4 shows that diagnostic shifts were observed across the full range of baseline FEV1/FVC values. Numbers (%) of patients with borderline results in terms of the fixed FEV1/FVC definition (e.g., FEV1/FVC between 0.68 and 0.72) were 337 (14.3%) at T0, 319 (13.6%) at T1 and 321 (13.6%) at T2. Of the 315 patients who shifted category in either direction between T0 and T1, only 65 (20.6%) were in the 0.68–0.72 range at T0.

Figure 4

Probability of being non-obstructed after 1 year (T1) in relation to a subject’s post-BD FEV1/FVC at baseline T0. The graph shows moving averages based on two consecutive data points (i.e., values for the probabilities in the actual FEV1/FVC bin and the next bin) to ‘smooth’ the curve. BD, bronchodilator; FEV1, forced expiratory volume in 1 s; FVC, forced vital capacity.

Factors associated with shifting between diagnostic categories

Several factors were independently associated with shifting from being obstructed at T0 to being non-obstructed at T1 when applying the LLN FEV1/FVC criterion. Higher body mass index (BMI) and baseline short-acting bronchodilator use increased the odds of shifting to non-obstructed (Table 2). Older age, baseline post-BD FEV1 <50% predicted, baseline inhaled corticosteroid use, and being a current smoker or using a short-acting bronchodilator at T1 reduced these odds.

Table 2 Results from multivariable logistic regression models looking at factors associated with diagnostic shift between baseline (T0) and 1-year measurements (T1) using post-BD lower limit of normal (LLN) FEV1/FVC cut-off points to define the presence or absence of airway obstruction

Being male, older age, lower baseline post-BD FEV1% predicted, and being a current smoker at baseline increased the odds of shifting from being non-obstructed to being obstructed at T1, whereas higher baseline BMI reduced these odds (Table 2).

Bronchodilator reversibility was not significantly associated with diagnostic shift in either direction.

Discussion

Main Findings

In primary care, establishing the presence or absence of airway obstruction when diagnosing COPD is often based on a single spirometry test. This is consistent with clinical guidelines for the diagnosis of COPD, which do not suggest that spirometry should be repeated to confirm the diagnosis. However, the short-term and long-term stability of a decreased post-bronchodilator FEV1/FVC ratio has not been reported. We investigated the shifts between diagnostic categories after an initial spirometry test in subjects for whom guidelines recommend investigation for COPD (age ≥40 years, current or ex-smoker, with respiratory symptoms) and who were referred to a diagnostic service by their general practitioner (GP). We found that, depending on the definition of airway obstruction applied, 17–22% of subjects with airway obstruction at baseline were no longer categorised as such after 1 year, and 24–32% after 2 years; this shift was observed across a wide range of baseline FEV1/FVC values. Of subjects without airway obstruction at baseline, 10–13% were categorised as obstructed after 1 year, and 15–19% after 2 years. With Kappa values in the range of 0.67–0.71 when comparing obstruction/non-obstruction status at baseline with that after 1 and 2 years, agreement at a population level would be considered ‘substantial’, but the implications for individual patients may be important. Gender, age, BMI, post-BD FEV1% predicted, smoking status, and use of short-acting bronchodilators and inhaled corticosteroids were associated with shifts between diagnostic categories in the logistic regression models, but bronchodilator reversibility was not.

Interpretation of findings in relation to previously published work

It is well accepted that FEV1 and FVC vary over time in both healthy persons and those with airways disease, with year-to-year variability reported as ±15% in the 2005 ATS/ERS guidelines for lung function testing,12 and that bronchodilator reversibility in patients with COPD varies when measured at 4-weekly intervals.13 However, the extent to which the FEV1/FVC ratio itself varies does not appear to have been documented. In a secondary analysis of the Lung Health Study dataset, which consists of 5-year follow-up data from 5,321 current smokers aged 35–60 years with mild-to-moderate obstructive pulmonary disease (defined at the time as baseline FEV1/FVC ratio 0.75 and FEV1 50–90% predicted),14 Akkermans et al.15 observed that classification as obstructed/non-obstructed was inconsistent for 24% of Lung Health Study participants between the initial screening and the first follow-up spirometry at 1 year. In another study examining the relationship between baseline obstruction and lung function decline from the present database, we noted that 36% of participants were excluded as they had changed obstruction category during an average of 3.4 years follow-up.16 However, we have not been able to trace any other studies that have reported the short-term or long-term consistency of a spirometric diagnosis of airway obstruction in subjects with COPD-like symptoms.

Factors associated with diagnostic shift

Several factors were associated with shifting from obstructed to non-obstructed over 1 year. Age was a significant predictor of shift (in either direction), even with the age-adjusted LLN criterion. Other significant predictors included male gender, lower baseline FEV1% predicted and current smoking, all known predictors for the development of COPD.17 Higher baseline BMI was significantly associated with shifts in both directions—increasing the probability of shifting from obstructed to non-obstructed, and reducing the probability of shifting from non-obstructed to obstructed. Baseline bronchodilator reversibility (which might indicate greater underlying variability in lung function consistent with asthma) was not associated with diagnostic shift in either direction. Only limited information was available about medications, but use of inhaled corticosteroids at baseline appeared to reduce the probability of shifting from obstructed to non-obstructed.

Strengths and limitations of this study

Particular strengths of our study are the large sample of subjects from primary care for whom guidelines recommend investigation for COPD (respiratory symptoms, age ≥40 years, smoker/ex-smoker), the fact that all spirometry tests were performed by certified technicians using standardised protocols and equipment,18 that both pre- and post-bronchodilator spirometry were performed, and the existence of regional primary care programmes for ongoing monitoring of lung function as a part of routine patient care.

The study has some limitations as well. First, we assume that the patients being seen longitudinally in the primary care diagnostic centres are representative of a larger population, but we have no information on the precise reasons why some patients were scheduled for annual follow-up visits, whereas other patients were not. Selection may have occurred, as patients in the monitoring service may have been patients of special concern to their GPs; therefore, we cannot exclude the possibility that the number of variable spirometric findings was falsely elevated because of this. Also, patients with severe and unchanging disease may have been referred to secondary care medical specialists and have been lost to the primary care diagnostic centre monitoring service and, consequently, to our dataset. This might have caused a bias away from seeing consistent findings from year to year. Another limitation of the study is the fact that, because these follow-up visits are scheduled annually, we were not able to look at the variability of FEV1/FVC over shorter periods of time (for instance, within-day or week-to-week), as already reported for FEV1 and FVC. Further research is needed to establish the optimal interval between the initial spirometry test and a ‘verification test’ after some weeks or months. Clearly, regression to the mean effects could have a role in explaining our observations as, by chance alone, subjects with more extreme FEV1/FVC values are likely to show less extreme values at reassessment; furthermore, as the diagnosis of COPD is currently based on a specific FEV1/FVC value (whether below the LLN or below 0.70), trivial changes could lead to a diagnostic shift for subjects with a baseline ratio just below or just above the relevant cut-off point. However, only about 20% of patients with a diagnostic shift had been in the borderline range of 0.68–0.72 at baseline, and diagnostic shifts were observed across the full range of baseline FEV1/FVC values (Figure 4). Finally, subjects non-obstructed at baseline may have been less likely to be enrolled in a diagnostic centre’s monitoring service and may therefore be under-represented in our dataset.

In clinical practice, a COPD diagnosis is often based on a single spirometry test. This is consistent with current guidelines, which recommend that smoking or ex-smoking subjects aged ≥40 years with respiratory symptoms should be investigated for COPD and that the diagnosis should be ‘based on spirometry’, with no indication that it should be repeated to confirm the diagnosis. Given the importance of the FEV1/FVC ratio in making the diagnosis of COPD, and the known year-to-year variability of FEV1 and FVC of ±15%, we found it surprising that the variability of the ratio does not appear to have been reported previously. The current study shows that the FEV1/FVC ratio varies significantly over 1- and 2-year periods in subjects at risk for COPD. Depending on the criterion for obstruction that is applied, one-off spirometry may lead to over- or under-diagnosis of COPD, and either of these may have a significant emotional impact on the patient.19 Further, diagnostic inaccuracy will almost certainly lead to over-treatment of some patients, with increased healthcare costs, increased risk for adverse effects and delay in identifying other treatable causes of respiratory symptoms, whereas other patients may be under-treated for COPD, contributing to unnecessary burden of disease. Clinical COPD guidelines should take this into account and recommend repetition of spirometry to verify the presence or absence of airway obstruction. An alternative view that has been increasingly heard in recent years is that the diagnosis of a heterogeneous multi-system condition such as COPD should not be based on a single number.6,20,21 This is especially relevant for primary care, where the vast majority of patients with early and mild COPD are diagnosed and treated.

Conclusions

Although overall agreement between baseline and repeated diagnostic classification of airway obstruction may be technically classified as ‘substantial’ at a population level, a key finding of the present analysis was that up to one-third of people at risk for COPD who were found to have airway obstruction when referred for diagnostic spirometry by their GP had shifted to non-obstructed when routinely re-tested after 1 or 2 years. Similar shifts were seen with LLN and fixed-ratio criteria. Gender, age, BMI, baseline FEV1% predicted, smoking status and use of respiratory medication were associated with the probability of change in diagnostic category, but bronchodilator reversibility was not. Given the implications described above for patients and the healthcare system, we do not believe that the diagnosis of COPD should be based on a single spirometry test.

Materials and Methods

Study setting and measurements

This observational study was based on all available spirometry tests from the period October 2001 to March 2010 from three regional primary care diagnostic centres (i.e., General Practice Laboratory Foundation Etten-Leur/Breda (SHL); Diagnostic Center Eindhoven (DC4U) and General Practice Laboratory East (SHO)) in the Netherlands. These diagnostic centres have offered a range of diagnostic tests, including spirometry, and other health services to hundreds of GPs in the south-western and south-eastern parts of the country since the mid- or late 1990s. When a subject consults his/her GP with respiratory symptoms and the GP suspects an underlying chronic respiratory condition (e.g., COPD or asthma), the subject can be referred to the diagnostic centre for pre- and post-bronchodilator spirometry testing. When a chronic respiratory condition is diagnosed or still suspected, the GP will usually enrol the subject in the diagnostic centre’s routine monitoring service for periodic (usually yearly) reassessment without further clinical selection.

As previously reported,18 all spirometry tests in the primary care diagnostic centres are performed by certified lung function technicians using personal computer-based digital volume sensor spirometers (SpiroPerfect; Welch Allyn, Delft, The Netherlands) and standardised calibration and measurement procedures.18 Subjects are instructed to withhold all bronchodilators before spirometry. The spirometry test results and accompanying demographic (gender, age), anthropometric (height, weight) and medical history information (including self-reported smoking status and history, respiratory symptoms, respiratory medications and exacerbations) are recorded during each visit using a standardised electronic format. Every spirometry test is assessed by a respiratory consultant and his/her interpretation and—if applicable—diagnostic advice is sent to the GP, together with the actual test results. Further details about the spirometry tests performed in the diagnostic centres are described elsewhere.18 At the time, the diagnostic centres did not electronically store spirometry test quality assessments in their databases. As only routine lung function and respiratory medical history data were used for our analyses and the investigators had no access to information on subjects’ identity or their medical records, no written informed consent was obtained.

Subject selection and definitions for airway obstruction

For the current study, we selected subjects from the combined primary care diagnostic centres’ databases (n=22,187)16 who had risk factors for COPD and had complete questionnaire data and follow-up spirometry. The inclusion criteria were the following: being Caucasian; current or former smoker; aged ≥40 years; complete data regarding height, history of cigarette smoking and respiratory medication use; and three or more post-bronchodilator spirometry tests available with 12±2 months (10–14 months) between tests.

We used post-bronchodilator FEV1/FVC values to classify the subjects as having or not having airway obstruction. The following two definitions of airway obstruction were applied:

  • LLN cut-off (primary definition): post-bronchodilator FEV1/FVC below the subject’s age-specific lower limit of normal (LLN) value.22 Airway obstruction was classified as being present when the resulting standard deviation (s.d.) score (also known as ‘standardised z-score’) was <−1.645. This corresponds with the fifth percentile.

  • Fixed cut-off point (secondary definition): post-bronchodilator FEV1/FVC <0.70. This is the criterion for airway obstruction that is still recommended in clinical COPD guidelines.1–4

Global Lung Initiative prediction equations23 were used to calculate the LLN values for FEV1/FVC and %predicted values for FEV1.
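The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own code: the `Spirometry` record and the function names are hypothetical, and the z-score is assumed to have been computed beforehand from the reference equations rather than derived here.

```python
from dataclasses import dataclass

LLN_Z_THRESHOLD = -1.645    # z-score corresponding to the fifth percentile
FIXED_RATIO_CUTOFF = 0.70   # fixed FEV1/FVC cut-off point


@dataclass
class Spirometry:
    """Hypothetical record for one post-bronchodilator test."""
    fev1_fvc: float  # measured post-bronchodilator FEV1/FVC ratio
    z_score: float   # assumed precomputed z-score of FEV1/FVC


def obstructed_lln(test: Spirometry) -> bool:
    """Primary definition: FEV1/FVC below the lower limit of normal."""
    return test.z_score < LLN_Z_THRESHOLD


def obstructed_fixed(test: Spirometry) -> bool:
    """Secondary definition: FEV1/FVC below the fixed 0.70 cut-off."""
    return test.fev1_fvc < FIXED_RATIO_CUTOFF


def shifted(baseline_obstructed: bool, follow_up_obstructed: bool) -> bool:
    """A diagnostic shift is any change of category between two visits."""
    return baseline_obstructed != follow_up_obstructed
```

For example, a subject with a ratio of 0.68 and a z-score of −1.2 would count as obstructed under the fixed cut-off but not under the LLN definition, which illustrates how the two criteria can disagree for the same test.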

Analysis

Subjects were categorised as showing airway obstruction or not at their consecutive measurements (baseline (T0), 1 year (T1) and 2 years (T2)). The Kappa statistic and its 95% CI were calculated to express the level of agreement between T0 and T1 and between T0 and T2 diagnostic status, respectively. The following classification in terms of strength of agreement for the kappa coefficient was used: Kappa <0=poor agreement; 0.01 to 0.20=slight; 0.21 to 0.40=fair; 0.41 to 0.60=moderate; 0.61 to 0.80=substantial; and 0.81 to 1=almost perfect.24 A modified Bland–Altman25 plot was generated to graphically express differences in FEV1/FVC between T0 and T1, compared with the baseline (T0) value, and the coefficient of repeatability26 was calculated to express the within-subject repeatability (or absolute reliability) of the two consecutive FEV1/FVC measurements. The coefficient of repeatability is the value below which the absolute differences between two measurements would lie with 0.95 probability. Both random and systematic errors are taken into account in the coefficient.26
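As an illustration of the agreement statistics described above, the following Python sketch computes Cohen's kappa for two binary classifications and a coefficient of repeatability for paired measurements. Several details are assumptions rather than the study's documented method: the confidence interval uses a simple large-sample standard error, the coefficient of repeatability is taken as 1.96 times the root mean squared difference (one common form that reflects both systematic and random error), and all function names are hypothetical.

```python
import math


def cohen_kappa(first, second):
    """Kappa and approximate 95% CI for two equal-length boolean sequences."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("need two non-empty sequences of equal length")
    p_obs = sum(a == b for a, b in zip(first, second)) / n
    p_first = sum(first) / n    # proportion obstructed at the first visit
    p_second = sum(second) / n  # proportion obstructed at the second visit
    p_exp = p_first * p_second + (1 - p_first) * (1 - p_second)
    if p_exp == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Assumption: simple large-sample standard error for kappa.
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)


def agreement_label(kappa):
    """Strength-of-agreement label; values at or below 0 are treated as poor here."""
    if kappa <= 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


def coefficient_of_repeatability(first, second):
    """Assumed form: 1.96 * sqrt(mean of squared within-subject differences)."""
    diffs = [b - a for a, b in zip(first, second)]
    if not diffs:
        raise ValueError("need at least one pair of measurements")
    return 1.96 * math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

With a hypothetical sample of 100 subjects in which 35 are obstructed at both visits, 50 at neither, 5 only at the first and 10 only at the second, observed agreement is 0.85, expected agreement is 0.51, and kappa is 0.34/0.49 (about 0.69), which falls in the 'substantial' band.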

In univariate analyses we calculated the probability of being (non-)obstructed after 1 year in relation to a subject’s baseline post-bronchodilator FEV1/FVC value. We also explored subject characteristics that predicted shifting diagnostic category between T0 and T1 measurements for the primary (LLN) definition of obstruction using multivariable logistic regression models. Covariates in these analyses were gender, age (at T0), BMI (at T0), severity of airway obstruction (categorised according to GOLD as mild, moderate and (very) severe obstruction, based on % predicted FEV1, at T0),1 reversibility of obstruction (yes/no ∆FEV1 ≥200 ml and ≥12%, at T0),27 smoking status (current smoker yes/no, at T0 and T1), use of short-acting bronchodilators (yes/no, at T0 and T1), use of long-acting bronchodilators (yes/no, at T0 and T1) and use of inhaled corticosteroids (yes/no, at T0 and T1). The changes in smoking status, short-acting bronchodilator use, long-acting bronchodilator use, and inhaled corticosteroid use were expressed in the models using interaction terms of the respective T0 and T1 covariates, but these terms were dropped at a later stage as they were not statistically significant. Separate models were constructed for each of the two possible directions of shifting (i.e., from obstructed to non-obstructed or vice versa). Associations were expressed as odds ratios with 95% CIs. Tests were two-sided. P<0.10 was considered statistically significant. IBM SPSS Statistics Version 22 (Armonk, NY, USA) was used for the analyses.
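The univariate probability curve, smoothed as a moving average over the current and the next FEV1/FVC bin, could be computed along the following lines. This is a sketch under stated assumptions: the bin width of two percentage points, the rounding of ratios to whole percentage points, the dropping of the final bin (which has no successor) and the function names are all illustrative choices, not details taken from the study.

```python
from collections import defaultdict


def probability_by_bin(baseline_ratios, non_obstructed_at_t1, bin_width_pct=2):
    """Proportion non-obstructed at T1 per baseline FEV1/FVC bin.

    Ratios are rounded to whole percentage points before binning (assumption).
    Returns (bin lower bound, probability) pairs; empty bins are skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [non-obstructed, total]
    for ratio, non_obstructed in zip(baseline_ratios, non_obstructed_at_t1):
        index = round(ratio * 100) // bin_width_pct
        counts[index][0] += int(non_obstructed)
        counts[index][1] += 1
    return [(index * bin_width_pct / 100, counts[index][0] / counts[index][1])
            for index in sorted(counts)]


def two_point_moving_average(probabilities):
    """Average each bin with the next one; the last bin is dropped (assumption)."""
    return [(current + following) / 2
            for current, following in zip(probabilities, probabilities[1:])]
```

For six hypothetical subjects with baseline ratios 0.66, 0.67, 0.68, 0.69, 0.70 and 0.71, of whom only the last three are non-obstructed at T1, the per-bin probabilities are 0.0, 0.5 and 1.0, and the smoothed curve is 0.25 and 0.75.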

Funding

The extraction of data from the primary care diagnostic centres’ databases was supported by a research grant from Boehringer Ingelheim, the Netherlands.