Introduction

Traumatic spinal cord injury (SCI) affects the lives of ~153 000 people in the European Union (estimated prevalence of 0.03%).1 The clinical condition of an SCI patient is closely followed for acute care management, for capturing outcomes of rehabilitation and for the treatment of secondary complications in chronic SCI. Follow-up programs are often especially interested in revealing neurological and functional recovery, with the ultimate aim of understanding profiles of clinical outcomes for improving care and treatment.2 For these purposes, clinical assessments need to be able to correctly capture a patient's health status, its improvements and deteriorations. These properties are known as validity and responsiveness.3 A responsive assessment is able to reliably detect changes in a patient's health development over time based on a widely accepted ‘gold standard’ as reference. An assessment is considered valid if it exhibits a high association with such a reference assessment of the disorder under study. This concept can be further subdivided into concurrent and predictive validity, where the former assesses the association with the reference at the same time, and the latter the association with the reference at a later time point.4 In this paper we focus on concurrent validity.

Concurrent validity of clinical assessments is of interest in translational research, not only for SCI but also for other medical disorders (for example, stroke5 and multiple sclerosis6). In the field of SCI, methods to assess concurrent validity currently include Pearson correlation7, 8, 9, 10, 11, 12, 13 and Spearman’s rank correlation13, 14, 15, 16, 17 to measure the association between a single walking assessment and a single reference, for example, the correlation between the functional independence measure (FIM) and the sum score of the lower extremities motor scores. We consider this approach to be suboptimal in two ways. First, there are several walking assessments and several clinical reference scores for SCI patients, and the choice of which to use is subject to a certain degree of arbitrariness. Second, virtually all SCI assessments consist of several subassessments (for example, questionnaire questions, tasks), and their use as summed total scores, instead of considering the tasks separately, may prevent deeper insight. To overcome these limitations, we propose the use of canonical correlation analysis (CCA)18 to assess concurrent validity, as CCA is a statistical technique, which measures the correlation between groups of variables, and as such is more in accordance with the concept of concurrent validity in the field of SCI. Although almost all SCI-related clinical assessments are groups of variables, CCA has only been applied in the research on psychological aspects of SCI,19, 20, 21 and never to assess the concept of concurrent validity in the clinical research.

By analyzing a large prospectively collected longitudinal dataset, this study aims for the following: (1) to assess the concurrent validity of single and groups of walking assessments, and (2) to determine minimal groups of walking assessments with a high concurrent validity—both based on CCA.

Materials and methods

Participants

The data are provided by the European Multicenter study about Spinal Cord Injury (EMSCI, www.emsci.org, clinicaltrials.gov NCT01571531), which is an ongoing longitudinal study of patients who suffered from acute SCI. At the time of analysis, the EMSCI database included 2854 patients over 13 years of study time (from 2001 to 2013), from 22 centers. The data contain the results of examinations at five different times (stages): within the first 2 weeks after injury (stage 1), after 4 weeks (stage 2), after 3 months (stage 3), after 6 months (stage 4) and after 12 months (stage 5).

Assessments

At each stage various clinical assessments are performed to assess a patient’s neurological and functional abilities. We divide them into motor assessments and walking assessments.

Motor assessments measure neurological abilities and are often used as a reference assessment, which is why we choose to use them as our reference for concurrent validity. In particular, we consider the ten motor scores for the five lower extremities spinal cord segments (L2, L3, L4, L5 and S1) of both sides, as defined by the International Standard for Neurological Classification of Spinal Cord Injury (ISNCSCI).22 Each such motor score is an ordinal variable ranging from 0 to 5. These lower extremities motor scores (LEMS) were operationalized in the following two ways:

  1. Sum score of the LEMS (S-LEMS), ranging from 0 to 50.

  2. Individual LEMS (I-LEMS), referring to all ten separate motor scores, each ranging from 0 to 5.
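The two operationalizations can be illustrated with a minimal sketch. The function names and the example scores are hypothetical and serve only to show how S-LEMS follows from the ten I-LEMS values.

```python
# Lower extremity segments scored on each side of the body (ISNCSCI).
SEGMENTS = ("L2", "L3", "L4", "L5", "S1")


def i_lems(left, right):
    """Return the ten individual motor scores, keyed by (side, segment)."""
    scores = {}
    for side, values in (("left", left), ("right", right)):
        if len(values) != len(SEGMENTS):
            raise ValueError("expected one score per segment")
        for segment, value in zip(SEGMENTS, values):
            if value not in range(6):
                raise ValueError("motor scores range from 0 to 5")
            scores[(side, segment)] = value
    return scores


def s_lems(individual):
    """Return the unweighted sum of the ten scores (0 to 50)."""
    return sum(individual.values())


# Made-up scores for one hypothetical patient.
patient = i_lems(left=[5, 4, 3, 2, 1], right=[5, 5, 4, 3, 2])
print(len(patient), s_lems(patient))
```

Keeping the ten scores separate preserves the information that the sum discards, namely which segments and which side contribute to the total.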

Walking assessments measure walking abilities. They can be categorized into timed walking assessments (1, 2 and 3 below) and untimed walking assessments (4 and 5 below):

  1. The timed up and go test (TUG) measures the time (in seconds) a patient needs, with or without the preferred walking aid, to get up from sitting on a chair, walk three meters, turn around, walk back and sit down again. It is a measure of mobility and balance.23

  2. The 10-meter walk test (10MWT) measures the time (in seconds) a patient needs to walk 10 m with his or her preferred walking speed.24

  3. The 6-min walk test (6MWT) measures how far (in meters) a patient can walk within 6 min. Rests are allowed if needed.25

  4. The walking index for spinal cord injury (WISCI) assesses the ability to walk based on the amount and type of physical assistance and devices that are needed. It ranges from 0 to 20, where 0 indicates most severe impairment.14

  5. The spinal cord independence measure (SCIM) is a disability scale for SCI and is composed of several functional assessments measuring activities of daily life. In our study, we consider two subassessments: patient’s mobility for room and toilet (SCIM3a), ranging from 0 to 10, and mobility indoors and outdoors on even surface (SCIM3b), ranging from 0 to 30.7 In both cases, 0 indicates most severe impairment.

For ease of interpretation, we transformed the assessments of TUG and 10MWT to ensure that a higher value indicates a better condition of the patient (multiplying by −1 and shifting to make the values positive again).
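A minimal sketch of such a transformation is given below. The text does not state the size of the shift, so the constant used here (largest plus smallest observed time) is an assumption chosen only because it keeps the values on the original scale. The function name and example times are hypothetical.

```python
def reverse_times(times):
    """Reverse timed scores so that a higher value means a better condition.

    Each time is multiplied by -1 and then shifted by a constant.
    The constant (max + min) is an assumed choice, not taken from the study.
    """
    shift = max(times) + min(times)
    return [shift - t for t in times]


# Made-up TUG times in seconds for three hypothetical patients.
print(reverse_times([8.0, 12.5, 30.0]))
```

Because Spearman's rank correlation depends only on the ordering of the values, the size of the shift does not affect the rank-based results; only the reversal of the order matters.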

Missing data and patient subgroups

The data contain many missing values, ranging from 30 to 85% over the various assessments and the five stages. This is partly due to the protocol design of the prospectively collected walking assessments, where functional assessments were mostly performed in patients with some walking abilities. Thus, in less impaired patients (that is, incomplete SCI), both timed and untimed walking assessments are typically present. In patients with complete SCI and poor prognosis, however, timed walking outcomes are generally missing because they could not be performed by the patient.

At each stage, we therefore considered two patient subgroups. The first patient subgroup encompasses all patients without missing information in timed and untimed walking assessments and LEMS (patient subgroup I; less impaired). Statistically speaking, this is a complete case analysis with respect to all variables. The second subgroup encompasses all patients without any missing information in untimed walking assessments and LEMS (patient subgroup II). Statistically speaking, this is a complete case analysis with respect to the untimed walking assessments and LEMS. Thus, subgroup II allows missing information in timed walking assessments, and any patient in subgroup I is also included in subgroup II.

This grouping based on the available data leads to different group sizes for different stages. In particular, a patient may be able to perform timed walking tests only at later stages, at which point he or she will be included in subgroup I.
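The subgroup definitions at a single stage can be sketched as follows. The record layout (one dictionary per patient, `None` for a missing value, the ten LEMS stored as a list) and all names are assumptions made for illustration.

```python
TIMED = ("TUG", "10MWT", "6MWT")
UNTIMED = ("SCIM3a", "SCIM3b", "WISCI")
REFERENCE = ("LEMS",)  # assumed to hold a list of the ten motor scores


def is_complete(record, variables):
    """True if none of the named variables has a missing value."""
    for name in variables:
        value = record.get(name)
        values = value if isinstance(value, list) else [value]
        if any(v is None for v in values):
            return False
    return True


def subgroup_i(records):
    """Complete cases for timed and untimed assessments and LEMS."""
    return [r for r in records if is_complete(r, TIMED + UNTIMED + REFERENCE)]


def subgroup_ii(records):
    """Complete cases for untimed assessments and LEMS only."""
    return [r for r in records if is_complete(r, UNTIMED + REFERENCE)]


# Made-up records for three hypothetical patients at one stage.
records = [
    {"id": 1, "TUG": 9.0, "10MWT": 7.5, "6MWT": 420, "SCIM3a": 10,
     "SCIM3b": 30, "WISCI": 20, "LEMS": [5] * 10},
    {"id": 2, "TUG": None, "10MWT": None, "6MWT": None, "SCIM3a": 2,
     "SCIM3b": 0, "WISCI": 0, "LEMS": [0] * 10},
    {"id": 3, "TUG": None, "10MWT": None, "6MWT": None, "SCIM3a": 4,
     "SCIM3b": None, "WISCI": 3, "LEMS": [1] * 10},
]
print([r["id"] for r in subgroup_i(records)])
print([r["id"] for r in subgroup_ii(records)])
```

Applying the two filters separately at each stage reproduces the property that every patient in subgroup I also belongs to subgroup II.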

Statistical analysis

CCA18 measures the correlation between two groups of variables. Hence, it can be used to assess concurrent validity in a clinical setting like SCI, where there is a group of walking assessments and a group of clinical reference assessments.

Assume that X={X1,…,Xp} is a group of p assessments (for example, p=2, X1=6MWT and X2=10MWT) and Y={Y1,…,Yq} is a group of q external references (for example, q=10, Y1,…,Y10 are the I-LEMS). CCA determines a linear combination of the X-variables, which we call U, and a linear combination of the Y-variables, which we call V. The new variables U and V are referred to as canonical variates. The coefficients for the linear combinations U and V are chosen such that the correlation between the canonical variates is maximized. Technically, this means that CCA solves the following maximization problem:

$$\max_{a,b}\ \operatorname{Corr}(U, V), \qquad U = a_1 X_1 + \dots + a_p X_p, \quad V = b_1 Y_1 + \dots + b_q Y_q,$$

where a=(a1,…,ap) and b=(b1,…,bq) are referred to as the coefficient vectors. The main output of interest for us is the achieved maximum correlation, called canonical correlation. In general, this can be interpreted as the overall association between the two groups of variables, and in our setting this corresponds to the extent of concurrent validity. The coefficient vectors a and b are often difficult to interpret, especially if the variables within the two groups are highly correlated, as is the case in our setting. Rather, we recommend looking at the correlations between each canonical variate and its composing original variables to understand the meaning of the canonical variates. Please see the Supplementary Information for an example, and Sherry and Henson26 for a detailed and user-friendly introduction to CCA.

Regarding the use of the reference group of I-LEMS in CCA, this allows for more flexibility than current approaches based on S-LEMS (or other summed total scores), as those implicitly fix the corresponding coefficients to one when calculating the total sum of unweighted scores. As a comparison, we also present CCA with S-LEMS as the reference assessment.

As several of our variables are ordinal, we use a nonparametric correlation in the above formula, namely Spearman’s rank correlation. CCA with Spearman correlation is implemented in the open source statistical software R27 in the function maxCorGrid of the R-package ccaPP,28 which we use for our analyses.
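To make the idea concrete, the following sketch maximizes the absolute Spearman correlation between one assessment and a linear combination of two reference variables by searching over a grid of directions. It is a deliberately simplified illustration and not the algorithm implemented in maxCorGrid; all function names and the data are hypothetical.

```python
import math


def ranks(values):
    """Average ranks, so tied scores share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        average = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            result[order[k]] = average
        start = end + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable carries no rank information
    return sxy / math.sqrt(sxx * syy)


def max_spearman(x, y1, y2, steps=180):
    """Search directions (b1, b2) for the largest |Spearman(x, b1*y1 + b2*y2)|."""
    directions = [(math.cos(math.pi * k / steps), math.sin(math.pi * k / steps))
                  for k in range(steps)]
    # Axis and equal-weight directions are added exactly, so that rounding in
    # cos/sin cannot break ties differently from y1, y2 or the sum y1 + y2.
    directions += [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    best_rho, best_b, best_v = -1.0, None, None
    for b1, b2 in directions:
        v = [b1 * u + b2 * w for u, w in zip(y1, y2)]
        rho = abs(spearman(x, v))
        if rho > best_rho:
            best_rho, best_b, best_v = rho, (b1, b2), v
    return best_rho, best_b, best_v


# Made-up scores for eight hypothetical patients (not EMSCI data).
walking = [3, 1, 4, 1, 5, 9, 2, 6]
ref_a = [2, 0, 3, 1, 4, 5, 1, 3]
ref_b = [1, 1, 0, 2, 5, 4, 0, 5]
rho, b, v = max_spearman(walking, ref_a, ref_b)
# Correlations between the canonical variate and its composing variables.
structure = [spearman(v, ref_a), spearman(v, ref_b)]
print(rho, b, structure)
```

The direction (1, 1) corresponds to the unweighted sum of the two reference variables, so the maximum found by the search can never be smaller than the correlation with their sum score. The last two lines of the example compute the correlations between the canonical variate and its original variables, which is the quantity recommended above for interpretation.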

Results

We present the results separately for each patient subgroup.

Patient subgroup I: no missing information in any of the variables; typically less impaired patients

Table 1 shows the Spearman canonical correlation between each single walking assessment and one of the following references: S-LEMS and I-LEMS. We recall that S-LEMS is a single variable, formed by the sum of the ten motor scores. Hence, when the reference is S-LEMS, the Spearman canonical correlation between a walking assessment and S-LEMS is the same as the absolute value of the usual marginal Spearman correlation between the walking assessment and S-LEMS. When the reference is I-LEMS, however, the ten motor scores are treated as ten different variables, and they are combined via a linear combination to form the canonical variate that maximizes the correlation with the respective walking assessment. As a result, Spearman CCA with I-LEMS as reference can lead to markedly higher correlations.
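The two cases described above can be written compactly. The notation is introduced here for illustration only: ρ_S denotes Spearman's rank correlation, X a single walking assessment, Y the vector of the ten I-LEMS, and S their sum (S-LEMS).

```latex
% Single assessment X against the single reference S = Y_1 + ... + Y_{10}:
% rescaling by nonzero a and b can only flip the sign of a rank correlation.
\max_{a,b}\ \rho_S(aX,\ bS) = \lvert \rho_S(X, S) \rvert

% Single assessment X against the ten I-LEMS Y = (Y_1, ..., Y_{10}):
% b = (1, ..., 1) is one admissible coefficient vector, so the maximum
% over all b cannot be smaller than the value obtained with the sum score.
\max_{b}\ \lvert \rho_S(X,\ b^{\top} Y) \rvert
  \ \ge\ \lvert \rho_S(X,\ \mathbf{1}^{\top} Y) \rvert
  = \lvert \rho_S(X, S) \rvert
```

The inequality holds for the exact maximum. A numerical search approximates that maximum, so small deviations are possible in practice.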

Table 1 Spearman canonical correlations between each individual walking assessment and one of the following references: I-LEMS or S-LEMS

Stage 5 (12 months after SCI) is used as the final outcome stage in many studies and is therefore considered of high relevance for clinical evaluations. Considering the results of this stage with I-LEMS as reference, SCIM3b, WISCI, 10MWT and 6MWT all had a similarly high concurrent validity. The canonical variates corresponding to I-LEMS were typically similarly positively correlated with each of the I-LEMS scores, with correlations between 0.54 and 0.81, suggesting that such canonical variates can be interpreted as a summary (weighted average) of all I-LEMS. The only exception in this stage was SCIM3a, where the canonical variate was slightly negatively correlated with half of the I-LEMS, in tendency with those of the lower segments. We refer to the Supplementary Information for a more detailed discussion of the CCA output, using the analysis of SCIM3b and I-LEMS as an example.

The same four variables SCIM3b, WISCI, 10MWT and 6MWT were also top ranking when using S-LEMS as reference. In both analyses (using I-LEMS or S-LEMS), SCIM3a exhibited poor concurrent validity. Owing to the low number of observations in stage 1 for patient subgroup I, we omit this stage in the remainder of the results for this patient subgroup.

Table 2 shows the Spearman canonical correlations for groups of walking assessments with respect to one of the following references: S-LEMS and I-LEMS. Again considering the fifth stage, we see that groups of three walking assessments yield canonical correlations in the range of 0.66–0.71 with respect to S-LEMS, and in the range of 0.70–0.75 with respect to I-LEMS. In both cases, the highest canonical correlation is achieved by the group of walking assessments consisting of SCIM3a, WISCI and 10MWT. The resulting canonical correlations are very close to the canonical correlation of all six walking assessments (0.71 with respect to S-LEMS and 0.75 with respect to I-LEMS). Hence, CCA suggests that the group of walking assessments formed by SCIM3a, WISCI and 10MWT is about as good in terms of concurrent validity as are all six walking assessments together. When looking at the CCA results with I-LEMS as reference in more detail, we find that the correlations of SCIM3a, WISCI and 10MWT with the derived canonical variate are 0.32, 0.87 and 0.94, respectively. This indicates that WISCI and 10MWT dominate this canonical variate, whereas SCIM3a is less important. The correlations of the individual I-LEMS with the canonical variate of the reference group are in the range of 0.53–0.81, suggesting again that this canonical variate is a summary of all I-LEMS.

Table 2 Range of Spearman canonical correlations between all possible groups of walking assessments and one of the following references: I-LEMS or S-LEMS

We now focus on the canonical correlations using I-LEMS as reference. Figure 1 displays the canonical correlations of various subgroups of walking assessments. In particular, the gray area displays the range of canonical correlations achieved by different subgroups of three walking assessments. Furthermore, the figure displays the canonical correlations of all walking assessments, of the best subgroup of size three (SCIM3a, WISCI and 10MWT), of all untimed walking assessments (SCIM3a, SCIM3b and WISCI), and of all timed walking assessments (10MWT, 6MWT and TUG). We see that the canonical correlation of the best subgroup of size three is almost as high as that of all six assessments together. The group of untimed walking assessments generally outperforms the group of timed walking assessments, with the exception of stage 4 (6 months after SCI).
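The comparison across subgroups of three assessments amounts to enumerating all such subgroups and recording the range of their canonical correlations. The sketch below shows only this enumeration. The scoring function is a placeholder with no clinical meaning, and in a real analysis it would be replaced by the Spearman canonical correlation with I-LEMS.

```python
from itertools import combinations

ASSESSMENTS = ["SCIM3a", "SCIM3b", "WISCI", "10MWT", "6MWT", "TUG"]


def groups_of_size(assessments, size):
    """All subgroups of the given size, in the order of the input list."""
    return list(combinations(assessments, size))


def correlation_range(groups, canonical_correlation):
    """Return (lowest, highest) as (value, group) pairs over all groups."""
    scored = [(canonical_correlation(group), group) for group in groups]
    lowest = min(scored, key=lambda pair: pair[0])
    highest = max(scored, key=lambda pair: pair[0])
    return lowest, highest


def placeholder_score(group):
    """Stand-in for a canonical correlation; the value is arbitrary."""
    return len("".join(group))


triples = groups_of_size(ASSESSMENTS, 3)
lowest, highest = correlation_range(triples, placeholder_score)
print(len(triples), lowest, highest)
```

With six assessments there are 20 subgroups of size three, including the group of all untimed and the group of all timed assessments.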

Figure 1 Spearman canonical correlations between various subgroups of walking assessments and I-LEMS. The results are given for patient subgroup I at stages 2–5.

Patient subgroup II: no missing information in untimed walking assessments and LEMS

In this section we focus on untimed walking assessments, as timed walking assessments are not available for all patients in subgroup II. Table 3 shows the Spearman canonical correlation between each individual (untimed) walking assessment with respect to S-LEMS and I-LEMS. WISCI distinctly shows the highest correlation in both analyses, except for stage 1.

Table 3 Spearman canonical correlation between each individual untimed walking assessment and one of the following references: I-LEMS or S-LEMS

Table 4 displays the range of canonical correlations obtained by subgroups of untimed walking assessments with respect to S-LEMS and I-LEMS. Again considering the fifth stage, the results indicate that the canonical correlation of the single walking assessment WISCI is about as high as that of the group of all three untimed walking assessments.

Table 4 Range of Spearman canonical correlations between each possible group of untimed walking assessments and one of the following references: I-LEMS or S-LEMS

Figure 2 displays the canonical correlations of various subgroups of untimed walking assessments, using I-LEMS as reference. The gray area displays the range of canonical correlations of groups of two walking assessments. Furthermore, the figure displays the canonical correlations of all untimed walking assessments, as well as of the single walking assessment WISCI. We see that WISCI is about as good in terms of concurrent validity as the group of all three untimed walking assessments, except in the first stage.

Figure 2 Spearman canonical correlations between various subgroups of walking assessments and I-LEMS. The results are given for patient subgroup II at stages 1–5.

Discussion

CCA measures the correlation between two groups of variables by determining a linear combination of the variables in each group such that the correlation between them is maximized. This is the first study to use CCA to establish concurrent validity in settings where a multivariate comparison of two groups of variables of possibly different sizes is required; as such, it overcomes the limitations of commonly employed single comparisons. Although virtually all SCI-related assessments in clinical research are groups of variables, CCA has not been applied for this purpose before. In addition, CCA allows the evaluation of potential redundancy between clinical assessments and the identification of minimal sets of assessments that provide the highest concurrent validity to assess functional recovery. These findings will be helpful for selecting assessments in clinical studies.

CCA for measuring concurrent validity

Previous studies mainly relied on Spearman and Pearson correlations, using as external reference the FIM,7, 8, 9, 10, 11, 14, 15, 16 the locomotor FIM,17 the S-LEMS15, 16, 17 or other walking assessments.12, 13

Our results for patient subgroup I of less impaired patients indicate that 12 months after injury, SCIM3b, WISCI, 10MWT and 6MWT all had similarly high concurrent validity with respect to I-LEMS or S-LEMS. Only SCIM3a exhibited an overall poor concurrent validity.

Itzkovich et al.,7 Glass et al.,8 and Bluvshtein et al.11 investigated concurrent validity of SCIM3 on admission, and all found a high concurrent validity, where their judgment is based on Pearson correlations with FIM (0.79 and 0.779 for two raters in Itzkovich et al.,7 0.798 and 0.782 for two raters in Glass et al.,8 and 0.839 and 0.835 for two raters in Bluvshtein et al.11). Invernizzi et al.9 and Anderson et al.10 also investigated the concurrent validity at discharge and found correlations of 0.91 and 0.80, respectively. Differences between their results and ours might stem from their choice of reference assessments, and from their patient groups potentially differing from our patient subgroup I, as they did not require patients to be able to perform timed walking assessments. Moreover, their studies did not divide the SCIM3 into its subassessments SCIM3a and SCIM3b. The only study that also looked into SCIM subassessments is Invernizzi et al.,9 who found a slightly higher concurrent validity with respect to FIM for SCIM3b (0.92) than for SCIM3a (0.82). Their results do not attribute as low a concurrent validity to SCIM3a as ours do, but they also show that differentiating between these subassessments is important.

In line with our results, WISCI was also judged to have good validity by Ditunno et al.14 (Spearman correlation with FIM across nine professionals of 0.765), and Morganti et al.15 (Spearman correlation with LEMS of 0.58). Ditunno et al.16 and Ditunno et al.17 gave results for the comparison with LEMS 12 months after injury and found Spearman correlations of 0.88 and 0.91, respectively. However, their patient groups again potentially differ from ours.

Van Hedel et al.13 attested validity to all timed walking assessments, due to their high Pearson correlations (above 0.88) with each other, and slightly lower Spearman correlations with WISCI (above 0.60).

Finally for this patient subgroup I, our results indicate that groups of three walking assessments already yield high canonical correlations, being as valid as all six walking assessments together. The subgroup of untimed walking assessments showed in general a higher validity than the subgroup of timed walking assessments.

In patient subgroup II, where we included all patients with untimed walking assessments, WISCI distinctly exhibited the highest correlation in all analyses, and WISCI alone was about as valid as the group of all three untimed walking assessments together, except in the first stage. These findings support the findings of Ditunno et al.,14 Morganti et al.,15 Ditunno et al.16 and Ditunno et al.,17 but partially contradict Itzkovich et al.,7 Glass et al.,8 Invernizzi et al.,9 Anderson et al.10 and Bluvshtein et al.,11 as their findings indicate a high validity of SCIM3. SCIM3b also exhibited a rather high concurrent validity in our study, but SCIM3a not as much. However their studies relied on FIM as the reference assessment, and did not compare the concurrent validity of SCIM3 with WISCI.

Only two studies have investigated the concept of concurrent validity in patient subgroups of better and worse walking ability. The analyses by van Hedel et al.13 showed a higher correlation between the timed walking assessments for patients with more severe impairment (WISCI 0–10) than for those with less severe impairment (WISCI 11–20), indicating a slightly lower validity in less severely impaired patients. In contrast, Amatachaya et al.12 based their subgroup definition on values of the locomotor FIM and found that 10MWT and 6MWT were more concurrently valid in patients with good walking ability. However, neither result can be directly compared with ours, as their definitions were based on a cut-off in WISCI or locomotor FIM, and all patients were still required to perform the timed walking assessments. Our definition of less impaired patients was based on the availability of timed walking assessments, as more severely impaired patients can rarely undergo these.

Comparison of CCA between patient subgroups

The CCA results were interpreted separately within the two patient subgroups. The canonical correlations for patient subgroup II, that is, across all levels of severity, were noticeably higher than those for patient subgroup I, that is, for less impaired patients. This difference may be partly due to the fact that patient subgroup II is more heterogeneous.

As in previous studies,12, 13 our findings reveal that the choice of an appropriate, in this case concurrently valid, assessment should be based on the severity and/or the level of the SCI lesion and the expected outcome of walking ability.

CCA for measuring predictive validity and external responsiveness

Although focusing on concurrent validity in this paper, the same reasoning and method can be applied to investigate predictive validity, and also external responsiveness, which is another important property of clinical assessments and an active area of clinical research.13, 29 For the latter, CCA can be used to determine how well changes in single and groups of walking assessments relate to changes in the external reference assessments (for example, I-LEMS).

Limitations and future work

Our results on concurrent validity are with respect to the selected motor scores as clinical references. These motor scores are based on the internationally recommended and endorsed ASIA protocol. Nonetheless, one should keep in mind that different reference assessments can yield different results, as seen in comparison with other studies.

The interpretation of CCA is slightly more involved than that of a single pairwise correlation, as it determines weighted linear combinations of walking assessments and I-LEMS scores. In practice, it is recommended not to focus on the coefficient vectors, but rather on the (Spearman) correlations between each original variable and the derived canonical variate (Supplementary Information).

We employed CCA based on rank correlations to take into account the ordinal nature of the data. CCA does, however, still construct a linear combination of ordinal variables. In future work, it may be interesting to investigate the use of fully nonparametric CCA in this context.

Finally, when interpreting the results, one has to keep in mind that we study two different patient subgroups—the less impaired patients and all observed patients. Within each stage and each group, we perform a complete case analysis, that is, we only consider patients without missing values. This leads to a loss of information, and possibly also to a bias, if the missingness is not at random. The amount of missingness varies across the stages. In future work, the use of imputation techniques should be investigated.

Conclusion

With the application of CCA, we could assess the concurrent validity of single and groups of walking assessments following acute SCI against lower extremities motor scores (I-LEMS and S-LEMS). Considering both timed and untimed walking assessments (patient subgroup I), we found that a group of only three walking assessments (SCIM3a, WISCI and 10MWT) provided essentially the same concurrent validity as all six walking assessments together, without being redundant. In general, a combination of untimed and timed assessments achieved the highest validity. Considering only untimed walking assessments (patient subgroup II), we found that WISCI is about as valid as all three untimed assessments together.

As clinical studies are aiming to use assessments of sufficient concurrent validity but with a limited burden to patients as well as hospital employees, our analyses provide guidance towards selecting a reduced but targeted number of walking assessments.

Data Archiving

There were no data to deposit.