We would like to thank S. Roedder and N. Salomonis for their comments (Loaded, locked, drawn: kSORT validated for patient samples. Nat. Rev. Nephrol. http://dx.doi.org/10.1038/nrneph.2016.160)1 on our News & Views article (Transplantation: biomarkers in transplantation — the devil is in the detail. Nat. Rev. Nephrol. 11, 204–205; 2015)2, and we appreciate the opportunity to further address the points we raised regarding their paper, which described the development of an assay for the detection of acute rejection in kidney transplant recipients3.
In our article, we pointed out that with appropriate prevalence adjustments, the positive predictive value (PPV) of the Kidney Solid Organ Response Test (kSORT) for acute rejection and for subclinical acute rejection would be substantially reduced2. We note that Roedder and Salomonis agree with us that performance metrics, including PPV, depend on prevalence1. However, we are perplexed by their statement that “the true prevalence of acute rejection cannot be fairly determined,” and that the prevalence of subclinical acute rejection is also not defined because it “might be missed in patients who never undergo protocol biopsies”. The prevalence of acute rejection is well established4 (Fig. 1), and recent studies have further defined the prevalence of subclinical acute rejection5. The issue of subclinical acute rejection is of particular interest, as their data show a low sensitivity and a modest C statistic of the kSORT assay for subclinical acute rejection3. We contend that unless this issue is addressed through further validation and refinement of the assay, the proposed models risk miscategorization of biomarkers.
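The dependence of PPV on prevalence follows directly from Bayes' rule and can be sketched in a few lines; the sensitivity, specificity, and prevalence values below are purely illustrative and are not the kSORT estimates:

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# At the enriched rejection prevalence typical of a case-control
# discovery cohort, an accurate test looks reassuring:
print(round(ppv(0.90, 0.90, 0.35), 2))  # 0.83

# At a lower, more realistic clinical prevalence, the same sensitivity
# and specificity yield a much lower PPV:
print(round(ppv(0.90, 0.90, 0.10), 2))  # 0.5
```

With sensitivity and specificity held fixed, moving from an enriched study prevalence to a realistic clinical prevalence is enough to erode the PPV markedly, which is the prevalence adjustment referred to above.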
We also commented on the necessity of locking down an algorithm to ensure reproducibility. An algorithm is a process, a set of rules to be followed in all subsequent tests, and not, as Roedder and Salomonis suggest, an assurance that the discovery set is locked. Most molecular diagnostic tests based on gene expression use a single fixed, or locked, model to predict a given phenotype. For example, the AlloMap test uses a single fixed model: a linear discriminant analysis (LDA) that fits a linear equation to the measured expression of 11 genes6. Similarly, a three-gene urinary PCR test uses logistic regression with backward elimination and bootstrap-resampling methods to derive a best-fitting, fixed, parsimonious model for the prediction of rejection7.
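What a locked model means in practice can be shown with a minimal sketch: a fixed equation applied unchanged to every new sample. The gene names, weights, and threshold below are hypothetical illustrations, not the AlloMap coefficients:

```python
# A 'locked' classifier is a frozen equation: the weights, intercept
# and decision threshold never change after validation.
# All values below are hypothetical, for illustration only.
WEIGHTS = {"GENE_A": 1.2, "GENE_B": -0.8, "GENE_C": 0.5}
INTERCEPT = -0.3
THRESHOLD = 0.0

def score(expression):
    """Fixed linear discriminant score from gene-expression values."""
    return INTERCEPT + sum(w * expression[g] for g, w in WEIGHTS.items())

def classify(expression):
    """Apply the locked rule: same equation, same cut-off, every time."""
    return "rejection" if score(expression) > THRESHOLD else "no rejection"

print(classify({"GENE_A": 1.0, "GENE_B": 0.2, "GENE_C": 0.4}))
```

The point is that reproducibility comes from freezing the entire rule, not merely the data used to discover it.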
In sharp contrast to the AlloMap and urinary PCR tests, kSORT uses 13 individual models, each consisting of 12–17 genes. Roedder and colleagues created an ensemble classifier of all 13 models using a commonly used voting procedure8,9,10,11, but they applied this procedure in an unorthodox way, without justification. Normally, an ensemble classifier applies different algorithms (support vector machines, random forests, LDA, and so on) to the discovery data to predict a phenotype, which gives the classifier an advantage by sampling the range of predictive capabilities of different algorithms9. By contrast, the kSORT analysis suite (kSAS) applies the same Pearson correlation-based algorithm repeatedly in a modified voting procedure that determines the final phenotype, suggesting that a single fixed and locked model capable of yielding acceptable predictive accuracy does not exist. The authors do not provide any clarity around the assumptions and performance metrics used to determine the best set of gene features and the number of models applied. Moreover, they provide little evidence that the test can accurately predict the correct phenotype when the sample is tested blind, a requirement for any clinical test. In fact, by their own account, the final tool required an intensive internal training process mixing investigator-selected samples of known phenotypes from multiple clinical sources, creating a potential for substantial imposed user bias in reporting performance. Finally, despite the authors' claims, we remain unable to find a description of the exact thresholds for the optimization of the correlation coefficients used in each individual model in the manuscript or supplementary data3.
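The generic voting procedure discussed above can be sketched as follows; this is a schematic of ensemble voting in general, not the kSAS implementation, and the agreement threshold and class labels are our own illustrative choices:

```python
from collections import Counter

def vote(calls, agreement=0.8):
    """Combine per-model calls into one prediction.

    In a conventional ensemble, `calls` would come from different
    algorithms (SVM, random forest, LDA, ...). The final call is the
    majority class, flagged as 'indeterminate' when agreement among
    the models falls below the chosen threshold. The threshold of 0.8
    is illustrative, not taken from the kSORT paper.
    """
    top, n = Counter(calls).most_common(1)[0]
    return top if n / len(calls) >= agreement else "indeterminate"

# 13 per-model calls, as in kSORT's 13-model ensemble:
print(vote(["AR"] * 11 + ["No-AR"] * 2))  # high agreement -> "AR"
print(vote(["AR"] * 7 + ["No-AR"] * 6))   # split vote -> "indeterminate"
```

The conventional rationale for such voting is diversity among the constituent models; re-running one and the same algorithm 13 times forgoes that advantage, which is the crux of our concern.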
Another remaining concern about kSORT is that indeterminate samples (classified as 'intermediate risk'; 15 of 100 samples) seem to have been left out of the area under the curve (AUC) calculations. For indeterminate calls, one has to assume low accuracy, as these samples fall below the established threshold. Including them in the calculations would therefore result in a substantially lower AUC than the published values. In their correspondence1, the authors cite the AUC for subclinical acute rejection of 0.73 reported for the kSORT assay in the ESCAPE study12, which was co-authored by Roedder and Sarwal (senior author on the original kSORT study3), as evidence for the validity of kSORT. Sarwal recently stated, however, that “biomarker panels developed for graft rejection and tolerance in recent studies provide [receiver operating characteristic (ROC)] curves of >85%”, citing kSORT and the AART study3, and that the AUC “serves as estimated index of overall accuracy and serves a useful practice to compare different ROCs”13.
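The effect of excluding indeterminate calls can be sketched with the rank-based (Mann–Whitney) formulation of the AUC; the scores below are purely illustrative and are not taken from the kSORT data:

```python
def auc(pos, neg):
    """Mann-Whitney AUC: probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative one."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative scores only: confident calls are well separated.
pos = [0.9, 0.8, 0.8, 0.7]   # rejection samples, high assay scores
neg = [0.2, 0.3, 0.1, 0.4]   # no-rejection samples, low assay scores
print(auc(pos, neg))         # indeterminates excluded -> perfect 1.0

# Re-including indeterminate samples at chance-level scores drags the
# AUC back toward 0.5, lowering the headline figure.
pos_all = pos + [0.5, 0.5]
neg_all = neg + [0.5, 0.5]
print(auc(pos_all, neg_all))
```

Discarding the hardest-to-call samples before computing the AUC inflates the apparent discrimination; any honest summary statistic has to account for them.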
Although the correspondence from Roedder and Salomonis is a welcome addition to further this dialogue, we do not feel that their argument offers evidence of any “misunderstanding” or misrepresentation of kSORT or the kSAS algorithm in our original commentary2. Rather, their correspondence offers further technical information that is consistent with our original concerns, without providing the additional clarity that is needed. We do, however, enthusiastically applaud their ongoing validation studies and their efforts to better the lives of transplant recipients.
Change history
30 November 2016
In the html version of this article originally published online, Bruce Kaplan's current affiliation was incorrectly stated in his biography. This error has now been corrected.
References
1. Roedder, S. & Salomonis, N. Loaded, locked, drawn: kSORT validated for patient samples. Nat. Rev. Nephrol. http://dx.doi.org/10.1038/nrneph.2016.160 (2016).
2. Abecassis, M. & Kaplan, B. Transplantation: biomarkers in transplantation — the devil is in the detail. Nat. Rev. Nephrol. 11, 204–205 (2015).
3. Roedder, S. et al. The kSORT assay to detect renal transplant patients at high risk for acute rejection: results of the multicenter AART study. PLoS Med. 11, e1001759 (2014).
4. Moreso, F. et al. Subclinical rejection associated with chronic allograft nephropathy in protocol biopsies as a risk factor for late graft loss. Am. J. Transplant. 6, 747–752 (2006).
5. Loupy, A. et al. Subclinical rejection phenotypes at 1 year post-transplant and outcome of kidney allografts. J. Am. Soc. Nephrol. 26, 1721–1731 (2015).
6. Deng, M. C. et al. Noninvasive discrimination of rejection in cardiac allograft recipients using gene expression profiling. Am. J. Transplant. 6, 150–160 (2006).
7. Suthanthiran, M. et al. Urinary-cell mRNA profile and acute cellular rejection in kidney allografts. N. Engl. J. Med. 369, 20–31 (2013).
8. Dudoit, S. & Fridlyand, J. Bagging to improve the accuracy of a clustering procedure. Bioinformatics 19, 1090–1099 (2003).
9. Gunther, O. P. et al. A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers. BMC Bioinformatics 13, 326 (2012).
10. Rokach, L. Ensemble-based classifiers. Artif. Intell. Rev. 33, 1–39 (2010).
11. Segovia, F. et al. Combining feature extraction methods to assist the diagnosis of Alzheimer's disease. Curr. Alzheimer Res. 13, 831–837 (2016).
12. Crespo, E. et al. Molecular and functional noninvasive immune monitoring in the ESCAPE study for prediction of subclinical renal allograft rejection. Transplantation http://dx.doi.org/10.1097/TP.0000000000001287 (2016).
13. Wang, A. & Sarwal, M. M. Computational models for transplant biomarker discovery. Front. Immunol. 6, 458 (2015).
Competing interests
M.A. is co-founder and Chief Clinical Advisor of Transplant Genomics Incorporated. B.K. declares no competing interests.
Cite this article
Abecassis, M., Kaplan, B. Caveat emptor: the devil is still in the detail. Nat Rev Nephrol 13, 60 (2017). https://doi.org/10.1038/nrneph.2016.161