A cost-aware framework for the development of AI models for healthcare applications

Abstract

Accurate artificial intelligence (AI) for disease diagnosis could lower healthcare workloads. However, when time or financial resources for gathering input data are limited, as in emergency and critical-care medicine, developing accurate AI models, which typically require inputs for many clinical variables, may be impractical. Here we report a model-agnostic cost-aware AI (CoAI) framework for the development of predictive models that optimize the trade-off between prediction performance and feature cost. By using three datasets, each including thousands of patients, we show that relative to clinical risk scores, CoAI substantially reduces the cost and improves the accuracy of predicting acute traumatic coagulopathy in a pre-hospital setting, mortality in intensive-care patients and mortality in outpatient settings. We also show that CoAI outperforms state-of-the-art cost-aware prediction strategies in terms of predictive performance, model cost, training time and robustness to feature-cost perturbations. CoAI uses axiomatic feature-attribution methods for the estimation of feature importance and decouples feature selection from model training, thus allowing for a faster and more flexible adaptation of AI models to new feature costs and prediction budgets.
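
The abstract describes estimating feature importance with attribution methods and then choosing features under a cost budget, separately from model training. A minimal sketch of that selection step follows, assuming a greedy importance-per-cost heuristic (the abstract does not specify the exact selection procedure). The feature names, importance scores, costs and the `select_features` helper are all hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    importance: float  # hypothetical attribution-based importance score
    cost: float        # hypothetical acquisition cost; assumed strictly positive


def select_features(features, budget):
    """Greedily pick features by importance per unit cost while they fit the budget."""
    ranked = sorted(features, key=lambda f: f.importance / f.cost, reverse=True)
    chosen, spent = [], 0.0
    for f in ranked:
        # Skip any feature that would push the total cost over the budget.
        if spent + f.cost <= budget:
            chosen.append(f.name)
            spent += f.cost
    return chosen, spent


# Illustrative values only; not taken from the paper.
candidates = [
    Feature("age", importance=0.30, cost=1.0),
    Feature("heart_rate", importance=0.20, cost=2.0),
    Feature("lactate", importance=0.40, cost=10.0),
    Feature("gcs", importance=0.10, cost=4.0),
]

# Selection is independent of model training, so a new budget only
# requires re-running this step, not re-estimating importances.
low_budget, low_cost = select_features(candidates, budget=3.0)
high_budget, high_cost = select_features(candidates, budget=13.0)
```

Under these assumptions, any model could then be retrained on the chosen subset, and a change in feature costs or budget would only require repeating the selection call.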

Fig. 1: Overview of CoAI framework. Clinical features are annotated on the basis of two different sources.
Fig. 2: Summary of clinical datasets.
Fig. 3: CoAI improves prediction performance and model cost over existing clinical models and AI methods.
Fig. 4: CoAI provides improved robustness and training complexity over competitor methods.
Fig. 5: Importance of features selected by CoAI and other risk scores, ranked by order added to the model for CoAI and by regression coefficient or feature importance for others.

Data availability

Two of our three datasets—the ICU and outpatient datasets—are publicly available. The ICU dataset was published in ref. 40 and is available from the MIT eICU Collaborative Research Database (https://eicu-crd.mit.edu/gettingstarted/overview/) but requires approval before download. The outpatient dataset is a subset of the NHANES I study (ref. 22) and was published in its current format in ref. 32. It is also uploaded to our GitHub repository (https://github.com/suinleelab/coai) along with our code. The trauma dataset is not publicly available owing to patient privacy concerns.

Code availability

Code implementing CoAI is available at https://github.com/suinleelab/coai. The repository also includes notebooks reproducing the results that do not rely on the trauma dataset, including performance and feature importance for CoAI and existing mortality risk scores on the ICU dataset, and comparisons with existing low-cost AI methods on the outpatient dataset.

References

  1. MDCalc. Frequently Asked Questions https://www.mdcalc.com/faq (2019).

  2. Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017).

  3. Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410 (2016).

  4. Hannun, A. Y. et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 25, 65–69 (2019).

  5. Lipton, Z. C., Kale, D. C., Elkan, C. & Wetzel, R. Learning to diagnose with LSTM recurrent neural networks. In 4th International Conference on Learning Representations (ICLR, 2016).

  6. Lundberg, S. M. et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat. Biomed. Eng. 2, 749–760 (2018).

  7. Trauma In Washington State: A Chart Report of the First 15 Years, 1995–2009 (Washington State Department of Health, 2011).

  8. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

  9. Peter, S., Diego, F., Hamprecht, F. A. & Nadler, B. in Advances in Neural Information Processing Systems (eds Guyon, I. et al.) 1551–1561 (NIPS, 2017).

  10. Janisch, J., Pevný, T. & Lisý, V. Classification with costly features using deep reinforcement learning. In Proc. AAAI Conference on Artificial Intelligence Vol. 33, 3959–3966 (AAAI Press, 2019).

  11. Janisch, J., Pevný, T. & Lisý, V. Classification with costly features as a sequential decision-making problem. Mach. Learn. 109, 1587–1615 (2020).

  12. Frith, D. et al. Definition and drivers of acute traumatic coagulopathy: clinical and experimental investigations. J. Thromb. Haemost. 8, 1919–1925 (2010).

  13. Mitra, B., Cameron, P. A., Mori, A. & Fitzgerald, M. Acute coagulopathy and early deaths post major trauma. Injury 43, 22–25 (2012).

  14. Brohi, K., Cohen, M. J. & Davenport, R. A. Acute coagulopathy of trauma: mechanism, identification and effect. Curr. Opin. Crit. Care 13, 680–685 (2007).

  15. Gando, S. & Hayakawa, M. Pathophysiology of trauma-induced coagulopathy and management of critical bleeding requiring massive transfusion. Semin. Thromb. Hemost. 42, 155–165 (2016).

  16. Davenport, R. et al. Functional definition and characterisation of acute traumatic coagulopathy. Crit. Care Med. 39, 2652–2658 (2011).

  17. Peltan, I. D. et al. Development and validation of a prehospital prediction model for acute traumatic coagulopathy. Crit. Care 20, 371 (2016).

  18. Mitra, B. et al. Early prediction of acute traumatic coagulopathy. Resuscitation 82, 1208–1213 (2011).

  19. Halpern, N. Critical Care Statistics (Society of Critical Care Medicine, 2019); https://www.sccm.org/Communications/Critical-Care-Statistics

  20. Johnson, A. E., Kramer, A. A. & Clifford, G. D. A new severity of illness scale using a subset of acute physiology and chronic health evaluation data elements shows comparable predictive accuracy. Crit. Care Med. 41, 1711–1718 (2013).

  21. Seymour, C. W. et al. Assessment of clinical criteria for sepsis: for the third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 315, 762–774 (2016).

  22. Miller, H. W. Plan and Operation of the Health and Nutrition Examination Survey, United States, 1971–1973 (Department of Health, Education and Welfare, 1973).

  23. Christakis, N. A. & Iwashyna, T. J. Attitude and self-reported practice regarding prognostication in a national sample of internists. Arch. Intern. Med. 158, 2389–2395 (1998).

  24. Rui, P. & Okeyode, T. National Ambulatory Medical Care Survey: 2016 National Summary Tables (National Center for Health Statistics, 2016).

  25. Lee, S. J., Lindquist, K., Segal, M. R. & Covinsky, K. E. Development and validation of a prognostic index for 4-year mortality in older adults. JAMA 295, 801–808 (2006).

  26. du Bois, R. M. et al. Ascertainment of individual risk of mortality for patients with idiopathic pulmonary fibrosis. Am. J. Respir. Crit. Care Med. 184, 459–466 (2011).

  27. Celli, B. R. et al. The body-mass index, airflow obstruction, dyspnea, and exercise capacity index in chronic obstructive pulmonary disease. N. Engl. J. Med. 350, 1005–1012 (2004).

  28. Vazirani, V. V. Approximation Algorithms (Springer Science & Business Media, 2013).

  29. Perron, L. & Furnon, V. OR-Tools 7.2 (Google, 2019); https://developers.google.com/optimization/

  30. Covert, I. & Lee, S. I. Improving KernelSHAP: practical Shapley value estimation using linear regression. In International Conference on Artificial Intelligence and Statistics 3457–3465 (PMLR, 2021).

  31. Covert, I., Lundberg, S. & Lee, S.-I. Understanding global feature contributions with additive importance measures. In Advances in Neural Information Processing Systems 17212–17223 (NeurIPS, 2020).

  32. Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020).

  33. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).

  34. Chen, H., Lundberg, S. & Lee, S.-I. Explaining models by propagating Shapley values of local components. Preprint at https://arxiv.org/abs/1911.11888v1 (2019).

  35. Saar-Tsechansky, M. & Provost, F. Handling missing values when applying classification models. J. Mach. Learn. Res. 8, 1623–1657 (2007).

  36. Li, K. et al. A machine learning–based model to predict acute traumatic coagulopathy in trauma patients upon emergency hospitalization. Clin. Appl. Thromb. Hemost. 26, 1076029619897827 (2020).

  37. Nunez, T. C. et al. Early prediction of massive transfusion in trauma: simple as ABC (assessment of blood consumption)? J. Trauma Acute Care Surg. 66, 346–352 (2009).

  38. Buuren, S. V. & Groothuis-Oudshoorn, K. mice: multivariate imputation by chained equations in R. J. Stat. Softw. 45, 1–68 (2010).

  39. Wheeler, A. R. et al. Development of prehospital assessment findings associated with massive transfusion. Transfusion 60, S70–S76 (2020).

  40. Pollard, T. J. et al. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci. Data 5, 180178 (2018).

  41. Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30, 3146–3154 (2017).

  42. Clinical Laboratory Fee Schedule Files - CY 2019 Q3 Release (Centers for Medicare and Medicaid Services, 2019); https://cms.gov/Medicare/Medicare-Fee-for-Service-Payment/ClinicalLabFeeSched/Clinical-Laboratory-Fee-Schedule-Files.html

  43. Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 3319–3328 (PMLR, 2017).

  44. Friedman, J. H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).

  45. Nelder, J. A. & Wedderburn, R. W. Generalized linear models. J. R. Stat. Soc. A 135, 370–384 (1972).

  46. Hastie, T. et al. The Elements of Statistical Learning Vol. 1 (Springer, 2001).

  47. Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45166.pdf (2015).

  48. Chollet, F. et al. Keras https://keras.io (2015).

  49. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58, 267–288 (1996).

  50. Early, K., Fienberg, S. E. & Mankoff, J. Test time feature ordering with focus: interactive predictions with minimal user burden. In Proc. 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing 992–1003 (ACM, 2016).

  51. Nan, F., Wang, J. & Saligrama, V. Pruning random forests for prediction on a budget. Adv. Neural Inf. Process. Syst. 29, 2334–2342 (2016).

  52. Nan, F. & Saligrama, V. Adaptive classification for prediction under a budget. Adv. Neural Inf. Process. Syst. 30, 4727–4737 (2017).

  53. Peng, Y.-S., Tang, K.-F., Lin, H.-T. & Chang, E. Refuel: exploring sparse features in deep reinforcement learning for fast disease diagnosis. Adv. Neural Inf. Process. Syst. 31, 7322–7331 (2018).

  54. Kachuee, M., Goldstein, O., Kärkkäinen, K., Darabi, S. & Sarrafzadeh, M. Opportunistic learning: budgeted cost-sensitive learning from data streams. In 7th International Conference on Learning Representations (ICLR) 2019 (OpenReview.net, 2019); https://openreview.net/forum?id=S1eOHo09KX/

  55. Kachuee, M., Karkkainen, K., Goldstein, O., Zamanzadeh, D. & Sarrafzadeh, M. Cost-sensitive diagnosis and learning leveraging public health data. Preprint at https://arxiv.org/abs/1902.07102v2 (2019).

Acknowledgements

This work was funded by the National Science Foundation (CAREER DBI-1552309 and DBI-1759487), the American Cancer Society (127332-RSG-15-097-01-TBG) and the National Institutes of Health (F30 HL 151074, R35 GM 128638 and R01 NIA AG 061132). We thank S. Lundberg, I. Covert and the Lee Lab for helpful discussions about this paper.

Author information

Contributions

G.E., J.D.J., N.J.W. and S.-I.L. conceived and designed the study. G.E. performed experiments and created figures. G.E., C.H., R.B.U., A.M.M., M.R.S. and N.J.W. implemented EMS surveys and gathered feature costs. G.E., J.D.J., N.J.W. and S.-I.L. wrote the manuscript. S.-I.L. and N.J.W. jointly supervised the research. S.-I.L. secured the funding for the study.

Corresponding authors

Correspondence to Nathan J. White or Su-In Lee.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Biomedical Engineering thanks Omar Badawi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Erion, G., Janizek, J.D., Hudelson, C. et al. A cost-aware framework for the development of AI models for healthcare applications. Nat. Biomed. Eng. 6, 1384–1398 (2022). https://doi.org/10.1038/s41551-022-00872-8

  • DOI: https://doi.org/10.1038/s41551-022-00872-8
