Abstract
Accurate artificial intelligence (AI) for disease diagnosis could lower healthcare workloads. However, when time or financial resources for gathering input data are limited, as in emergency and critical-care medicine, developing accurate AI models, which typically require inputs for many clinical variables, may be impractical. Here we report a model-agnostic cost-aware AI (CoAI) framework for the development of predictive models that optimize the trade-off between prediction performance and feature cost. By using three datasets, each including thousands of patients, we show that relative to clinical risk scores, CoAI substantially reduces the cost and improves the accuracy of predicting acute traumatic coagulopathy in a pre-hospital setting, mortality in intensive-care patients and mortality in outpatient settings. We also show that CoAI outperforms state-of-the-art cost-aware prediction strategies in terms of predictive performance, model cost, training time and robustness to feature-cost perturbations. CoAI uses axiomatic feature-attribution methods for the estimation of feature importance and decouples feature selection from model training, thus allowing for a faster and more flexible adaptation of AI models to new feature costs and prediction budgets.
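The trade-off between feature importance and feature cost described above can be illustrated with a small sketch. It assumes that per-feature importance scores (for example, mean absolute attributions) have already been computed, and it selects features greedily by importance per unit cost until a budget is exhausted. The greedy rule, the function name `select_features`, and all feature names, importances and costs are illustrative assumptions, not the selection procedure or the data of the paper.

```python
from typing import Dict, List


def select_features(importance: Dict[str, float],
                    cost: Dict[str, float],
                    budget: float) -> List[str]:
    """Greedily pick features by importance per unit cost within a budget.

    Assumes every cost is strictly positive. This is a simple knapsack
    heuristic used for illustration only.
    """
    # Rank features by how much importance each unit of cost buys.
    ranked = sorted(importance, key=lambda f: importance[f] / cost[f], reverse=True)
    chosen: List[str] = []
    spent = 0.0
    for name in ranked:
        # Take the feature only if it still fits in the remaining budget.
        if spent + cost[name] <= budget:
            chosen.append(name)
            spent += cost[name]
    return chosen


# Hypothetical importances and costs (e.g. seconds needed to collect each input).
importance = {"heart_rate": 0.30, "age": 0.10, "lactate": 0.45, "gcs": 0.25}
cost = {"heart_rate": 5.0, "age": 1.0, "lactate": 60.0, "gcs": 20.0}

selected = select_features(importance, cost, budget=30.0)
print(selected)
```

Because selection in this sketch depends only on the importance and cost tables, changing the costs or the budget requires rerunning only the selection step before a model is fitted on the chosen features, which mirrors the decoupling of feature selection from model training that the abstract describes.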
Data availability
Two of our three datasets—the ICU and outpatient datasets—are publicly available. The ICU dataset was published in ref. 40 and is available from the MIT eICU Collaborative Research Database (https://eicu-crd.mit.edu/gettingstarted/overview/) but requires approval before download. The outpatient dataset is a subset of the NHANES I study (ref. 22) and was published in its current format in ref. 32. It is also uploaded to our GitHub repository (https://github.com/suinleelab/coai) along with our code. The trauma dataset is not publicly available owing to patient privacy concerns.
Code availability
Code implementing CoAI is available at https://github.com/suinleelab/coai. The repository also includes notebooks reproducing the results that do not rely on the trauma dataset, including performance and feature importance for CoAI and existing mortality risk scores on the ICU dataset, and comparisons with existing low-cost AI methods on the outpatient dataset.
References
MDCalc. Frequently Asked Questions https://www.mdcalc.com/faq (2019).
Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017).
Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410 (2016).
Hannun, A. Y. et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 25, 65–69 (2019).
Lipton, Z. C., Kale, D. C., Elkan, C. & Wetzel, R. Learning to diagnose with LSTM recurrent neural networks. In 4th International Conference on Learning Representations (ICLR, 2016).
Lundberg, S. M. et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat. Biomed. Eng. 2, 749–760 (2018).
Trauma In Washington State: A Chart Report of the First 15 Years, 1995–2009 (Washington State Department of Health, 2011).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Peter, S., Diego, F., Hamprecht, F. A. & Nadler, B. in Advances in Neural Information Processing Systems (eds Guyon, I. et al.) 1551–1561 (NIPS, 2017).
Janisch, J., Pevný, T. & Lisý, V. Classification with costly features using deep reinforcement learning. In Proc. AAAI Conference on Artificial Intelligence Vol. 33, 3959–3966 (AAAI Press, 2019).
Janisch, J., Pevný, T. & Lisý, V. Classification with costly features as a sequential decision-making problem. Mach. Learn. 109, 1587–1615 (2020).
Frith, D. et al. Definition and drivers of acute traumatic coagulopathy: clinical and experimental investigations. J. Thromb. Haemost. 8, 1919–1925 (2010).
Mitra, B., Cameron, P. A., Mori, A. & Fitzgerald, M. Acute coagulopathy and early deaths post major trauma. Injury 43, 22–25 (2012).
Brohi, K., Cohen, M. J. & Davenport, R. A. Acute coagulopathy of trauma: mechanism, identification and effect. Curr. Opin. Crit. Care 13, 680–685 (2007).
Gando, S. & Hayakawa, M. Pathophysiology of trauma-induced coagulopathy and management of critical bleeding requiring massive transfusion. Semin. Thromb. Hemost. 42, 155–165 (2016).
Davenport, R. et al. Functional definition and characterisation of acute traumatic coagulopathy. Crit. Care Med. 39, 2652–2658 (2011).
Peltan, I. D. et al. Development and validation of a prehospital prediction model for acute traumatic coagulopathy. Crit. Care 20, 371 (2016).
Mitra, B. et al. Early prediction of acute traumatic coagulopathy. Resuscitation 82, 1208–1213 (2011).
Halpern, N. Critical Care Statistics (Society of Critical Care Medicine, 2019); https://www.sccm.org/Communications/Critical-Care-Statistics
Johnson, A. E., Kramer, A. A. & Clifford, G. D. A new severity of illness scale using a subset of acute physiology and chronic health evaluation data elements shows comparable predictive accuracy. Crit. Care Med. 41, 1711–1718 (2013).
Seymour, C. W. et al. Assessment of clinical criteria for sepsis: for the third international consensus definitions for sepsis and septic shock (sepsis-3). JAMA 315, 762–774 (2016).
Miller, H. W. Plan and Operation of the Health and Nutrition Examination Survey, United States, 1971–1973 (Department of Health, Education and Welfare, 1973).
Christakis, N. A. & Iwashyna, T. J. Attitude and self-reported practice regarding prognostication in a national sample of internists. Arch. Intern. Med. 158, 2389–2395 (1998).
Rui, P. & Okeyode, T. National Ambulatory Medical Care Survey: 2016 National Summary Tables (National Center for Health Statistics, 2016).
Lee, S. J., Lindquist, K., Segal, M. R. & Covinsky, K. E. Development and validation of a prognostic index for 4-year mortality in older adults. JAMA 295, 801–808 (2006).
du Bois, R. M. et al. Ascertainment of individual risk of mortality for patients with idiopathic pulmonary fibrosis. Am. J. Respir. Crit. Care Med. 184, 459–466 (2011).
Celli, B. R. et al. The body-mass index, airflow obstruction, dyspnea, and exercise capacity index in chronic obstructive pulmonary disease. N. Engl. J. Med. 350, 1005–1012 (2004).
Vazirani, V. V. Approximation Algorithms (Springer Science & Business Media, 2013).
Perron, L. & Furnon, V. OR-Tools 7.2 (Google, 2019); https://developers.google.com/optimization/
Covert, I. & Lee, S.-I. Improving KernelSHAP: practical Shapley value estimation using linear regression. In International Conference on Artificial Intelligence and Statistics 3457–3465 (PMLR, 2021).
Covert, I., Lundberg, S. & Lee, S.-I. Understanding global feature contributions with additive importance measures. In Advances in Neural Information Processing Systems 17212–17223 (NeurIPS, 2020).
Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
Chen, H., Lundberg, S. & Lee, S.-I. Explaining models by propagating Shapley values of local components. Preprint at https://arxiv.org/abs/1911.11888v1 (2019).
Saar-Tsechansky, M. & Provost, F. Handling missing values when applying classification models. J. Mach. Learn. Res. 8, 1623–1657 (2007).
Li, K. et al. A machine learning–based model to predict acute traumatic coagulopathy in trauma patients upon emergency hospitalization. Clin. Appl. Thromb. Hemost. 26, 1076029619897827 (2020).
Nunez, T. C. et al. Early prediction of massive transfusion in trauma: simple as ABC (assessment of blood consumption)? J. Trauma Acute Care Surg. 66, 346–352 (2009).
Buuren, S. V. & Groothuis-Oudshoorn, K. mice: multivariate imputation by chained equations in R. J. Stat. Softw. 45, 1–68 (2010).
Wheeler, A. R. et al. Development of prehospital assessment findings associated with massive transfusion. Transfusion 60, S70–S76 (2020).
Pollard, T. J. et al. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci. Data 5, 180178 (2018).
Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30, 3146–3154 (2017).
Clinical Laboratory Fee Schedule Files - CY 2019 Q3 Release (Centers for Medicare and Medicaid Services, 2019); https://cms.gov/Medicare/Medicare-Fee-for-Service-Payment/ClinicalLabFeeSched/Clinical-Laboratory-Fee-Schedule-Files.html
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 3319–3328 (PMLR, 2017).
Friedman, J. H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).
Nelder, J. A. & Wedderburn, R. W. Generalized linear models. J. R. Stat. Soc. A 135, 370–384 (1972).
Hastie, T. et al. The Elements of Statistical Learning Vol. 1 (Springer, 2001).
Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45166.pdf (2015).
Chollet, F. et al. Keras https://keras.io (2015).
Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58, 267–288 (1996).
Early, K., Fienberg, S. E. & Mankoff, J. Test time feature ordering with focus: interactive predictions with minimal user burden. In Proc. 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing 992–1003 (ACM, 2016).
Nan, F., Wang, J. & Saligrama, V. Pruning random forests for prediction on a budget. Adv. Neural Inf. Proc. Syst. 29, 2334–2342 (2016).
Nan, F. & Saligrama, V. Adaptive classification for prediction under a budget. Adv. Neural Inf. Proc. Syst. 30, 4727–4737 (2017).
Peng, Y.-S., Tang, K.-F., Lin, H.-T. & Chang, E. Refuel: exploring sparse features in deep reinforcement learning for fast disease diagnosis. Adv. Neural Inf. Proc. Syst. 31, 7322–7331 (2018).
Kachuee, M., Goldstein, O., Kärkkäinen, K., Darabi, S. & Sarrafzadeh, M. Opportunistic learning: budgeted cost-sensitive learning from data streams. In 7th International Conference on Learning Representations (ICLR) 2019 (OpenReview.net, 2019); https://openreview.net/forum?id=S1eOHo09KX/
Kachuee, M., Kärkkäinen, K., Goldstein, O., Zamanzadeh, D. & Sarrafzadeh, M. Cost-sensitive diagnosis and learning leveraging public health data. Preprint at https://arxiv.org/abs/1902.07102v2 (2019).
Acknowledgements
This work was funded by the National Science Foundation (CAREER DBI-1552309 and DBI-1759487), the American Cancer Society (127332-RSG-15-097-01-TBG) and the National Institutes of Health (F30 HL 151074, R35 GM 128638 and R01 NIA AG 061132). We thank S. Lundberg, I. Covert and the Lee Lab for helpful discussions about this paper.
Author information
Contributions
G.E., J.D.J., N.J.W. and S.-I.L. conceived and designed the study. G.E. performed experiments and created figures. G.E., C.H., R.B.U., A.M.M., M.R.S. and N.J.W. implemented EMS surveys and gathered feature costs. G.E., J.D.J., N.J.W. and S.-I.L. wrote the manuscript. S.-I.L. and N.J.W. jointly supervised the research. S.-I.L. secured the funding for the study.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Biomedical Engineering thanks Omar Badawi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Main Supplementary Information
Supplementary figures and survey documents.
About this article
Cite this article
Erion, G., Janizek, J.D., Hudelson, C. et al. A cost-aware framework for the development of AI models for healthcare applications. Nat. Biomed. Eng 6, 1384–1398 (2022). https://doi.org/10.1038/s41551-022-00872-8
DOI: https://doi.org/10.1038/s41551-022-00872-8