Introduction

Peripheral artery disease (PAD) is a chronic atherosclerotic disorder that primarily causes decreased perfusion to the lower extremities, manifesting in claudication, rest pain, and tissue loss1. Affecting over 200 million people worldwide, PAD is a major contributor to decreased quality of life, rising health care costs, limb loss, and death2,3,4,5. Lower extremity open revascularization is a surgical treatment option for PAD that has been recently demonstrated in the BEST-CLI trial to achieve superior outcomes compared to endovascular therapy for chronic limb threatening ischemia (CLTI) in patients with an adequate great saphenous vein conduit6. Nevertheless, open revascularization carries a high risk of complications, with major adverse limb event (MALE) or death occurring in over 40% of the surgical group in the BEST-CLI trial after a median follow-up of 2.7 years6. Others have shown that over 30% of patients will suffer a major adverse event within 30 days following lower extremity bypass7. As a result, the Global Vascular Guidelines recommend careful assessment of surgical risk when considering patients for revascularization8.

There are currently no widely used clinical tools to predict adverse events following lower extremity open revascularization. In the research setting, current models are limited to trauma patients9, Japanese and Finnish cohorts10,11, and prediction of groin wound infections12. Furthermore, tools such as the American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) surgical risk calculator13 and Vascular Study Group of New England (VSGNE) Cardiac Risk Index (CRI)14 use modelling techniques that require manual input of clinical variables, which deters routine use in busy medical settings15. Therefore, there is an important need to develop better and more practical surgical risk prediction tools that overcome existing limitations with automated modelling techniques, inclusion of more geographically diverse cohorts with atherosclerotic disease, and assessment of more clinically relevant outcomes such as MALE or death.

Machine learning (ML) is a rapidly advancing technology that allows computers to learn from data and make predictions16. This field has been driven by the explosion of electronic medical record data combined with increasing computational power17. Previously, ML has been applied to the ACS NSQIP database to develop algorithms that predict peri-operative complications in a pooled dataset of over 2900 unique procedures, including patients undergoing day surgery to those requiring intensive care unit admission18. Given that this cohort represents a heterogeneous surgical population, better predictive performance may be achieved by developing ML algorithms specific to patients undergoing lower extremity open revascularization. In this study, ML was applied to the ACS NSQIP database to predict 30-day MALE or death and other outcomes following lower extremity open revascularization using pre-operative data.

Methods

Ethics

All methods were carried out in accordance with the World Medical Association Declaration of Helsinki19. Institutional research ethics board review and informed patient consent were not required as the data came from a large, deidentified registry, which is an accepted practice for studies based on ACS NSQIP data20.

Design

We conducted a multicenter retrospective cohort ML-based prognostic study and findings were reported based on the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement21.

Dataset

Created in 2004, the ACS NSQIP database contains demographic, clinical, and 30-day outcomes data on surgical patients from more than 700 hospitals in approximately 15 countries worldwide22. The information is prospectively collected from electronic health records by trained and certified clinical reviewers and regularly audited by ACS for accuracy23. In 2011, targeted NSQIP registries for vascular operations were developed by vascular surgeons, which contain additional procedure-specific variables and outcomes24.

Cohort

All patients who underwent scheduled and unscheduled lower extremity infrainguinal open revascularization for chronic atherosclerotic disease between 2011 and 2021 in the ACS NSQIP targeted vascular database were included. This information was merged with the main ACS NSQIP database for a complete set of generic and procedure-specific variables and outcomes. Patients treated for lower extremity aneurysmal disease, acute limb ischemia, trauma, dissection, or malignancy, as well as those with unreported symptom status (CLTI, claudication, or asymptomatic) or undergoing concurrent major amputation were excluded.

Features

Thirty-seven pre-operative variables were used as input features for our ML models. Demographic variables included age, sex, body mass index, race, ethnicity, and origin status. Comorbidities included hypertension, diabetes, smoking status, congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), end stage renal disease (ESRD) requiring dialysis, functional status, and physiologic high-risk factor [defined as at least one of the following: (1) end stage renal disease, (2) age > 80, (3) New York Heart Association CHF class III/IV, (4) left ventricular ejection fraction < 30%, (5) unstable angina within 30 days prior to surgery, or (6) myocardial infarction (MI) within 30 days prior to surgery]. Medications included antiplatelets, statins, and beta blockers. Pre-operative laboratory investigations included serum sodium, blood urea nitrogen (BUN), serum creatinine, albumin, white blood cell count, hematocrit, platelet count, international normalized ratio (INR), and partial thromboplastin time (PTT). Limb hemodynamics based on ankle brachial index (ABI), toe pressure, and palpability of pedal pulses, as well as anatomic high-risk factors (defined as a prior bypass or endovascular intervention involving the currently treated segment) were recorded. Concurrent procedures recorded included minor amputation (below the ankle) and endovascular iliac or infrainguinal revascularization. Other pre-procedural characteristics recorded were symptom status [asymptomatic, claudication, or CLTI (defined as rest pain or tissue loss)], primary procedure including inflow, outflow, and conduit, urgency of surgery (elective, urgent, or emergent), and American Society of Anesthesiologists (ASA) class. A complete list of features and definitions can be found in Supplementary Table 1.
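The physiologic high-risk factor defined above is a simple "at least one of six criteria" rule. The following Python sketch illustrates that rule only; it is not the study code (the analyses were performed in R), and the record type and all field names are hypothetical rather than NSQIP variable names.

```python
from dataclasses import dataclass


@dataclass
class PreopRecord:
    # Hypothetical field names chosen for this illustration.
    age_years: float
    esrd: bool
    nyha_class: int              # 0 if no CHF, otherwise 1 to 4
    lvef_percent: float
    unstable_angina_30d: bool
    mi_30d: bool


def physiologic_high_risk(p: PreopRecord) -> bool:
    """True if at least one of the six criteria listed in the text is met."""
    return any([
        p.esrd,
        p.age_years > 80,
        p.nyha_class >= 3,
        p.lvef_percent < 30,
        p.unstable_angina_30d,
        p.mi_30d,
    ])


patient = PreopRecord(age_years=72, esrd=False, nyha_class=2,
                      lvef_percent=25, unstable_angina_30d=False, mi_30d=False)
print(physiologic_high_risk(patient))  # True, because LVEF is below 30%
```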

Outcomes

The primary outcome was 30-day MALE (composite of untreated loss of patency, major reintervention, or major amputation) or death. Untreated loss of patency was defined as a loss of graft patency on imaging or physical exam with no subsequent open or endovascular revascularization procedure. Major reintervention was defined as a new or revision lower extremity bypass, interposition graft revision, or bypass graft thrombectomy/thrombolysis. Major amputation was defined as a transtibial or more proximal amputation on the ipsilateral leg. Death was defined as all-cause mortality. This composite outcome was chosen because it is frequently reported as a primary outcome in landmark studies, including the BEST-CLI trial6.
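Because the primary outcome is a composite, a patient with several component events is counted once. The short Python sketch below illustrates this with made-up records; the class and field names are hypothetical and the code is not part of the study pipeline.

```python
from dataclasses import dataclass


@dataclass
class Outcome30d:
    # Hypothetical field names chosen for this illustration.
    untreated_loss_of_patency: bool = False
    major_reintervention: bool = False
    major_amputation: bool = False
    death: bool = False


def male(o: Outcome30d) -> bool:
    """Major adverse limb event: any of the three limb-related components."""
    return (o.untreated_loss_of_patency
            or o.major_reintervention
            or o.major_amputation)


def male_or_death(o: Outcome30d) -> bool:
    """Primary composite outcome: MALE or all-cause death."""
    return male(o) or o.death


# Four made-up patients; the second has two component events.
cohort = [
    Outcome30d(),
    Outcome30d(major_reintervention=True, major_amputation=True),
    Outcome30d(death=True),
    Outcome30d(untreated_loss_of_patency=True),
]

n_composite = sum(male_or_death(o) for o in cohort)
n_components = sum(o.untreated_loss_of_patency + o.major_reintervention
                   + o.major_amputation + o.death for o in cohort)
print(n_composite, n_components)  # 3 composite events, 4 component events
```

This is why the component counts in a cohort can sum to more than the number of patients with the composite outcome.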

Thirty-day secondary outcomes included individual components of the primary outcome, major adverse cardiovascular event (MACE), individual components of MACE, wound complication, bleeding requiring transfusion or secondary procedure, other morbidity, non-home discharge, and unplanned readmission. MACE was defined as a composite of MI (ischemic electrocardiogram changes, troponin elevation, or physician/advanced provider diagnosis), stroke (motor, sensory, or cognitive dysfunction persisting for > 24 h in the setting of suspected stroke), or death. Wound complication was defined as a non-healing or open wound at the surgical incision, dehiscence, or cellulitis. Other morbidity was defined as a composite of pneumonia, unplanned reintubation, pulmonary embolism (PE), failure to wean from ventilator (cumulative time of ventilator-assisted respirations > 48 h), acute kidney injury (AKI; rise in creatinine of > 2 mg/dL from pre-operative value or requirement of dialysis in a patient who did not require dialysis pre-operatively), urinary tract infection (UTI), cardiac arrest, deep vein thrombosis (DVT) requiring therapy, Clostridium difficile infection, sepsis, or septic shock. Non-home discharge was defined as discharge to rehabilitation, skilled care, or other facility. These outcomes are defined by the ACS NSQIP data dictionary25.

Model development

Six ML models were trained to predict 30-day primary and secondary outcomes: Extreme Gradient Boosting (XGBoost); random forest; Naïve Bayes classifier; radial basis function (RBF) support vector machine (SVM); multilayer perceptron (MLP) artificial neural network (ANN) with a single hidden layer, sigmoid activation function, and cross-entropy loss function; and logistic regression. These were chosen because they demonstrate the best performance for predicting surgical outcomes in the literature26,27,28. XGBoost is a gradient-boosted decision-tree-based ensemble model that is highly effective at regression and classification predictive modelling29. Random forest is an ensemble learning method that operates through multiple decision trees30. Naïve Bayes classifiers apply Bayes’ theorem to generate highly accurate predictions in high-dimensional datasets31. SVMs find hyperplanes in the feature space that separate data points to achieve binary classification32. Neural networks resemble biological neurons and consist of input, hidden, and output layers, capable of making meaningful predictions from complex information33. Logistic regression is a traditional statistical method used to model the relationship between independent and dependent variables, assuming a linear relationship between the predictors and the logit of the outcome, as well as a lack of multicollinearity between explanatory variables34. The advantage of newer ML techniques over logistic regression is that they apply more advanced analytics to better model complex, multicollinear relationships between predictors and outcomes35. Nonlinear associations are common in health care data as patient trajectories are often influenced by many clinical, demographic, and systems-level factors36. Logistic regression was therefore used as the baseline comparator to assess relative model performance because it is the most common modelling technique used in traditional risk prediction tools37.

Our data were split into training (70%) and test (30%) sets38. Ten-fold cross-validation and grid search were performed on the training set to find optimal hyperparameters for each ML model39,40. Preliminary analysis of our data demonstrated that the primary outcome was uncommon, occurring in 2349/24,309 (9.7%) of patients in our cohort. To improve class balance, Random Over-Sampling Examples (ROSE) was applied41. ROSE uses smoothed bootstrapping to draw new samples from the feature space neighbourhood around the minority class and is a commonly used method to support predictive modelling of rare events41. The models were then evaluated on test set data and ranked based on the primary discriminatory metric of AUROC. Our best performing model was XGBoost, which had the following optimized hyperparameters for our dataset: number of rounds = 100, maximum tree depth = 6, learning rate = 0.3, gamma = 0, column sample by tree = 1, minimum child weight = 1, subsample = 1. The process for selecting these hyperparameters through grid search and cross-validation is detailed in Supplementary Table 2. Once we identified XGBoost as the best performing ML model for the primary outcome, we trained the algorithm to predict secondary outcomes.
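The splitting and class-balancing steps can be illustrated with a minimal Python sketch. It is not the study code and does not reproduce the ROSE package. The fixed Gaussian noise width (`bandwidth`) is an assumption standing in for ROSE's data-driven kernel, the toy data are randomly generated, and oversampling only the training rows is an assumption about the workflow.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), keeping the event rate similar in both."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return train_idx, test_idx


def smoothed_bootstrap(rows, n_new, bandwidth=0.1, seed=0):
    """Draw n_new synthetic rows: resample a row, then add Gaussian noise."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, bandwidth) for x in rng.choice(rows)]
            for _ in range(n_new)]


# Toy data (made up): 1000 patients, roughly 10% events, two numeric features.
rng = random.Random(42)
y = [1 if rng.random() < 0.1 else 0 for _ in range(1000)]
X = [[rng.gauss(float(label), 1.0), rng.gauss(0.0, 1.0)] for label in y]

train_idx, test_idx = stratified_split(y)
minority = [X[i] for i in train_idx if y[i] == 1]
majority = [X[i] for i in train_idx if y[i] == 0]

# Oversample the minority class until the training classes are balanced.
synthetic = smoothed_bootstrap(minority, len(majority) - len(minority))
X_train = majority + minority + synthetic
y_train = [0] * len(majority) + [1] * (len(minority) + len(synthetic))

print(len(train_idx), len(test_idx), sum(y_train) / len(y_train))
```

In this sketch the test rows are left untouched, so the test set keeps the original event rate.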

Statistical analysis

Baseline demographic and clinical characteristics for patients with vs. without 30-day MALE or death were summarized as means (standard deviation), medians (interquartile range), or number (proportion). Differences in characteristics between outcome groups were assessed using independent t-test for continuous variables or chi-square test for categorical variables. Statistical significance was set at two-tailed p < 0.05.

The primary metric for assessing model performance was AUROC (95% CI), a validated method to assess discriminatory ability that considers both sensitivity and specificity42. Secondary performance metrics were accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). To further assess model performance, we plotted a calibration curve and calculated the Brier score, a measurement of the agreement between predicted and observed event probabilities43. In the final model, feature importance was determined by ranking the top 10 predictors based on the variable importance score (gain), a measure of the relative impact of individual covariates in contributing to an overall prediction44. Feature importance was determined for the overall cohort, CLTI patients, and asymptomatic/claudication groups. To assess model robustness on various populations, we performed subgroup analysis of predictive performance based on age (under vs. over 70 years), sex (male vs. female), race (White vs. non-White), ethnicity (Hispanic vs. Non-Hispanic), symptom status (CLTI vs. asymptomatic/claudication), procedure type (femoropopliteal bypass vs. femoral to tibial/pedal bypass vs. popliteal to tibial/pedal bypass vs. femoral endarterectomy/profundoplasty), and urgency (urgent/emergent vs. elective).
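For readers less familiar with these metrics, the Python sketch below computes them from first principles on a small made-up example. It is illustrative only: the study used R packages, the confidence intervals are not reproduced here, and the 0.5 classification threshold is an assumption.

```python
def auroc(y_true, y_prob):
    """Probability that a random event is scored above a random non-event
    (ties count as one half). Pairwise version, suitable for small examples."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics at an assumed probability threshold."""
    pred = [int(p >= threshold) for p in y_prob]
    pairs = list(zip(y_true, pred))
    tp = pairs.count((1, 1))
    fn = pairs.count((1, 0))
    fp = pairs.count((0, 1))
    tn = pairs.count((0, 0))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Made-up outcomes and predicted probabilities for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(round(auroc(y_true, y_prob), 3))   # 0.933
print(round(brier_score(y_true, y_prob), 3))  # 0.125
print(threshold_metrics(y_true, y_prob))
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on the proportion of events in the sample on which they are calculated.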

Based on a validated sample size calculator for clinical prediction models, to achieve a minimum AUROC of 0.7 with an outcome rate of ~ 10% and 37 input features, the minimum sample size required is 6960 patients with 696 events45,46. Our cohort of 24,309 patients with 2349 primary events meets this sample size requirement. There was less than 5% missing data for variables of interest; therefore, complete-case analysis was applied whereby only non-missing covariates for each patient were considered47. This has been demonstrated to be a valid analytical method for datasets with small amounts of missing data (< 5%) and reflects predictive modelling of real-world data, which inherently includes missing information48,49. All analyses were performed in R (version 4.2.1)50 with the following packages: caret51, xgboost52, ranger53, naivebayes54, e107155, nnet56, and pROC57.

Results

Patients and events

From an initial cohort of 25,318 patients who underwent lower extremity open revascularization in the NSQIP targeted database between 2011 and 2021, we excluded 1,009 patients for the following reasons: treatment for lower extremity aneurysmal disease (n = 669), acute limb ischemia (n = 24), trauma (n = 4), or dissection (n = 2), unreported symptom status (n = 306), and concurrent major amputation (n = 4). Overall, we included 24,309 patients. The primary outcome of 30-day MALE or death occurred in 2349 (9.7%) patients. The 30-day secondary outcomes occurred in the following distribution: untreated loss of patency (n = 457 [1.9%]), major reintervention (n = 1,11 [4.9%]), major amputation (n = 689 [2.8%]), death (n = 547 [2.3%]), MACE (n = 1,346 [5.5%]), MI (n = 771 [3.2%]), stroke (n = 225 [0.9%]), wound complication (n = 3241 [13.3%]), bleeding requiring transfusion or secondary procedure (n = 4041 [16.6%]), other morbidity (n = 1799 [7.4%]; composite of pneumonia (n = 343), unplanned reintubation (n = 380), PE (n = 62), failure to wean from ventilator (n = 223), AKI (n = 123), UTI (n = 328), cardiac arrest (n = 230), DVT (n = 185), sepsis (n = 475), septic shock (n = 190), Clostridium difficile infection (n = 85)), non-home discharge (n = 6954 [28.6%]), and unplanned readmission (n = 3621 [14.9%]).
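The cohort flow can be checked with simple arithmetic. The Python sketch below uses only the counts stated in the text and is provided as a reader's check, not as part of the study code. The resulting event rate matches the 9.7% reported in the Model development section.

```python
initial_cohort = 25_318

# Exclusion counts as stated in the text.
exclusions = {
    "aneurysmal disease": 669,
    "acute limb ischemia": 24,
    "trauma": 4,
    "dissection": 2,
    "unreported symptom status": 306,
    "concurrent major amputation": 4,
}

excluded = sum(exclusions.values())
included = initial_cohort - excluded
primary_events = 2349
event_rate_percent = round(100 * primary_events / included, 1)

print(excluded, included, event_rate_percent)  # 1009 24309 9.7
```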

Pre-operative demographic and clinical characteristics

Compared to patients without a primary outcome, those who developed 30-day MALE or death were older and more likely to be female, Black, Hispanic, and transferred from another hospital, with a greater proportion residing in nursing homes. They were also more likely to have insulin dependent diabetes, CHF, ESRD requiring dialysis, and at least 1 physiologic high-risk factor. Functionally, patients with 30-day MALE or death were more likely to be partially or totally dependent. Despite being at higher cardiovascular risk, patients with 30-day MALE or death were less likely to receive antiplatelets. Notable differences in laboratory investigations included patients with 30-day MALE or death having higher levels of creatinine and BUN. Patients with a primary outcome were more likely to have an ABI ≤ 0.39 and a previous bypass or endovascular intervention involving the currently treated segment, with a greater proportion undergoing a concurrent minor amputation or endovascular infrainguinal revascularization. Patients with 30-day MALE or death were more likely to have CLTI, undergo a bypass to a tibial/pedal target, receive urgent/emergent surgery, and be ASA class 4 or higher (Table 1).

Table 1 Pre-operative demographic and clinical characteristics of patients undergoing lower extremity open revascularization with and without major adverse limb event or death at 30 days.

Model performance

Of the 6 ML models evaluated on test set data for predicting 30-day MALE or death following lower extremity open revascularization, XGBoost had the best performance with an AUROC (95% CI) of 0.93 (0.92–0.94) compared to random forest [0.92 (0.91–0.93)], Naïve Bayes [0.87 (0.86–0.88)], RBF SVM [0.85 (0.84–0.86)], MLP ANN [0.80 (0.78–0.82)], and logistic regression [0.63 (0.61–0.65)]. The other performance metrics of XGBoost were the following: accuracy 0.86 (95% CI 0.85–0.87), sensitivity 0.84, specificity 0.89, PPV 0.90, and NPV 0.83 (Table 2).

Table 2 Model performance on test set data for predicting 30-day major adverse limb event or death following lower extremity open revascularization using pre-operative features.

For 30-day secondary outcomes, XGBoost achieved the following AUROCs (95% CI): untreated loss of patency [0.90 (0.89–0.91)], major reintervention [0.91 (0.89–0.93)], major amputation [0.95 (0.94–0.96)], death [0.96 (0.95–0.96)], MACE [0.93 (0.92–0.94)], MI [0.88 (0.87–0.89)], stroke [0.91 (0.90–0.92)], wound complication [0.90 (0.88–0.92)], bleeding requiring transfusion or secondary procedure [0.92 (0.91–0.93)], other morbidity [0.91 (0.89–0.92)], non-home discharge [0.95 (0.95–0.96)], and unplanned readmission [0.87 (0.86–0.89)] (Table 3).

Table 3 XGBoost performance on test set data for predicting 30-day primary and secondary outcomes following lower extremity open revascularization using pre-operative features.

The ROC curve for prediction of 30-day MALE or death using XGBoost is shown in Fig. 1. Our model achieved good calibration with a Brier score of 0.08, indicating excellent agreement between predicted and observed event probabilities (Fig. 2). The top 10 predictors of 30-day MALE or death in our XGBoost model were the following: (1) symptom status: CLTI, (2) pre-operative dialysis, (3) functional status, (4) pre-operative CHF, (5) pre-operative creatinine, (6) urgency of surgery, (7) procedure type: conduit/target/inflow, (8) physiologic high-risk factor, (9) pre-operative antiplatelet, and (10) anatomic high-risk factor (Fig. 3). On subgroup analysis based on symptom status, 9/10 of the most important predictive features were the same for patients with CLTI and those who were asymptomatic or had claudication, with the two most important predictors being functional status and pre-operative dialysis for both groups (Supplementary Fig. 1).

Figure 1 Receiver operating characteristic curve for predicting 30-day major adverse limb event or death following lower extremity open revascularization using Extreme Gradient Boosting (XGBoost) model. AUROC (area under the receiver operating characteristic curve), CI (confidence interval).

Figure 2 Calibration plot with Brier score for predicting 30-day major adverse limb event or death following lower extremity open revascularization using Extreme Gradient Boosting (XGBoost) model.

Figure 3 Variable importance scores (gain) for the top 10 predictors of 30-day major adverse limb event or death following lower extremity open revascularization in the Extreme Gradient Boosting (XGBoost) model. Abbreviations: CLTI (chronic limb threatening ischemia), CHF (congestive heart failure).

Subgroup analysis

Our XGBoost model performance for predicting 30-day MALE or death remained excellent on subgroup analysis of specific demographic and clinical populations with the following AUROCs (95% CI): age < 70 [0.93 (0.92–0.94)] and age > 70 [0.94 (0.93–0.95)] (Supplementary Fig. 2), males [0.94 (0.93–0.95)] and females [0.93 (0.91–0.94)] (Supplementary Fig. 3), White patients [0.93 (0.92–0.94)] and non-White patients [0.93 (0.92–0.94)] (Supplementary Fig. 4), Hispanic patients [0.93 (0.92–0.94)] and non-Hispanic patients [0.93 (0.92–0.94)] (Supplementary Fig. 5), CLTI patients [0.93 (0.92–0.94)] and asymptomatic/claudication groups [0.94 (0.93–0.95)] (Supplementary Fig. 6), femoropopliteal bypass [0.94 (0.93–0.95)], femoral to tibial/pedal bypass [0.93 (0.92–0.94)], popliteal to tibial/pedal bypass [0.93 (0.91–0.95)], and femoral endarterectomy/profundoplasty [0.93 (0.89–0.96)] (Supplementary Fig. 7), and urgent/emergent surgery [0.94 (0.93–0.95)] and elective surgery [0.93 (0.92–0.94)] (Supplementary Fig. 8).

Discussion

Summary of findings

Using data from the ACS NSQIP targeted vascular files between 2011 and 2021 consisting of 24,309 patients who underwent lower extremity open revascularization for atherosclerotic disease, we developed ML models that accurately predict 30-day MALE or death with an AUROC of 0.93 using pre-operative variables. Furthermore, our algorithms predicted 30-day untreated loss of patency, major reintervention, major amputation, death, MACE, MI, stroke, wound complication, bleeding, other morbidity, non-home discharge, and readmission with AUROCs between 0.87 and 0.96. There were several other key findings. First, patients who develop 30-day MALE or death represent a high-risk population with several predictive features at the pre-operative stage. Specifically, they are older with more comorbidities, have poorer functional status, and are more likely to have high-risk physiologic and anatomic factors. In addition, a greater proportion of patients with 30-day MALE or death had CLTI, underwent tibial/pedal bypasses, and required concurrent minor amputation or endovascular revascularization. Despite these differences, they were less likely to receive optimal medical therapy including antiplatelets. This represents an important opportunity to improve medical management of PAD patients. Second, we trained 6 ML models to predict 30-day MALE or death using pre-operative features and showed that XGBoost achieved the best performance. Our model was well-calibrated, achieving a Brier score of 0.08, and remained robust on subgroup analysis based on age, sex, race, ethnicity, symptom status, procedure type, and urgency of surgery. Finally, we identified the top 10 predictors of 30-day MALE or death in our ML models. These features can be used by clinicians to identify factors that contribute to risk predictions, thereby guiding patient selection and pre-operative optimization. For example, patients with modifiable high-risk factors could be further evaluated and optimized through pre-operative consultations with anesthesiologists or cardiologists to mitigate adverse events58,59. Overall, we have developed a robust ML-based surgical risk prediction tool that can help guide clinical decision-making to improve outcomes and reduce costs from complications, reinterventions, and readmissions associated with lower extremity open revascularization.

Comparison to existing literature

Bertges et al. developed the VSGNE CRI to predict in-hospital major adverse cardiac events in patients undergoing major vascular procedures including lower extremity bypass, carotid endarterectomy, and aortic aneurysm repair14. Using logistic regression, their model achieved an AUROC of 0.71. Applying ML techniques to a more up-to-date cohort specifically consisting of patients undergoing lower extremity open revascularization, we achieved better performance with an AUROC of 0.93.

Bonde et al. trained ML algorithms on a cohort of NSQIP patients undergoing > 2900 unique procedures to predict peri-operative complications18, achieving AUROCs of 0.85–0.88. Given that patients undergoing lower extremity open revascularization for atherosclerotic disease represent a unique population with a high number of vascular comorbidities, the applicability of general surgical risk prediction tools may be limited. By developing ML algorithms specific to patients undergoing lower extremity open revascularization, we achieved AUROCs > 0.90. Additionally, we included limb- and graft-related outcomes such as major amputation, major reintervention, and untreated loss of patency, which are of clinical importance to vascular surgeons. Therefore, there is value in building procedure-specific ML models, which can increase accuracy and clinical applicability.

Prediction models specific to patients undergoing lower extremity revascularization remain limited. Miyata et al. (2021) applied logistic regression to predict 30-day major amputation or death in a cohort of 2906 patients identified through the Japan Critical Limb Ischemia Database10, achieving an AUROC of 0.82. Using a cohort of 24,309 patients in the multi-national NSQIP database, we achieved an AUROC > 0.90 for predicting MALE or death with ML techniques. This demonstrates the benefits of applying advanced analytical techniques to larger and more diverse datasets.

Explanation of findings

There are several explanations for our findings. First, patients who develop MALE or death following lower extremity revascularization represent a high-risk group with multiple vascular risk factors, as corroborated by previous literature60. The use of antiplatelet therapy is a Grade 1A recommendation by multiple societal guidelines for all patients regardless of symptom status (asymptomatic, claudication, or CLTI)8,61,62,63, yet patients who developed MALE or death in our cohort were less likely to receive antiplatelets. The suboptimal rates of best medical therapy for PAD patients are further demonstrated in the recently published BEST-CLI trial6. Therefore, there are important opportunities to improve care for patients by understanding their perioperative risk and medically optimizing them prior to revascularization. Second, our ML models performed better than existing tools for several reasons. Compared to traditional logistic regression, advanced ML techniques can better model non-linear, complex relationships between inputs and outputs64,65. This is especially important in health care data, as patient outcomes can be influenced by many demographic and clinical factors66. Our best performing algorithm was XGBoost, which has unique advantages including avoiding overfitting and faster computing while maintaining precision67,68,69. Furthermore, XGBoost works well with structured data, which may explain why it performed better than more complex algorithms such as neural networks on our dataset70. Third, our model performance remained robust on subgroup analysis of specific demographic and clinical populations. This is an important finding given that algorithmic bias against underrepresented populations is a significant issue in ML studies71. We were likely able to avoid these biases due to the excellent capture of sociodemographic data by ACS NSQIP, a multi-national database that includes diverse patient populations72,73. Fourth, a small proportion of patients in our cohort underwent lower extremity open revascularization for asymptomatic disease (< 2%). The reasons for these interventions are unclear from our dataset but may be related to revisions for hemodynamically significant stenoses of previous revascularization procedures, patient preference, poor adherence to guideline-directed revascularization, or coding errors74.

Implications

Our ML models can be used to guide clinical decision-making in several ways. Pre-operatively, a patient predicted to be at high risk of an adverse outcome should be further assessed in terms of modifiable and non-modifiable factors75. Patients with significant non-modifiable risk of adverse outcomes following open surgical revascularization may benefit from careful considerations of alternative options including medical management alone or less invasive endovascular therapy76,77. Those with modifiable risks should be referred to anesthesiologists, cardiologists, and/or internal medicine specialists for further evaluation58,59. Intra-operatively, risk predictions may inform decisions regarding anesthetic techniques such as neuraxial vs. general anesthesia78. At the post-operative stage, patients at high risk of adverse events may benefit from close monitoring in the intensive care unit79. Additionally, patients at high risk of non-home discharge or readmission should receive early support from allied health professionals to optimize safe discharge planning80. These peri-operative decisions guided by our ML models have the potential to improve outcomes and reduce costs by mitigating adverse events.

The programming code used to develop our ML models is publicly available through GitHub, a web-based platform that offers a free and integrated environment for hosting source code, documentation, and project-related web content for open-source projects81. These tools can be used by any clinician involved in the peri-operative management of patients being considered for lower extremity open revascularization. On a systems-level, our models can be readily implemented by the > 700 centres that currently participate in ACS NSQIP worldwide. They also have potential for use at non-NSQIP sites, as the input features are commonly captured variables for the routine care of vascular surgery patients82. Given the challenges of deploying prediction models into clinical practice, consideration of principles of implementation science is critical83. Our ML models have the advantage of providing automated risk predictions using many input variables, thereby improving practicality in busy clinical settings compared to traditional risk predictors that generally require manual input of variables13. Specifically, our algorithms were built to autonomously extract a patient’s prospectively collected NSQIP data to make risk predictions. Ongoing efforts to link NSQIP data to electronic health records have the potential to increase the clinical utility of our model and further support fully automated risk predictions84,85. We advocate for dedicated health care data analytics teams at the institution level, as their significant benefits have been previously demonstrated and model implementation can be facilitated by these experts86,87. Through this study, we have also provided a framework for the development of robust ML models that predict lower extremity open revascularization outcomes, which can be applied by individual centers for their specific patient populations.

Limitations

Our study has several limitations. First, our models were developed using ACS NSQIP data. Future studies should assess whether performance can be generalized to institutions that do not participate in ACS NSQIP. Second, the ACS NSQIP database captures 30-day outcomes. Evaluation of ML models on data sources with longer follow-up would augment our understanding of long-term surgical risk. Third, our dataset did not capture low-dose rivaroxaban use. Given that the VOYAGER60 and COMPASS88 trials demonstrated the cardiovascular and limb benefits of low-dose rivaroxaban, future prediction models on datasets that capture this variable may improve performance. Fourth, our models are limited to patients undergoing open revascularization. Future prediction tools for outcomes following endovascular therapy would be helpful to further guide clinical decision-making.

Conclusions

In this study, we used the ACS NSQIP database to develop robust ML models that pre-operatively predict 30-day MALE or death following lower extremity open revascularization for atherosclerotic disease with excellent performance (AUROC 0.93). Our models also predicted untreated loss of patency, major reintervention, major amputation, death, MACE, MI, stroke, wound complication, bleeding, other morbidity, non-home discharge, and readmission with AUROCs between 0.87 and 0.96. Given that our ML algorithms perform better than existing tools and logistic regression, they have potential for important utility in the peri-operative management of patients being considered for lower extremity open revascularization to mitigate adverse outcomes and reduce health care costs.