Introduction

Nonalcoholic fatty liver disease (NAFLD) is an acquired metabolic disease characterised by fatty deposits in the liver1. The prevalence of NAFLD is increasing with the standard of living of the population2, and it is gradually becoming the most common chronic liver disease worldwide, posing an increasing economic and medical burden3,4. NAFLD prevalence is higher in China than in some developed countries because of the variability of disease prevalence and lifestyle changes. A projection of the future burden of NAFLD shows China is expected to have 314 million patients with NAFLD by 20305. Some studies have shown that NAFLD patients have a higher risk of cardiovascular diseases, such as hypertension, coronary heart disease, cardiomyopathy, and cardiac arrhythmias, than the general population6,7.

Hypertension is a common cardiovascular condition responsible for 9.4 million deaths worldwide every year8. Its clinical diagnostic criteria are a systolic blood pressure (SBP) ≥ 140 mmHg or diastolic blood pressure (DBP) ≥ 90 mm Hg9. China ranks first worldwide for deaths due to Hypertension, and the proportion elderly people suffering from Hypertension in China is > 60%10,11. Recent studies have found that the risk of hypertension is notably higher in patients with NAFLD than in other populations12,13,14, and there is a strong association between the two. An observational study showed that NAFLD was associated with a 1.6-fold risk of developing hypertension15. The prevalence of NAFLD combined with hypertension is approximately 39.34%2,16,17, and the quality of life of patients with NAFLD-Hypertension is lower than for those with either condition alone.

Clinical prediction models (CPMs) are used as a quantitative tool for assessing the risks and benefits of particular clinical scenarios. They can provide medical practitioners with scientifically valid data and their application is becoming increasingly common18. However, most CPMs are established in developed countries, with only a few being applied in developing and underdeveloped countries. Through a literature review, we found few reports on the establishment of the CPMs for Hypertension in China, and reports differed significantly between regions8,19,20,21,22. Therefore, it would be of great practical value to establish CPMs for different regions and populations in China and propose health management strategies based on the research findings.

The Hypertension risk-prediction model for elderly patients with NAFLD developed in this study could, to some extent, fill the research gap in terms of regions and populations. Using multiple indicators obtained from questionnaires, physical examinations and biochemical tests, we can predict the risk of concurrent Hypertension with NAFLD in elderly patients. This can therefore help medical practitioners to identify and select high-risk individuals for non-medical health interventions at an early stage to delay or prevent NAFLD patients from developing Hypertension.

Results

Basic statistical description

The basic demographic information and clinical examination results of the study population were collected primarily through questionnaires, physical examinations, and biochemical tests. After processing, the final 24 basic variables in the analysis were: age, SBP, DBP, body mass index (BMI), waist-to-hip ratio (WHR), and the concentrations in each sample of albumin (ALB), alanine aminotransferase (ALT), aspartate aminotransferase (AST), red blood cells (RBC), UREA, glucose (GLU), haemoglobin (HGB), platelets (PLT), total cholesterol (TC), total bilirubin (TB), creatinine (CRE), low-density lipoprotein (LDL) triglycerides (TG), uric acid (UA), alpha-fetoprotein (AFP), basophils (BASO), eosinophils (EOS), lymphocytes (LYMPH), and neutrophils (NEUT).

The data analyses of the 5287 samples collected in this study in 2017 are shown in Table 1, illustrating the basic statistical description of the sample data for NAFLD and non-NAFLD groups. The NAFLD group is used as an internal training set. The 24 variables used in the analysis do not meet the normal distribution assumption due to the large sample size. The mean ± standard deviation was used to describe the results, and the chi-squared test and t-test were used to analyse whether the differences between the groups were statistically significant.

Table 1 Differences in demographic and clinical characteristics between the No-NAFLD and NAFLD groups in training set. [mean ± SD or N (%)].

Screening results of characteristic variables

In the training set, nine characteristic variables out of 24 basic variables were screened for using the Lasso model and intersected with 18 characteristic variables screened for using RF to obtain the following common variables: age, SBP, BMI, ALT, UREA, TC, LDL, UA, and NEUT (Fig. 1). These nine common variables were included in the multivariate logistic regression analysis and as a result of which the variables of age, SBP, BMI, ALT, UREA, UA, and NEUT were maintained as the final selection after excluding the variables with p > 0.05 (Fig. 2, Table 2).

Figure 1
figure 1

Variable screening using the Lasso regression model. (A) Nine variables with non-zero coefficients were selected by deriving the optimal lambda values. (B) Coefficient profiles are generated according to optimal parameters (lambda) in (A). (C) Resulting graph of variable selection using the random forest model method. (D) Schematic diagram of the method for selection of the final characteristic variables in this study.

Figure 2
figure 2

Forest plot of the selected elements. A forest plot is used to visualise the results of the logistic regression analysis.

Table 2 Predictive factors for hypertension incidence risk in NAFLD patients.

Model construction and prediction results

Using the seven indicators selected above as independent variables and "whether NAFLD patients have concurrent Hypertension" as the dependent variable, a model was constructed using logistic regression to obtain a prediction model for the risk of concurrent Hypertension in elderly NAFLD patients:

$$\begin{aligned} y_{{{\text{model}}}} = & - 9.534 + 0.024 \cdot {\text{Age}} + 0.025 \cdot {\text{SBP}} + 0.109 \cdot {\text{BMI}} \\ & + 0.010 \cdot {\text{ALT}} + 0.077 \cdot {\text{UREA}} + 0.002 \cdot {\text{UA}} + 0.120 \cdot {\text{NEUT}} \\ \end{aligned}$$

The Hypertension risk in patients with NAFLD can be simply and visually predicted based on the nomogram drawn from this model (Fig. 3). For example, suppose a patient with NAFLD had the following characteristics: age 65 years, BMI 25.5 kg/m2, SBP 191 mmHg, ALT 106 U/L, UREA 6.2 mmol/L, UA 465.9 μmol/L, and NEUT 4.2109/L. This produces a score of 322, corresponding to a probability of 0.915, indicating that the risk of concurrent Hypertension in this NAFLD patient is 91.5%. In addition, we created an online app to predict the probability of concurrent Hypertension in elderly patients with NAFLD based on the model established in this study. The website is https://studentluo.shinyapps.io/DynNomapp/. This app can help clinicians diagnose Hypertension and community medical practitioners design reasonable and effective primary and secondary prevention management programs for Hypertension with NAFLD in elderly patients.

Figure 3
figure 3

The risk nomogram of Hypertension in patients with NAFLD. (A) Nomogram including age, SBP, BMI, ALT, UREA, TC, LDL, UA, and NEUT for analysis. (B) One patient with NAFLD was randomly selected, whose Hypertension incidence was predicted based on the seven characteristic indicators of the nomogram.

Model evaluation and comparison

To evaluate the discrimination, calibration, and clinical applicability of the risk nomogram established in this study, the model was validated. The AUC of the internal validation set was 0.707 (95% CI: 0.688–0.727) and the AUC of the external validation set was 0.688 (95% CI: 0.672–0.705) (Fig. 4). The calibration curves showed moderate agreement. The results of the decision curves indicated that, selective patient treatment based on the results nomogram results in better patient outcomes than implementing an intervention-in-all-patients scheme, in cases where the risk threshold of Hypertension in NAFLD patients was 30–56% (Fig. 5). In addition, we analysed each variable in the model and plotted the ROC curve; the results are shown in Fig. 6, Table 3.

Figure 4
figure 4

The area under the ROC curve is used to determine the predictive ability of the prediction model to distinguish outcomes, such as with or without disease. X-axis represents the false positive rate predicted by the model; Y-axis represents the true positive rate predicted by the model; the blue curve represents the performance of the nomogram. (A) The performance of the model in the training set. (B) The performance of the model in the validation set.

Figure 5
figure 5

Analyses of the calibration and decision curves of the HBP risk nomogram. (A) The X-axis of the curve represents the predicted risk of concurrent Hypertension in elderly patients with NAFLD; Y-axis represents the actual diagnosed morbidity; the diagonal line represents the perfect prediction of the ideal model, and the higher overlap of the dark dashed line with the diagonal line represents the better accuracy of the model. (B) Decision curve analysis of the Hypertension risk nomogram. The solid line parallel to the horizontal axis in the graph indicates that when all samples are negative, the net benefit of having no intervention for all is 0. Backlash shows that all samples are positive and have received intervention, and the net benefit is the magnitude of the slope. The red line represents the net benefit of the model constructed in this study. The further the red line is from the two intersecting lines, the higher the clinical application value of the model.

Figure 6
figure 6

ROC curve analysis of the seven risk factors.

Table 3 The optimal cut-off value, specificity, and sensitivity of risk factors.

Using only SBP and DBP (common indicators of Hypertension) model I was established in the NAFLD population (AUC was 0.660, 95% CI: 0.636–0.685). Model I was compared with the model constructed in this study using NRI; this produced a value of 0.109, indicating that the model constructed in this study had better predictive ability than model I (Fig. 7).

Figure 7
figure 7

Model comparison based on NRI. The NRI value was 0.109.

Discussion

Based on the prediction model, we established that age, SBP, BMI, ALT, UREA, UA, and NEUT, are the risk factors for Hypertension development in patients with NAFLD. Among previous studies on predictive models for the risk of developing Hypertension in China, a longitudinal study by Zhang et al. analysed 26,496 Hypertension patients; they extracted 5 risk factors from 11 examined biomarkers to build a Hypertension Synthetic Predictor (HSP) for Hypertension, including Blood Pressure Factors (BPF) determined by SBP and DBP23. In a prospective study by Chen et al. that included 2,785 Hypertension patients, the variables of the established risk-prediction model were age, BMI, SBP, DBP, and fasting blood glucose (FBG)24. In addition, in a review study, 26 articles reporting on 48 Hypertension prediction models were analysed (seven domestic studies; others from Europe, the United States, Korea, Japan, and Iran). The most common variables in the Hypertension prediction models included age, BMI, and blood pressure level, among others19. Most studies on Hypertension prediction models in China and abroad included the predictive variables consistent with the results of this study.

Age is an irreversible risk factor in Hypertension development. Some studies have shown that Hypertension prevalence increases with age, and there is a strong correlation between biological ageing and elevated blood pressure9,25,26,27. SBP and DBP are common diagnostic indicators of Hypertension. This study also found that SBP was one of the main predictors of concurrent Hypertension and NAFLD, and that there is a correlation between BMI and Hypertension28,29. The results of a Hypertension prediction model study based on 135,715 individuals showed that a lower BMI reduced the risk of developing Hypertension20. Among patients with Hypertension in the United States, 49.5% are obese (BMI > 28)30. Moreover, there is strong evidence that obesity (including visceral obesity) is the most common risk factor for NAFLD3,31. The World Health Organization's BMI classification criteria are not suitable for application among Asians, and therefore we followed the classification system used in a previously published report, which presented the BMI classification for a Chinese population as follows: underweight (< 18.5), normal (18.5–22.9), overweight (23.0–27.5), and obese (> 27.5)32. According to this definition, the obesity rate of subjects with NAFLD and Hypertension in the present study was 73.8%.

Studies have shown that blood pressure is an important determinant of plasma UREA concentration, with a positive correlation between blood pressure and UREA concentration. In patients with untreated Hypertension, elevated blood pressure may be caused by renal insufficiency33. UREA abnormalities are also closely associated with kidney disease development34,35, suggesting that Hypertension is a cause or consequence of kidney disease36.

UA is associated with chronic kidney disease and cardiovascular disease. It can be a predictor of developing Hypertension in adults37,38 and is positively correlated with blood pressure. An increase in UA implies an increased risk of cardiovascular disease39,40. In a cross-sectional study of 33,570 patients with Hypertension by Ren et al., the established risk-prediction model included variables such as gender, age, height, weight, and UA20. Meanwhile, Rosa et al. proposed that UA is an important NAFLD predictor variable41.

The results of previous studies suggest that the development of cardiovascular diseases, including Hypertension, is closely associated with white blood cells (WBCs), a marker of inflammation40,42,43. An abnormal rise in WBC count may predict an increased risk of developing Hypertension. NEUT is a major component of WBCs and a correlation between NEUT and Hypertension has been noted in some studies, with higher NEUT values in patients with Hypertension44. In addition, some studies reported higher NEUT values in obese than in non-obese individuals45,46.

ALT is present in the liver and is the main marker of liver injury47. The population in our study was mainly composed of patients with NAFLD, so the predictor ALT is included. A few studies have explored the association between ALT and Hypertension development, with equally few studies demonstrating an association between ALT and Hypertension48,49,50. One of the cross-sectional studies conducted in Chinese adults suggested a positive linear correlation between ALT and Hypertension development.

Elevated blood pressure is strongly associated with increased dietary sodium intake51,52,53. In an article published in 2016, researchers conducted a dietary study across 20 Chinese provinces, which revealed that the per capita salt intake was 9.1 g/day and sodium intake was 5.4 g/day, both exceeding the recommended maximum daily salt intake (6 g/day) and sodium intake (2 g/day)54. Meanwhile, excessive intake of fats, sugars, and meat in the diet coupled with reduced physical activity are the main causes of increased BMI in the population11. A high-protein diet leads to excessive amino acids, which are oxidised into energy, in turn increasing UREA synthesis in the body55. To date, the basic treatment for excessive UA is dietary control and a study by Roumeliotis et al. suggested that increased intake of vitamins, such as vitamin C and E, can reduce UA levels37. The Dietary Approaches to Stop Hypertension program (DASH), a healthy dietary model proposed by the United States in 1997, emphasises that adequate intake of vegetables, fruits, and low-fat (or skimmed) milk is essential, while excessive consumption of animal fats should be avoided. Some studies have confirmed the effectiveness of this approach11,56,57. Based on the results of the present study, a low sodium, low sugar, high vitamin C and E, and moderate polyunsaturated fatty acid (PUFA) and protein diet should be selected for individuals with NAFLD and high Hypertension risk. An increase in moderate physical exercise is also recommended.

To date, interventions such as dietary control, physical exercise, and weight loss for patients with NAFLD remain some of the main basic therapies to delay or prevent the progression of the disease58,59. In addition, lifestyle changes play an important role in controlling blood pressure and preventing cardiovascular disease60,61,62 and are effective in controlling blood pressure in Hypertension patients and populations at high-risk of Hypertension. In particular, interventions such as a rational diet, weight control, smoking and alcohol consumption cessation, and stress reduction can be adopted in the early stages of Hypertension for high-risk groups27,63,64. Considering the commonalities between NAFLD and Hypertension management, our study primarily analysed the correlation between risk factors and certain dietary habits. We aimed to provide a reference for health management based on dietary regulation to help physicians formulate individualised dietary interventions for patients with NAFLD at high risk of developing Hypertension. We also aimed to raise patients' awareness of healthy dietary management through health education, adopt interactive follow-ups for timely access and provide feedback on the implementation of individualised dietary management and the real-time dynamics of key risk indicators.

Conclusion

This study used the Lasso model based on tenfold cross-validation and an RF model to identify risk factors (age, SBP, BMI, ALT, UREA, UA, and NEUT) associated with NAFLD combined with Hypertension in the elderly population. Logistic regression models were used to obtain a simple, effective, and economical nomogram to predict the risk of comorbid Hypertension in NAFLD patients over the age of 60 in Shanghai, China.

Based on our results, for NAFLD patients with hypertension risk, we suggest that health service providers should strengthen the monitoring of seven risk indicators: age, SBP, BMI, alt, urea, UA, and NEUT. Health service providers can formulate and improve individualised intervention measures and risk monitoring systems to increase patients' quality of life and reduce the economic and medical burden imposed by NAFLD and Hypertension on the elderly, their families, and society.

This study has potential limitations. First, this is a retrospective study, and some clinical indicators with more vacancy values were eliminated, which may lead to unavoidable selection deviation. Such as some potential clinical indicators were not included. Second, this risk-prediction model has been established and validated in the Chinese Han population in Shanghai, but it lacks further study in other populations in other regions and countries. The model needs to be externally evaluated in a wider elderly population and needs further prospective research. The results of this study can therefore only be applied to the Chinese Han population; whether this model is applicable to the populations of other countries needs further verification. Third, our data cannot confirm whether individual intervention based on the predicted results of the model can reduce the risk of hypertension and improve the living standard of NAFLD patients; further large-scale and long-term follow-up studies are required to investigate this.

Materials and methods

Sample collection

The sample data for this retrospective study were obtained from the Zhangjiang, Pudong New Area, Shanghai. We relied on the Shanghai Collaborative Innovation Center of Traditional Chinese Medicine Health Service of the Shanghai University of Traditional Chinese Medicine and the Zhangjiang and Beicai Community Health Service Centers, Pudong New Area, Shanghai, to collect the relevant information. This information was regarding the participants belonging to the Han population who participated in the chronic disease health screening programs conducted at the two community health centers from April 2016 to July 2017 using questionnaires, physical examinations, and biochemical tests. This study was conducted in accordance with the basic principles of the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of the Shanghai University of Traditional Chinese Medicine. Informed consent was obtained from the study subjects and they signed an informed consent form (Fig. 8).

Figure 8
figure 8

Flow diagram of the study design.

The research group collected 6,664 participants in 2017, who were matched for age and sex, randomised, and did not have blood relationship among the study subjects. After data processing, study subjects with missing information, incomplete indicators, and those over 60 years of age were excluded. The 5287 participants with who matched all criteria were divided into two groups: an NAFLD group (N = 2845) and a non-NAFLD group (N = 2442). The NAFLD group was the internal training set, and the external validation set comprised 2031 cases, which were obtained by pre-processing the data collected by the research group in 2016. The clinical characteristic indicators of the external validation set are consistent with those of the internal training set (see Supplementary Table S2 online).

The blood pressure of the subjects was measured using the steps recommended by international guidelines, including a quiet resting period of at least 5 min before measurement. An electronic sphygmomanometer (Biospace, Cheonan, Korea) was used to record each subject's SBP and DBP (mmHg).

The clinical diagnostic criteria in the "Guidelines for the diagnosis and management of nonalcoholic fatty liver disease (2010 Revision)" developed by the Chinese Society of Hepatology, CMA65, were used to determine the diagnostic criteria for NAFLD. The following three conditions had to be met:

  1. (1)

    No history of alcohol consumption. The quantity of ethanol consumption should be < 140 g/week for men and < 70 g/week for women.

  2. (2)

    Exclusion of other causes that could lead to hepatic steatosis, including viral hepatitis, drug-induced liver disease, autoimmune liver disease, hepatolenticular degeneration, and total parenteral nutrition.

  3. (3)

    Imaging or histological findings meet the diagnostic criteria of fatty liver disease.

Ultrasonography is used to identify hepatic steatosis. Ultrasonography evaluation of hepatic steatosis typically consists of a qualitative visual assessment of hepatic echogenicity, measurements of the difference between the liver and kidneys in echo amplitude, evaluation of echo penetration into the deep portion of the liver, and determination of the clarity of blood vessel structures in the liver. According to these imaging characteristics, liver steatosis is usually graded as mild (increased echogenicity but normal visualization of vessels and diaphragm), moderate (poor visualization of the intrahepatic vessels) and severe (diaphragm and deep parenchyma are not seen).

Statistical and modelling methods

The analysis software used in this study were IBM SPSS Statistics (version: 25.0), Venny (version: 2.1.0), and R software (version: 4.1.0, https://www.R-project.org/), and the R packages were caret (version:6.0-88), DynNom (version:5.0.1), haven (version:2.4.3), readxl (version:1.3.1), rms (version:6.2-0), rmda (version:1.6), regplot (version:1.1), rsconnect (version:0.8.24), glmnet (version:4.1-2), nricens (version:1.6), pROC (version:1.17.0.1), and forestplot (version:2.1.0). The statistical tests used were all two-sided tests with test level α = 0.05.

The Lasso model based on tenfold cross-validation and RF model were used to screen the risk factors of Hypertension in patients with NAFLD. The Lasso model is suitable for variable screening of high-dimensional data, where the variables in the analysis are first pooled and normalised to reduce the regression coefficients of the variables toward zero, and the predictor variables with non-zero coefficients at the best lambda value are screened with the smallest possible prediction error39,66. The RF model is based on a recursive feature elimination approach that uses significant variables in the hierarchical ranking to train the random forest and eliminate irrelevant variables, followed by the repeated training and elimination processes until there is no variable change67.

In this study, the outcome variable was whether Hypertension occurred under NAFLD conditions. As it is a dichotomous variable, logistic regression analysis was used to construct a prediction model containing the final characteristic variables18. The Lasso model was intersected with the predictor variables screened using RF, and the identified common variables were included in the multivariate logistic regression analysis. The p > 0.05 predictor variables were gradually eliminated to obtain the final characteristic variables of the risk-prediction model, and the results were visualised using forest plots.

A nomogram can simplify and visualise complex regression equations, integrate data with a model, comprehensively consider the influential variables in the analysis, and graphically display the probability of the predicted outcome68. Therefore, we used the rms package in R software to plot the nomogram of the logistic regression model constructed in this study. The length of the line corresponding to each analysed variable in the plot indicated the degree of influence of that variable on the predicted outcome69.

Model evaluation and comparison

We used the area under the ROC curve (AUC) to evaluate the discriminatory ability of the model in internal and external validation sets70. This value is generally between 0.5 and 1, and the closer to 1, the better the discriminatory ability of the model and the higher the predictive performance21. Appropriate calibration analysis is required when applying the prediction model constructed in this study to clinical decision-making71. To achieve this, we used the calibration curve to determine the degree of agreement between the actual status and the predicted results of the model. Because there is a possibility of false positives or false negatives in the prediction results of this model, we used DCA to determine the net benefit of the prediction model in clinical applications18. The NRI proposed in 2008 to evaluate the improvement of a new model in risk-prediction ability72 was applied in this study to compare the extent to which the newly constructed prediction model outperformed the prediction model constructed based on the traditional variables used in Hypertension diagnosis.

The flow chart of this study design is shown in Fig. 8.