Introduction

Atrial fibrillation (AF) is a common arrhythmia associated with serious conditions, including stroke and heart failure, that can affect patients’ lives. Anticoagulation is important for patients at risk of stroke due to AF1; it lowers the risk of stroke in patients with AF and could prevent up to two thirds of strokes2,3. Additionally, rhythm control, including catheter ablation for early AF, reportedly improves life outcomes4. Detecting AF is challenging, and AF is often detected only after a stroke or other serious event5. Many patients with AF have both asymptomatic and symptomatic episodes, and up to 40% of patients are asymptomatic. Asymptomatic AF carries a risk of stroke and mortality similar to that of symptomatic AF6. Predicting the risk of developing AF in a healthy population, before serious related complications occur, is therefore critical but challenging.

Risk models based on clinical risk factors for AF have been developed, such as the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE-AF) model and the Suita study7,8,9. The risk of developing AF is affected by patients’ genetic predispositions and clinical risk factors10. A risk score combining the CHARGE-AF score, which is based on clinical risk factors for AF, with genetic predisposition data has also been reported11. Individuals with a high genetic risk score were considered more likely to develop AF than those with a low score, even if they had low CHARGE-AF scores. Moreover, a study has predicted the risk of developing AF by applying deep learning, a form of artificial intelligence (AI), to electrocardiograms (ECGs)12. AI-based prediction of AF is useful, but the cost of using AI and the black-box nature of its decision criteria remain problems worldwide13. In this study, we investigated whether a simple risk model using 12-lead ECG could predict the risk of developing AF.

Methods

Study population

The flowchart of this study is shown in Fig. 1. Individuals were included if they underwent at least two annual physical examinations with available data at baseline and after 5 years; they were excluded if they lacked basic information (height, weight, history of smoking, history of alcohol intake, or underlying diseases) or ECG data. Because the present study aimed to examine the risk of AF in a general healthy population with no obvious underlying disease, participants with thyroid disease or underlying heart disease with a definite risk of AF were also excluded. We retrospectively enrolled 129,204 consecutive participants, aged 30–69 years, who underwent at least two annual physical examinations with available data at baseline and after 5 years (median, 5.0 years; 1st quartile, 4.4 years; 3rd quartile, 5.1 years) at the Kagoshima Kouseiren Hospital between January 1979 and December 2016. After excluding 37,459 participants who lacked sufficient data and 2838 with underlying diseases (846, 2010, and 18 with underlying heart diseases [e.g. AF, valvular disease, cardiomyopathy, congenital heart disease, and ischemic heart disease], thyroid diseases, and coexisting heart and thyroid diseases, respectively), a total of 88,907 participants were analysed. Of the participants who lacked sufficient data, 19,708 had no basic information (height, weight, history of smoking, history of alcohol intake, or underlying diseases), and 17,751 had no ECG data. The included participants were randomly assigned to derivation and validation cohorts at a ratio of 1:1. This study was approved by the institutional ethics committee of Kagoshima University Graduate School of Medical and Dental Sciences (approval no. 170130 [520], 4th August 2017). The need for informed consent was waived by the ethics committee because only existing anonymized data were used in this study. The study complied with the principles of the Declaration of Helsinki.
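The participant counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in the text; the variable names are illustrative, and it assumes that the 18 participants with coexisting heart and thyroid diseases are already counted within both the 846 and the 2010, which is the reading under which the subtotals add up.

```python
# Participant-flow arithmetic using the counts reported in the text.
enrolled = 129_204
no_basic_info = 19_708
no_ecg = 17_751
heart_disease = 846
thyroid_disease = 2_010
# Assumption: these 18 are included in both disease counts above,
# so they are subtracted once to avoid double counting.
both_diseases = 18

insufficient_data = no_basic_info + no_ecg
underlying_disease = heart_disease + thyroid_disease - both_diseases
included = enrolled - insufficient_data - underlying_disease

print(insufficient_data)   # 37459
print(underlying_disease)  # 2838
print(included)            # 88907
```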

Figure 1

Flowchart of the study population. We enrolled 129,204 participants who underwent physical examinations and excluded 37,459 who lacked sufficient data and 2838 with underlying diseases. This study included 88,907 participants, aged 30–69 years, who were randomly assigned to derivation and validation cohorts at a ratio of 1:1.

Data collection

We obtained data on age, sex, height, and weight. Based on a median age of 52 years, participants were divided into two groups: aged ≤ 51 years and ≥ 52 years. Height and weight were measured using standard anthropometric methods. Body mass index (BMI) was calculated for each participant as weight (kg) divided by height squared (m²), and participants were categorised into groups with BMIs of ≤ 24.9 kg/m² and ≥ 25.0 kg/m²; obesity was defined as a BMI of ≥ 25.0 kg/m², based on the Japanese definition14. Information on the history of smoking (currently smoking or history of smoking), alcohol intake (drinking > 10 days per month), and underlying diseases (underlying cardiac and thyroid diseases) was obtained by questioning each participant at the time of the physical examination. These data were collected from each participant’s medical record. A 12-lead surface ECG was performed for each participant during routine annual physical examination. We used automatic measurements and discriminations of 12-lead ECGs. Heart rate (HR), QRS duration (QRSd), axis range, QT interval (QTi), SV1, and RV5 were obtained by automatic ECG measurement. PR prolongation (PR > 200 ms), premature atrial contraction (PAC), and premature ventricular complex (PVC) were obtained by automatic ECG discrimination15,16. Left axis deviation (LAD) and right axis deviation (RAD) were defined as − 30° to − 90° and + 90° to + 180°, respectively17. The QT corrected for HR (QTc) was calculated using Bazett’s correction formula (QTc = QT/√RR). Participants were also categorised based on QRSd, ≤ 119 ms and ≥ 120 ms; QTc, ≤ 439 ms and ≥ 440 ms; and SV1 + RV5, ≤ 3.4 mV and ≥ 3.5 mV. QRS prolongation18, QTc prolongation19, and left ventricular hypertrophy (LVH)20 were defined as a QRSd of ≥ 120 ms, QTc of ≥ 440 ms, and SV1 + RV5 of ≥ 3.5 mV, respectively.
New-onset AF was defined as diagnosis of AF using an ECG 5 years after baseline measurements and new cases of AF diagnosed at any point during the 5-year period. The new cases of AF were based on ECG during annual physical examinations (mean number of ECGs during follow-up per participant: 3.7 ± 1.1) or based on the new AF history obtained by questioning each participant at the annual physical examinations. Secondary AF such as that caused by surgery or trauma might be included in participants who had the new AF history obtained by questioning.
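The derived measures and thresholds described in this section can be sketched in a few lines. This is an illustrative sketch rather than the authors' processing pipeline: the function and key names are hypothetical, the RR interval is assumed to be derived from the heart rate and expressed in seconds (the usual convention for Bazett's formula), and the axis boundaries are assumed to be inclusive.

```python
import math

def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2

def qtc_bazett(qt_ms: float, heart_rate_bpm: float) -> float:
    """Bazett's correction, QTc = QT / sqrt(RR).

    Assumption: RR is taken as 60 / heart rate, in seconds.
    """
    rr_s = 60.0 / heart_rate_bpm
    return qt_ms / math.sqrt(rr_s)

def axis_deviation(axis_deg: float) -> str:
    """Classify the QRS axis; inclusive boundaries are an assumption."""
    if -90 <= axis_deg <= -30:
        return "LAD"
    if 90 <= axis_deg <= 180:
        return "RAD"
    return "none"

def categorise(age_years, weight_kg, height_m, pr_ms, qrs_ms,
               qt_ms, heart_rate_bpm, sv1_mv, rv5_mv, axis_deg):
    """Apply the dichotomous definitions given in the text."""
    return {
        "age_ge_52": age_years >= 52,
        "obesity": bmi(weight_kg, height_m) >= 25.0,
        "pr_prolongation": pr_ms > 200,
        "qrs_prolongation": qrs_ms >= 120,
        "qtc_prolongation": qtc_bazett(qt_ms, heart_rate_bpm) >= 440,
        "lvh": sv1_mv + rv5_mv >= 3.5,
        "axis": axis_deviation(axis_deg),
    }

# Hypothetical participant, for illustration only.
example = categorise(60, 75, 1.70, 210, 100, 400, 75, 1.5, 2.2, -45)
print(example)
```

Keeping the thresholds in one function makes it easy to see that each continuous measurement is reduced to a single yes/no finding before any modelling.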

Statistical analysis

Continuous variables (age, BMI, HR, QRSd, QTi, QTc, SV1, RV5, and SV1 + RV5) were presented as means ± standard deviations. Categorical variables (sex, obesity, smoking, alcohol intake, PR prolongation, QRS prolongation, QTc prolongation, LVH, PAC, PVC, RAD, and LAD) were presented as proportions (percentages). Differences between the two groups (between the derivation and validation cohorts or between the new-onset AF and sinus rhythm [SR] groups) for continuous and categorical variables were analysed using the Student’s unpaired t-test and the χ² test, respectively. Univariate and multivariate logistic regression analyses were applied to calculate the odds ratio (OR) and 95% confidence interval for AF incidence. Factors that were significant in the univariate analysis were entered into the multivariate analysis as dichotomous variables, such as LVH or QTc prolongation, rather than as continuous variables, such as RV5, SV1 + RV5, QT interval, or QTc interval. Obesity was analysed using the Japanese definition of obesity14, a BMI of ≥ 25.0 kg/m². To create a risk score that predicts the 5-year incidence of AF, points were assigned to each risk factor that was significant in the multivariate analysis according to its standardised beta coefficient, based on the methodology used by the Japan Epidemiology Collaboration on Occupational Health Study Group21: 1 point, β = 0.01–0.20; 2 points, β = 0.21–0.80; 3 points, β = 0.81–1.20; 4 points, β = 1.21–2.20; 5 points, β > 2.20. The discriminative performance of the score was assessed using the area under the curve (AUC) from the receiver operating characteristic (ROC) analysis. Cochran-Armitage trend tests were performed to examine whether the incidence of AF showed a consistent trend toward higher values with an increasing risk score. We evaluated the calibration using calibration plots.
All data analyses were performed using the JMP Pro version 17 software (SAS Institute Inc, Cary, NC, USA. https://www.jmp.com/en_us/software/predictive-analytics-software.html). A p value of < 0.05 was considered statistically significant.
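The point-assignment rule described above can be written as a small lookup. The following is a minimal sketch, not the authors' code: the function name is hypothetical, coefficients are assumed to be rounded to two decimals before binning (the bins are stated to two decimals), and coefficients below 0.01 are rejected because the text does not define a category for them.

```python
def points_from_beta(beta: float) -> int:
    """Map a standardised beta coefficient to score points (1 to 5)."""
    b = round(beta, 2)  # assumption: bins apply to two-decimal values
    if b < 0.01:
        # The text defines no category below 0.01.
        raise ValueError("no point category defined for beta < 0.01")
    if b <= 0.20:
        return 1
    if b <= 0.80:
        return 2
    if b <= 1.20:
        return 3
    if b <= 2.20:
        return 4
    return 5

# Coefficients reported in the Results for age >= 52 years, male sex,
# PR prolongation, QTc prolongation, LVH, PAC, and LAD.
betas = [1.34, 1.18, 0.96, 0.67, 0.49, 1.75, 0.90]
print([points_from_beta(b) for b in betas])  # [4, 3, 3, 2, 2, 4, 3]
```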

Results

Baseline characteristics

The baseline characteristics of the study population in the derivation and validation cohorts are shown in Table 1. No significant differences were observed between the derivation and validation cohorts for any of the factors. In the whole study population, the mean age was 51 ± 10 years, with 40,216 males (45.2%). The mean BMI was 23.2 ± 3.2 kg/m², and the percentages of participants with obesity, those with a history of smoking, and those with a history of alcohol intake were 26.0%, 34.9%, and 53.2%, respectively. Concerning the ECG characteristics, the mean HR was 65.6 ± 10.4 bpm; moreover, PR and QRS prolongations were observed in 819 and 3504 participants (0.9% and 3.9%), respectively. The mean QTc interval and SV1 + RV5 were 402.8 ± 23.1 ms and 2.61 ± 0.79 mV, respectively. Additionally, QTc prolongation, LVH, PAC, and LAD were observed in 5141, 11,538, 856, and 1086 participants (5.8%, 13.0%, 1.0%, and 1.2%), respectively.

Table 1 Baseline characteristics.

Incidence of AF

Among the included participants, new-onset AF was observed in 152 (0.3%) during the 5-year period. Comparisons between participants with new-onset AF (the AF group) and those with SR (the SR group) are shown in Table 2. Age and BMI were higher in the AF group than those in the SR group (age, 59 ± 8 years vs. 51 ± 10 years, p < 0.001; BMI, 23.8 ± 3.4 kg/m² vs. 23.2 ± 3.1 kg/m², p = 0.016). The percentages of males, smoking, and alcohol intake were higher in the AF group than those in the SR group (males, 73.0% vs. 44.9%, p < 0.001; smoking, 52.0% vs. 35.6%, p < 0.001; alcohol intake, 65.1% vs. 53.2%, p = 0.003). No significant difference was observed in the percentage of obesity between the two groups. Concerning ECG characteristics, no significant difference in HR was found between the two groups. The percentages of PR, QRS, and QTc prolongations were higher in the AF group than those in the SR group (PR prolongation, 4.0% vs. 0.9%, p < 0.001; QRS prolongation, 7.2% vs. 3.9%, p = 0.031; QTc prolongation, 10.5% vs. 5.9%, p = 0.014). The QTc interval and SV1 + RV5 were higher in the AF group than those in the SR group (QTc interval, 407.2 ± 27.5 ms vs. 402.9 ± 23.2 ms, p = 0.021; SV1 + RV5, 3.00 ± 1.00 mV vs. 2.61 ± 0.79 mV, p < 0.001). The percentages of LVH, PAC, PVC, and LAD were higher in the AF group than those in the SR group (LVH, 25.7% vs. 12.8%, p < 0.001; PAC, 7.2% vs. 1.0%, p < 0.001; PVC, 3.3% vs. 1.2%, p = 0.019; LAD, 4.6% vs. 1.3%, p < 0.001). No significant difference was observed in RAD between the two groups.

Table 2 Differences between the new-onset AF and SR groups.

Risk factors of AF incidence

The results of univariate and multivariate logistic regression analyses are shown in Table 3. Univariate analysis showed significant differences in age, sex, BMI, smoking, and alcohol intake between the two groups. Concerning ECG findings, no significant difference was observed in HR between the groups. However, significant differences were identified in PR, QRS, and QTc prolongations, and LVH, PAC, PVC, and LAD between the groups. In multivariate analysis, age of ≥ 52 years (OR 3.81, p < 0.001), male sex (OR 3.26, p < 0.001), PR prolongation (OR 2.60, p = 0.025), QTc prolongation (OR 1.96, p = 0.014), LVH (OR 1.64, p = 0.010), PAC (OR 5.75, p < 0.001), and LAD (OR 2.46, p = 0.026) were independent prognostic factors.
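In logistic regression, a coefficient is the natural logarithm of the corresponding odds ratio, so the multivariate ORs above can be converted to the log-odds scale directly. The short sketch below does this conversion; the dictionary keys are illustrative names. The text does not state how the standardised beta coefficients reported in the next section were computed, but the values obtained here agree with them to two decimal places.

```python
import math

# Multivariate odds ratios as reported in the text above.
odds_ratios = {
    "age_ge_52": 3.81,
    "male": 3.26,
    "pr_prolongation": 2.60,
    "qtc_prolongation": 1.96,
    "lvh": 1.64,
    "pac": 5.75,
    "lad": 2.46,
}

# Logistic regression coefficient = ln(odds ratio).
log_odds = {name: math.log(value) for name, value in odds_ratios.items()}

for name, beta in log_odds.items():
    print(f"{name}: ln(OR) = {beta:.2f}")
```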

Table 3 Univariate and multivariate analyses for the incidence of AF.

Simple risk scores for AF incidence

Standardised beta coefficients were calculated for the factors that were significant in the multivariate analysis (Table 4). Values for age (for those aged ≥ 52 years), sex (for males), PR prolongation, QTc prolongation, LVH, PAC, and LAD were 1.34, 1.18, 0.96, 0.67, 0.49, 1.75, and 0.90, respectively. Based on these standardised beta coefficients, the points assigned to age (for those aged ≥ 52 years), sex (for males), PR prolongation, QTc prolongation, LVH, PAC, and LAD were 4, 3, 3, 2, 2, 4, and 3, respectively. In the derivation cohort, the incidence of AF increased as the simple predicting AF score increased (Fig. 2a). The incidence of AF reached ~ 0.4% at six points and > 2% at 10 points. The ROC curve for the discriminative ability of the generated scores to identify the incidence of AF is shown in Fig. 3a. The AUC was 0.75 (cut-off value of six points with a sensitivity and specificity of 69% and 71%, respectively). The SIMP3L2E AF risk score (Simple information [age, sex], PR interval, Prolongation of QTc, PAC, LVH, and LAD by ECG AF risk score) was applied to the validation cohort after confirming the results with the derivation cohort. In the validation cohort as well, the incidence of AF increased as the SIMP3L2E AF risk score increased (Fig. 2b). The ROC curve for the discriminative ability of the generated scores to identify the incidence of AF is shown in Fig. 3b. The AUC was 0.73 (cut-off value of six points with a sensitivity and specificity of 64% and 71%, respectively). Thus, the results in the validation cohort were comparable to those of the derivation cohort. The Cochran-Armitage trend test was significant in both cohorts (derivation cohort, p < 0.001; validation cohort, p < 0.001), indicating a consistent trend toward a higher incidence of AF with an increasing risk score. The results of the calibration are shown in Fig. 4. Good visual calibration was observed in both the derivation (Fig. 4a) and validation (Fig. 4b) cohorts.
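As an illustration of how the score is tallied, the sketch below sums the points listed above for the findings present in one individual. The point values and the six-point cut-off come from the text; the dictionary keys, function names, and the reading of the cut-off as "score of six or more" are assumptions made for this example.

```python
from typing import Mapping

# Points per risk factor, as listed in the text.
SIMP3L2E_POINTS = {
    "age_ge_52": 4,
    "male": 3,
    "pr_prolongation": 3,
    "qtc_prolongation": 2,
    "lvh": 2,
    "pac": 4,
    "lad": 3,
}
CUTOFF = 6  # cut-off reported in the text; ">=" is an assumption

def simp3l2e_score(findings: Mapping[str, bool]) -> int:
    """Sum the points of every risk factor that is present."""
    unknown = set(findings) - set(SIMP3L2E_POINTS)
    if unknown:
        raise KeyError(f"unknown risk factors: {sorted(unknown)}")
    return sum(points for name, points in SIMP3L2E_POINTS.items()
               if findings.get(name, False))

def at_or_above_cutoff(findings: Mapping[str, bool]) -> bool:
    return simp3l2e_score(findings) >= CUTOFF

# Hypothetical participant: 60-year-old male with PAC on ECG.
person = {"age_ge_52": True, "male": True, "pac": True}
print(simp3l2e_score(person))      # 11
print(at_or_above_cutoff(person))  # True
```

With all seven findings present the score reaches its maximum of 21 points.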

Table 4 Standardised beta coefficients and assigned points for the risk factors.
Figure 2

Incidence of AF according to the simple predicting AF score. A higher simple predicting AF score correlated with a higher incidence of AF ((a) derivation cohort, (b) validation cohort). AF atrial fibrillation.
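The trend in incidence across score levels was tested in the study with the Cochran-Armitage test. A minimal from-scratch sketch of that test is given below, using the standard normal approximation with a two-sided p value. The function name is hypothetical and the counts in the example are invented purely for illustration; they are not the study data.

```python
import math

def cochran_armitage(scores, totals, events):
    """Cochran-Armitage test for trend in proportions.

    scores: ordered score assigned to each group
    totals: number of subjects in each group
    events: number of events in each group
    Returns (z statistic, two-sided p value).
    """
    n_all = sum(totals)
    p_bar = sum(events) / n_all  # pooled event proportion
    # Score-weighted sum of observed minus expected events.
    t_stat = sum(s * (r - n * p_bar)
                 for s, n, r in zip(scores, totals, events))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_all)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts (NOT the study data): four score levels,
# 100 subjects each, with rising numbers of events.
z, p = cochran_armitage([0, 1, 2, 3], [100] * 4, [1, 2, 4, 8])
print(f"z = {z:.3f}, p = {p:.4f}")
```

A positive z indicates that the event proportion rises with the score; a flat pattern of events gives z = 0.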

Figure 3

ROC curve for the incidence of AF and the simple predicting AF score. ROC curves were developed for the incidence of AF and the score in the derivation cohort (a) and validation cohort (b); the AUC was 0.75 and 0.73, respectively. AF atrial fibrillation, AUC area under the curve, ROC receiver operating characteristic.
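For an integer score, the AUC equals the probability that a randomly chosen case has a higher score than a randomly chosen non-case, with ties counted as one half. The sketch below computes the AUC in that way, together with sensitivity and specificity at a cut-off. The function names are hypothetical, the toy scores and outcomes are invented for illustration and are not the study data, and a score at or above the cut-off is assumed to count as a positive test.

```python
def auc_mann_whitney(scores, outcomes):
    """AUC as P(case score > control score) + 0.5 * P(tie)."""
    cases = [s for s, y in zip(scores, outcomes) if y]
    controls = [s for s, y in zip(scores, outcomes) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def sensitivity_specificity(scores, outcomes, cutoff):
    """Sensitivity and specificity, treating score >= cutoff as positive."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, outcomes) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, outcomes) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, outcomes) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data (NOT the study data): 1 = new-onset AF.
toy_scores = [11, 7, 4, 9, 3, 0, 6, 2]
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 0]

auc = auc_mann_whitney(toy_scores, toy_outcomes)
sens, spec = sensitivity_specificity(toy_scores, toy_outcomes, cutoff=6)
print(f"AUC = {auc:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```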

Figure 4

Calibration plots for the equation model in the derivation and validation cohorts. The plots show the visual agreement between the AF predictions (predicted probability) and observations (actual probability) for the equation model in the derivation cohort (a) and validation cohort (b). AF atrial fibrillation.

Discussion

In this study, we showed the possibility of predicting new-onset AF using ECG findings and simple information, such as age and sex. Age of ≥ 52 years, male sex, PR prolongation, QTc prolongation, LVH, PAC, and LAD were independent prognostic factors and combined as the SIMP3L2E AF risk score. A higher calculated score using standardised beta coefficients correlated with a higher incidence of new-onset AF. The AUCs for the derivation and validation cohorts were 0.75 and 0.73, respectively. The incidence of new-onset AF reached > 2% at ten points of the risk score in both cohorts. To the best of our knowledge, this is the first report of a risk score based mainly on ECG to predict new-onset AF.

AF is the most common arrhythmia in clinical practice, and its incidence is rising globally22. It is a potentially health-threatening condition associated with an increased risk of ischaemic stroke, heart failure, cognitive impairment, and death; the presence of AF increases the risk of stroke five-fold23. Cerebral infarction caused by AF-related thrombus tends to be more extensive than other types of cerebral infarction, with a significant impact on disability and mortality24. Furthermore, an estimated one-third of AF cases are asymptomatic, and stroke is not rare as a first symptom5,25,26. AF has also been associated with sudden death, making detection and prediction of AF risk important in healthy individuals27. Age has been reported as a major risk factor for AF, and AF prevalence increases with age23. The prevalence of AF in males in the general United States population was 0.2%, 0.9%, 1.7%, 3.0%, 5.0%, 7.3%, 10.3%, and 11.1% for those aged < 55, 55–59, 60–64, 65–69, 70–74, 75–79, 80–84, and > 84 years, respectively. Additionally, the prevalence of AF for females was 0.1%, 0.4%, 1.0%, 1.7%, 3.4%, 5.0%, 7.2%, and 9.1% for those aged < 55, 55–59, 60–64, 65–69, 70–74, 75–79, 80–84, and > 84 years, respectively28. In contrast, in a previous study in Japan, the prevalence of AF for males in the general population was 0.2%, 0.8%, 1.9%, 3.4%, and 4.4% for those aged 40–49, 50–59, 60–69, 70–79, and > 79 years, respectively. Moreover, the prevalence of AF for females was 0.04%, 0.1%, 0.4%, 1.1%, and 2.2% for those aged 40–49, 50–59, 60–69, 70–79, and > 79 years, respectively29. The prevalence was higher in the United States than in Japan for both males and females. In the current study of participants aged 30–69 years, the prevalence of AF for males at baseline was 0.1%, 0.3%, 0.5%, and 1.3% for participants aged 30–39, 40–49, 50–59, and 60–69 years, respectively. The prevalence of AF for females at baseline was 0.05%, 0.05%, 0.2%, and 0.3% for those aged 30–39, 40–49, 50–59, and 60–69 years, respectively. The values representing the prevalence of AF in this study were comparable to those of the previous Japanese study29.

Age, sex, and hypertension are reportedly risk factors for AF30. Furthermore, obesity and sleep disturbances are reportedly associated with the incidence of AF31,32. Meta-analyses showed that drinking and smoking habits were risk factors for AF33,34,35. Lifestyle-related diseases, such as hypertension, diabetes, and hyperuricemia, are also reported to be risk factors for AF. Moreover, scores using these risk factors have been reported to assess the risk of developing AF7,8,30,36,37. In this study, we examined obesity, smoking habits, drinking habits, ECG findings, and simple information such as age and sex, and showed the possibility of predicting new-onset AF.

Several studies have examined risk factors for AF that can be identified using ECG. Interatrial, first-degree atrioventricular (AV), and right bundle branch blocks are reportedly risk factors for the incidence of AF38,39,40. ECG measurements and findings, such as PAC, P wave, and LVH, are reportedly risk factors for the incidence of AF41. Additionally, LAD has been related to the incidence of AF41. A prolonged QT interval has been reported to be associated with an increased risk of incident AF42. Recently, AI-based deep learning has been used to predict AF incidence12,43,44. In this study, after using general ECG measurements for evaluation, PR prolongation, QTc prolongation, LVH, PAC, and LAD were found to be independent predictors of the incidence of AF. Several risk scores for the incidence of AF have been previously proposed. The FHS clinical AF risk score30,45 was reported in 2009 with an AUC of 0.78 and validated with an AUC of 0.734 by Shulman et al. in 2016. The ARIC AF risk score46 was reported in 2011 with an AUC of 0.765. These AF risk scores include both clinical characteristics and ECG parameters. The FHS AF risk score included age, sex, other clinical characteristics, and ECG-based PR interval. The ARIC AF risk score included age, other clinical characteristics, and LVH, which was often measured with ECG. Other AF risk scores, namely CHARGE-AF, HATCH, and C2HEST7,47,48,49, have demonstrated AUCs of 0.716–0.765 in studies by Alonso et al. in 2013, Suenari et al. in 2017, and Li et al. in 2019. The HATCH and C2HEST scores were developed in Asian populations and did not use ECG parameters. To the best of our knowledge, no risk scores have been developed based mainly on ECG for predicting new-onset AF. The identified risk factors in this study were used to create a simple risk score using age, sex, and ECG measurements.
The incidence of AF reached about 0.4% at six points and > 2% at ten points, and the AUCs for the derivation and validation cohorts were 0.75 and 0.73, respectively, in this study. These results were comparable to those of previous risk scores. Although the cut-off value was six points, we considered that a cut-off value of ten points or more might be useful, as the incidence of AF increased markedly at that value. The risk score in this study demonstrated comparable predictive ability for AF in the general population using only ECG testing, in addition to clinical information such as age and sex. The advantage of this risk score is that it can predict AF using only existing ECG tests, without the need for complex tests or information. It might also be useful in elucidating the mechanism of ECG-based AF prediction using AI, which is problematic owing to the black-box nature of its decision criteria.

This study had some limitations. First, this was a retrospective single-centre cohort study; thus, selection bias may have occurred. Therefore, a multicentre study or one with more cases than those in our study is needed. Second, mechanical errors may have occurred in the automatic measurement and discrimination of ECGs. In addition, different machines were used, which may have contributed to differences in discrimination. Because the study was retrospective, it could not include detailed ECG assessments, such as the shape and potential of the P wave, which could be important for the prediction of AF. Because detailed information at the time of AF onset could not be examined, secondary AF, such as that resulting from surgery or trauma, might be included. Our data were collected over a long time period, and all available data were used to ensure the largest possible number of events. However, lifestyle changes throughout the time period may have affected dietary habits and the frequency of illnesses, and the use of outdated and inconsistent ECG machines may have affected the results. Prospective studies are needed to resolve these issues. Third, not all instances of AF may have been detected. To solve this problem, improved evaluation methods using smart watches and long-term Holter ECGs are needed. Lastly, this study was performed on the general population; therefore, the low rate of AF incidence was considered a limitation. As the physical examinations in this study were conducted in participants aged < 70 years, no data were available for the older population, aged ≥ 70 years, with a higher prevalence of AF. Therefore, obtaining data from a larger number of older individuals with a higher incidence of AF than those included in our study is necessary.

In this study, age of ≥ 52 years, male sex, PR prolongation, QTc prolongation, LVH, PAC, and LAD were independent prognostic factors for AF. We demonstrated the possibility of predicting new-onset AF using ECG findings and simple information such as age and sex. We did this by developing a simple score that does not require advanced techniques. Furthermore, because it relies on existing ECG tests, this methodology may be easily used in many hospitals and clinics. A large prospective study using improved evaluation methods to analyse data from a larger number of older individuals from the general population will help validate our results.