Introduction

According to statistics from the International Agency for Research on Cancer (IARC) of the World Health Organization, there were 1.93 million new cases of colorectal cancer (CRC) worldwide in 2020, with approximately 930,000 deaths. CRC ranks third in incidence among all cancers and second in mortality1. In recent years, a concerning trend has emerged with a rise in CRC cases among younger age groups, where incidence rates have doubled in individuals under 50 years old2. Regarding survival rates, early-stage CRC has a 5-year survival rate exceeding 90%, whereas it plummets to below 10% in late-stage cases3. Therefore, emphasizing early screening is crucial in reducing CRC mortality rates.

Colonoscopy is widely regarded as the gold standard for both screening and diagnosing CRC. Early detection of high-risk individuals and prompt intervention play a pivotal role in lowering CRC mortality rates and positively impacting treatment outcomes4. Nevertheless, colonoscopy, being an invasive procedure, carries inherent risks like bleeding and perforation. The necessity for meticulous bowel preparation can contribute to an unfavorable patient experience, diminishing willingness to adhere to regular follow-ups. These factors, in turn, constrain the broad utilization of colonoscopy for early CRC screening5,6. Presently, entities like the American Gastroenterological Association and the Chinese Medical Association's Society of Gastroenterology advocate for a "two-step" screening approach for CRC7,8. The initial phase employs low-invasive screening techniques like hematological tests and multi-target stool DNA testing. Hematological tests, known for their simplicity, speed, safety, and minimally invasive, can capture evolving changes in CRC progression, proving valuable in screening and identifying high-risk individuals within ostensibly healthy populations. The subsequent stage entails a comprehensive colonoscopy, strategically optimizing colonoscopy resources and playing an important role in the early detection and intervention of CRC.

Recent studies reveal that alterations in blood lipids are a distinctive characteristic observed in various malignancies, including CRC patients9. Bile acids, metabolic products of cholesterol in the liver and processed in the intestine10, have been implicated in the risk of CRC due to disruptions in lipid and bile acid metabolism11,12. In this study, we assessed a panel of 9 serum lipids, encompassing total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and apolipoproteins (Apo) A1, ApoA2, ApoB, ApoC2, and ApoC3, alongside 15 serum bile acids and 6 tumor markers including carcinoembryonic antigen (CEA), carbohydrate antigens (CA) 125, CA19-9, CA242, CA50, and CA72-4. This investigation involved 91 CRC patients and 120 healthy volunteers. The application of a data analysis model was employed to explore its value in identifying CRC.

Materials and methods

Study subjects and enrollment criteria

Samples were collected from August 2022 to February 2023 at Renji Hospital affiliated with Shanghai Jiao Tong University School of Medicine. The study participants included 91 CRC patients, comprising the CRC group, and 120 healthy volunteers, forming the healthy control (HC) group. Inclusion criteria for the CRC group were as follows: patients newly diagnosed with primary CRC through histopathological examination, who had not undergone radiation, chemotherapy, or surgical treatment. All of the CRC patients were adenocarcinoma. Healthy volunteers were recruited from individuals undergoing medical check-ups at our center. After excluding colorectal cancer and other gastrointestinal diseases through colonoscopy, as well as excluding malignant tumors, inflammation, and cardiovascular diseases through thoracic CT scans, electrocardiograms, abdominal ultrasounds, and a comprehensive evaluation including fecal occult blood tests and blood examinations (such as routine blood tests, liver function, renal function, etc.), they were defined as healthy. From October 2023 to December 2023, we collected an additional 22 CRC patients and 22 healthy volunteers, matched based on the same inclusion criteria, to form an external validation set. There were no statistically significant differences in age and gender between the groups (P > 0.05).

Methods

Sample preprocessing

Four milliliters of fasting whole blood samples were collected using red top serum separator tube (Gongdong, China). These samples were centrifuged at 2685 × g for 10 min to separate the serum, with lipemic samples excluded. The serum was then stored at − 80 °C until analysis to avoid repeated freeze–thaw cycles. We utilized approximately one milliliter of serum for testing various indicators, strictly following the manufacturer's instructions throughout the testing process.

Lipid indicator measurement

Levels of TC, TG, HDL-C, LDL-C, ApoA1, ApoA2, ApoB, ApoC2, and ApoC3 were measured using the H7600 Series Automatic Analyzer (Hitachi, Japan). Reagent kits for ApoA1 and ApoB, utilizing immunoturbidimetric methods, were obtained from Maccura Biotechnology Co., Ltd. Similarly, reagent kits for ApoA2, ApoC2, and ApoC3, utilizing immunoturbidimetric methods, were obtained from Beijing Leadman Biochemistry Co., Ltd. Reagent kits for TC, TG, HDL-C, and LDL-C, utilizing colorimetric methods, were obtained from Fujifilm Wako Pure Chemical Co., Ltd., Japan.

Detection of bile acid profile

The bile acid profile was detected using liquid chromatography-tandem mass spectrometry (LC–MS/MS) with the API3200MD triple quadrupole mass spectrometer (ABSciex, USA) and Shimadzu series liquid chromatograph (Shimadzu, Japan). The reagent kits were purchased from Shanghai ClinMeta Co., Ltd. The assay comprised 15 components: cholic acid (CA), glycocholic acid (GCA), taurocholic acid (TCA), chenodeoxycholic acid (CDCA), glycochenodeoxycholic acid (GCDCA), taurochenodeoxycholic acid (TCDCA), deoxycholic acid (DCA), glycodeoxycholic acid (GDCA), taurodeoxycholic acid (TDCA), lithocholic acid (LCA), glycolithocholic acid (GLCA), taurolithocholic acid (TLCA), ursodeoxycholic acid (UDCA), glycoursodeoxycholic acid (GUDCA), and tauroursodeoxycholic acid (TUDCA).

Detection of tumor markers

CEA, CA125, CA19-9, and CA72-4 were assayed using the Cobas e801 automatic analyzer (Roche, Switzerland), with corresponding reagent kits. CA50 and CA242 were analyzed using the Maglumi2000 automatic analyzer (Snibe, China), also with corresponding reagent kits. Chemiluminescence was the method of detection for all markers.

Statistical methods

Statistical analyses were performed using SPSS 17.0 and GraphPad 7.0. The Kolmogorov–Smirnov test was utilized for normality testing. Quantitative data with a normal distribution were reported as mean ± standard deviation (\(\overline{{\text{x}} }\) ± s) and compared using independent sample t-tests. Non-normally distributed quantitative data were presented as median (Q1, Q3) and analyzed using Mann–Whitney U test. A P-value less than 0.05 was considered statistically significant.

The indicators showing statistically significant differences between CRC group and HC group with AUC greater than 0.7 would be selected for developing a stepwise binary logistic regression model (Backward Likelihood Ratio method). We randomly divided participants' samples (by receiving date) such that 80% formed the training set and 20% formed the internal validation set. At each step, the indicator with the least contribution to the model's likelihood was removed. The process continues until the likelihood ratio test indicated that removing any further indicators would significantly change the model’s fit (P-value < 0.05). The remaining indicators were considered the best subset, which composed the optimal model. The model's cut-off value was established at 0.5, where values exceeding 0.5 were categorized as CRC, while values below 0.5 were categorized as HC. Omnibus test was used to assesses the overall significance of the model. When the P-value of the Omnibus Test was less than 0.05, it indicated that at least one indicator was significant. Hosmer–Lemeshow test was used to assess the goodness-of-fit of the model. When the P-value of the Hosmer–Lemeshow test was greater than 0.05, it indicated model's classification forecasting were consistent with reality. The diagnostic accuracy of individual indicators and the model was assessed using receiver operating characteristic (ROC) curves. Flow diagram for development of the CRC screening model had been shown in Fig. 1.

Figure 1
figure 1

Flow diagram for development of the CRC screening model. HC Healthy control, CRC Colorectal cancer.

Ethics declarations

This study received approval from the Ethics Committee of Renji Hospital Affiliated with Shanghai Jiao Tong University School of Medicine (case number RA-2022-335) and adhered to the principles outlined in the Declaration of Helsinki. Prior to their inclusion in the study, all patients provided informed consent.

Results

Characteristics of the study populations

The median age of participants in the HC and CRC groups was 65 and 66 years old, respectively. Males comprised 64.17% and 63.73% of the HC and CRC groups, respectively. Early and middle-stage tumors (Stages I, II, and III) were prevalent in the CRC group, accounting for 87.91%. Late-stage tumors (Stage IV) represented 12.09% of the CRC cases. Additionally, the tumor location distribution revealed 42.86% in the rectum and 57.14% in the colon. As the Table 1 showed.

Table 1 Clinical characteristics of the participants.

Comparison of 9 lipid indicators between HC and CRC groups

In the CRC group, the serum levels of TC, HDL-C, LDL-C, ApoA1, ApoA2, ApoB, ApoC2 and ApoC3 were significantly lower than those in the HC group (P < 0.05). The level of TG was higher in the CRC group compared to the HC group (P < 0.05), as shown in Table 2 and Fig. 2.

Table2 Comparison of Indicators levels between two groups.
Figure 2
figure 2

Comparison of lipid Indicators levels between two groups. HC Healthy control, CRC Colorectal cancer; *P < 0.05; **P < 0.01; Error bars (TC, HDL-C, LDL-C, ApoA1, ApoA2, ApoB): Mean with standard deviation; Error bars (TG, ApoC2, ApoC3): Median with interquartile range.

Comparison of 15 bile acids between HC and CRC groups

In the CRC group, the serum levels of free primary bile acids including CA, CDCA, as well as secondary bile acids such as DCA, GDCA, TDCA, LCA, GLCA, TLCA, UDCA, GUDCA, TUDCA, were significantly lower than those in the HC group (P < 0.05). However, there were no statistically significant differences in the levels of conjugated primary bile acids, including GCA, TCA, GCDCA, TCDCA between the two groups (P > 0.05), as shown in Table 2 and Fig. 3.

Figure 3
figure 3

Comparison of bile acids levels between two groups. HC Healthy control, CRC Colorectal cancer; *P < 0.05; **P < 0.01; ns (no statistically significant): P > 0.05; Error bars: Median with interquartile range.

Comparison of 6 gastrointestinal tumor markers between HC and CRC groups

In the CRC group, the serum levels of CA19-9 and CEA were significantly higher than those in the HC group (P < 0.05). However, no statistically significant differences were observed in the levels of CA125, CA242, CA50, and CA72-4 between the two groups (P > 0.05), as shown in Table 2 and Fig. 4.

Figure 4
figure 4

Comparison of tumor markers levels between two groups. HC Healthy control, CRC Colorectal cancer; *P < 0.05; **P < 0.01; ns (no statistically significant): P > 0.05; Error bars: Median with interquartile range.

Evaluation of diagnostic accuracy for CRC

Statistically significant differences were observed between the HC and CRC groups in 22 indicators, including serum TC, TG, HDL-C, LDL-C, ApoA1, ApoA2, ApoB, ApoC2, ApoC3, CA, CDCA, DCA, GDCA, TDCA, LCA, GLCA, TLCA, UDCA, GUDCA, TUDCA, CA19-9, CEA. The diagnostic accuracy for CRC was evaluated using ROC curves. Eleven indicators, including ApoA1, ApoA2, ApoC3, CDCA, DCA, TDCA, LCA, GLCA, UDCA, TUDCA, and CEA, had an area under the curve (AUC) greater than 0.7. Among these, ApoA2 showed the best diagnostic accuracy, with an AUC of 0.957, a sensitivity of 85.71%, a specificity of 93.33%, and the largest Youden's Index of 0.79. Detailed results were presented in Table 3 and Supplementary Table 3.

Table 3 Performance of the indicators in screening CRC.

Development and validation of the CRC screening model

Establishment of the model

Eighty percent of study participants’ samples were selected for forming the training set according to receiving date, including 96 cases from the HC group and 72 cases from the CRC group. Clinical characteristics of them were shown in Supplementary Table 4. The 11 indicators with AUC greater than 0.7 were included in a stepwise binary logistic regression analysis (Backward Likelihood Ratio method). When ApoA1 was removed in step 9, statistically significant difference was observed between model 8 and model 9(P = 0.024). Thus model 8 which composed of ApoA1, ApoA2, LCA, and CEA was considered as the final screening model. The model is represented as Y = 1/(1 + \({e}^{-{\text{Logit}}({\text{P}})})\), where Logit (P) = − 4.847 × ApoA1 − 1.041 × ApoA2 − 0.132 × LCA + 2.233 × CEA + 30.691. Omnibus test (χ2 = 200.79, P = 0.00) showed that indicators in the model were significant and Hosmer–Lemeshow test (χ2 = 0.93, P = 0.99) showed that the model's classification forecasting are relatively consistent with reality. The diagnostic accuracy of each step of the stepwise binary logistic regression was shown in Table 4. The steps of developing the screening model were shown in Supplementary Table 5.

Table 4 Performance of the stepwise binary logistic regression models in screening CRC.

Analysis of model efficacy

The performance of Model Y was assessed using the ROC curve as a new variable. The AUC for diagnosing CRC was 0.995 (95% CI 0.969–0.999), as illustrated in Fig. 5. The model exhibited a sensitivity of 94.44%, a specificity of 97.92%, and an accuracy rate of 96.43%, as detailed in Table 4.

Figure 5
figure 5

ROC curve of the model and the markers that make up the model in screening CRC.

Internal validation of the model

The other 20% of study participants’ samples (24 cases from the HC group and 19 cases from the CRC group) were utilized as internal validation set and assessed using Model Y. The results showed that 22 cases from the HC group and 18 cases from the CRC group were correctly classified, yielding a sensitivity of 94.74%, a specificity of 91.67%, and an overall accuracy rate of 93.02%. This accuracy rate was essentially consistent with that of the training set, as depicted in Fig. 6.

Figure 6
figure 6

The performance of the model in screening CRC. HC Healthy control, CRC Colorectal cancer.

External validation of the model

For external validation of our findings, we recruited an additional 22 CRC patients and 22 healthy volunteers to form an independent validation set. Then evaluated Model Y's diagnostic accuracy on this external validation set. The results showed that 20 cases from CRC patients and 20 cases from healthy volunteers were correctly classified, achieving a sensitivity and specificity of 90.91% each. Figure 6 illustrated these results in detail.

Discussion

Currently, the most prevalent CRC screening methods are fecal immunochemical testing (FIT) and colonoscopy. While FIT offers a non-invasive approach, it faces challenges of high false positive rates and low patient compliance during sample collection13,14. Colonoscopy, on the other hand, is not ideal for large-scale screening due to its invasive nature and potential complications. In CRC programmatic screening programs, colonoscopy is best reserved as step two of a two-stage screening cascade7,8. The strategy for the screening test in the first step of the screening cascade must consider feasibility, such as convenience and cost-effectiveness.

Blood testing emerges as a promising avenue for large-scale screening due to its minimally invasive nature. A promising strategy involves the development of multivariate classification models that integrate measurements of multiple biomarkers to calculate disease probability15,16. Such models can outperform single-analyte approaches by comprehensively capturing the intricate and multifaceted nature of cancer development, as well as the diverse metabolic, genetic, and structural alterations associated with cancer cells17,18. The application of multivariate models in cancer diagnosis is gaining traction, with numerous examples showcasing their effectiveness. FDA (U.S. Food and Drug Administration)-approved tests already exist, including a blood test based on multiple biomarkers for ovarian cancer detection and a multitarget stool DNA test for CRC screening19,20.

Researchers have delved into the potential of blood-based markers for CRC screening. For instance, Butvilovskaya et al.21 developed a CRC blood screening model employing tumor markers such as CEA, CA19-9, and CA125, achieving an AUC of 0.82. Similarly, Marín-Vicente et al.22 crafted a CRC screening model utilizing two serum protein biomarkers, ApoC3 and THBS1, analyzed through a decision tree algorithm, with an AUC of 0.83. However, the scope of indicators included in these models was somewhat limited, restricting significant advancements in diagnostic accuracy. Incorporating a variety of range of biomarkers into screening models emerges as a crucial strategy for the widespread, early detection of CRC. The inclusion of a diverse array of indicators—ranging from protein biomarkers to metabolic and transcriptomic biomarkers—enhances the model's diagnostic accuracy. Vironova et al.23 integrated 16 serum biomarkers into their model, resulting in exceptional diagnostic accuracy, with accuracy exceeding 95% and an AUC of over 0.98. Nevertheless, the model is overly complex for clinical application, necessitating the testing of an excessive number of biomarkers, which in turn increases the burden on patients. Thus, finding an optimal balance between the diversity of indicators and the model's diagnostic capability is a critical consideration. Additionally, ensuring the diagnostic reliability of blood-based CRC screening models presents a further challenge. Bhardwaj et al.15, in their analysis of 36 studies on the subject, found that most lacked external validation of the tests they proposed, highlighting an area in need of more focused attention.

In our study, we employed a comprehensive approach by integrating multiple indicators, including 9 lipids, 15 bile acids, and 6 gastrointestinal tumor markers, to simultaneously establish a screening model, which was not widely discussed in existing research. This model was refined to a core set of four high contribution indicators through stepwise logistic regression analysis. Compared to individual markers, our model demonstrates superior diagnostic accuracy with a sensitivity of 94.44%, a specificity of 97.92%, and an overall accuracy of 96.45% when the cut-off value is set at 0.5. It enables a more precise differentiation between healthy individuals and those with CRC. Importantly, the cut-off value can be adjusted according to specific clinical needs to prioritize sensitivity or specificity. For instance, a cut-off value of 0.3 increases the sensitivity to 98.61% while slightly reducing the specificity to 92.31%, as detailed in Supplementary Table 6. This adaptability allows for customized applications in different diagnostic scenarios. To ensure the model's robustness, we conducted external validation, achieving an accuracy of 90.91%. Another key advantage of the model is its flexibility. The four routine detection indicators included in the model—ApoA1, ApoA2, LCA, and CEA—can be easily incorporated into standard examinations to meet the needs of large-scale screening.

We recognize that different stages of colorectal cancer and other pathological conditions may influence the levels of circulating biomarkers, which was not involved in this study. To overcome this limitation, future research will include patients with various stages of CRC, those with gastrointestinal conditions such as inflammatory bowel disease, irritable bowel syndrome, hemorrhoids and colonic polyps, as well as those with other systemic diseases including dyslipidemia, liver, renal or heart diseases, inflammatory conditions or other cancers. A comprehensive comparative analysis of multi-omics biomarkers, including proteins, metabolites, and miRNAs, will be conducted. This enhances the practicality and applicability of the model. Moreover, we aim to achieve an optimal balance between the diversity of indicators and the diagnostic accuracy of the model. Our model will be refined by incorporating markers abnormally expressed in CRC serum, such as microRNA24, Midkine25, amino acid26 and other proteins such as albumin27. In addition, the sample size of our study is relatively small, and we intend to increase the sample size to further strengthen the validation of our model.

The ultimate validation of our model’s clinical utility will depend on conducting prospective randomized clinical trials. This step is crucial to ensure that our model is not just theoretically robust but also practically viable in a clinical setting.