Carbon price prediction based on decomposition technique and extreme gradient boosting optimized by the grey wolf optimizer algorithm

Feng, Mengdan; Duan, Yonghui; Wang, Xiang; Zhang, Jingyi; Ma, Lanlan

doi:10.1038/s41598-023-45524-2


Article
Open access
Published: 27 October 2023

Scientific Reports volume 13, Article number: 18447 (2023)

Abstract

It is essential to predict carbon prices precisely in order to reduce CO₂ emissions and mitigate global warming. As a solution to the limitations of a single machine learning model that has insufficient forecasting capability in the carbon price prediction problem, a carbon price prediction model (GWO–XGBOOST–CEEMDAN) based on the combination of grey wolf optimizer (GWO), extreme gradient boosting (XGBOOST), and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is put forward in this paper. First, a random forest (RF) method is employed to screen the primary carbon price indicators and determine the main influencing factors. Second, the GWO–XGBOOST model is established, and the GWO algorithm is utilized to optimize the XGBOOST model parameters. Finally, the residual series of the GWO–XGBOOST model are decomposed and corrected using the CEEMDAN method to produce the GWO–XGBOOST–CEEMDAN model. Three carbon emission trading markets, Guangdong, Hubei, and Fujian, were experimentally predicted to verify the model’s validity. Based on the experimental results, it has been demonstrated that the proposed hybrid model has enhanced prediction precision compared to the comparison model, providing an effective experimental method for the prediction of future carbon prices.

Introduction

Climate change has evolved into a formidable menace to the survival of humanity in the twenty-first century. Greenhouse gases are considered a major factor contributing to global warming¹. To cope with the global warming crisis, the international community has actively reduced carbon emissions by formulating climate policies and other measures. Among them, the European Emissions Trading System (EU-ETS) was implemented in 2005, reducing carbon emissions and energy consumption². Furthermore, China plays a significant role in international climate protection as one of the top carbon emitters worldwide. China has implemented eight carbon trading pilots in various regions, namely Beijing (2013), Shanghai (2013), Guangdong (2013), Tianjin (2013), Shenzhen (2013), Chongqing (2014), Hubei (2014), and Fujian (2016), in order to reduce global emissions³.

Carbon trading has become an emerging financial industry. A carbon price reflects fluctuations in supply and demand for carbon energy within the carbon emissions market, where carbon energy can be traded as a commodity⁴. Because of the uncertainty of the internal mechanism and external factors, carbon prices exhibit nonlinear and non-stationary features⁵,⁶. The risks associated with carbon trading are greater than those associated with traditional financial products. Accurate carbon price forecasting not only helps governments track changes in market conditions and make reliable decisions, but also helps enterprises and investors understand the characteristics of carbon prices, allocate resources sensibly, and increase the value of their carbon assets. It is therefore crucial to establish a stable and effective framework for carbon price research.

According to the existing literature, carbon price research falls into two categories: models based on historical data⁷,⁸,⁹ and models based on influencing factors¹⁰,¹¹,¹²,¹³.

Grounded on historical data, carbon price forecasting methods can be classified into three categories: statistical and econometric methods, artificial intelligence (AI), and integration methods.

In the past, statistical and econometric methods were extensively employed for forecasting carbon prices as classical time series forecasting methods. Main statistical methods are the autoregressive integrated moving average model (ARIMA)¹⁴, generalized autoregressive conditional heteroskedasticity model (GARCH)¹⁵, gray model (GM)¹⁶, etc. For instance, Carolina et al. (2013) employed an ARIMA model in order to forecast carbon prices, ultimately achieving more accurate predictive outcomes¹⁷. According to Dutta (2018), an exponential GARCH model was used for forecasting carbon price volatility, and outliers were processed to improve accuracy¹⁸. Under the assumption of linearity, the statistical and econometric methods perform well for short-term forecasting, but when forecasting nonlinear, non-stationary time series of carbon prices, the prediction accuracy is not satisfactory¹⁹.

As a result, AI methods, which do not require linearity assumptions, are widely used across different sectors, for example in credit risk prediction²⁰, disease treatment²¹, and traffic congestion²²,²³. For carbon price prediction, least squares support vector machines (LSSVM) and artificial neural networks (ANN) are commonly used. Using 1074 daily carbon price observations, Atsalakis (2016) developed a neural network (NN) model to predict the time series, and ANN was found to be the most effective method for predicting carbon prices²⁴. Zhu et al. (2016) introduced an adaptive multiscale integrated learning approach grounded in LSSVM to effectively capture the non-stationary and non-linear attributes of carbon prices. The findings demonstrated that their proposed model surpassed the performance of the ARIMA and GARCH models²⁵. Although AI outperforms traditional statistical models in forecasting non-linear and non-stationary data, a single AI model lacks sufficient forecasting stability and does not meet researchers’ expectations for accurate carbon price predictions across different markets²⁶.

Given the constraints of conventional statistical approaches in handling non-stationary data and the shortcomings of a single AI model, researchers have started to focus on integrated methods to improve data analysis and forecasting precision. A number of decomposition methods have been proposed based on different theoretical foundations, including the wavelet transform (WT)²⁷, variational mode decomposition (VMD)²⁸, and ensemble empirical mode decomposition (EEMD)²⁹. E et al. (2019) observed that carbon prices have nonlinear and non-stationary properties. To address this issue, they combined VMD with a gated recurrent unit (GRU) to predict future trends in carbon prices. Experimental results confirmed the validity and reliability of this approach³⁰. Jinpei Liu et al. (2019) employed empirical mode decomposition (EMD) and a reconstruction algorithm to transform the original data into three subseries of varying frequencies. They then analyzed these three subseries individually using ARIMA, partial least squares (PLS), and NN methods. The findings demonstrated the superior predictive performance of the model³¹. Using EEMD to preprocess the data, Zhou et al. (2018) constructed different combinations of models to identify different frequencies. Although the existing hybrid models enhance carbon price prediction accuracy, they have drawbacks³². For example, they usually model the subseries obtained from decomposition without considering noise, which can reduce prediction accuracy and efficiency²⁹.

Carbon prices are impacted by a combination of historical data and external factors. The existing literature primarily utilizes carbon price time series data for the modeling process. However, the dynamics of carbon trading prices are influenced by various factors, including energy factors, macroeconomic factors, and industry structures¹¹. In general, since external factors can be analyzed, carbon price forecasting built upon multiple influencing factors is important for carbon market research. Therefore, carbon price prediction models for influencing factors are favored by scholars. Using oil, coal, and natural gas prices as the basis, Tsai and Kuo (2013) devised an ant-based radial basis function network (ARBFN) model for carbon price prediction. The inclusion of multiple influencing factors in carbon price forecasting models can indeed pose challenges due to the potential for error accumulation. When considering multiple factors, the complexity of the model increases, and uncertainties associated with each factor can accumulate throughout the forecasting process³³.

Reviewing previous studies, we identify two potential research gaps in the prediction of carbon prices. First, most carbon price forecasting models rely only on past carbon price data series and ignore the impact of external factors on the carbon market. As a result, these models may not adequately take the full range of market conditions into account when forecasting carbon prices. Second, most current carbon price forecasting models fail to fully explore and utilize other useful information. Here, useful information refers to the nonlinear residual sequences that remain after the model prediction, which are not random walks³⁴ and still contain carbon price information. Ignoring the residual series leaves the information used to predict carbon prices incomplete. It is therefore particularly important to address these issues and propose a new perspective on carbon price forecasting.

To bridge these gaps, this study first establishes an index system of factors influencing the carbon price and selects indicators with the random forest method to identify the main factors, so as to improve the prediction accuracy of the model. Second, XGBOOST is used to establish the carbon price prediction model. To avoid prediction errors caused by the parameter settings of the XGBOOST model, GWO is used to search for the optimal model parameters. Finally, to increase the precision of the model predictions, the residual series of the XGBOOST predictions is corrected using the CEEMDAN method, yielding the combined GWO–XGBOOST–CEEMDAN model. The contributions can be summarized as follows:

(1) In the majority of prior studies, carbon price forecasts relied on historical time series data on carbon prices. This ignores the effects of multiple factors on carbon prices, which limits both the information the forecasts can provide and the extent to which carbon markets can be managed. In this study, multiple influencing factors are considered when forecasting carbon prices. To develop a richer indicator system that is more appropriate to China’s national conditions, the carbon price time series data and the various influencing factors are all treated as candidate input features for carbon price modeling.
(2) In this study, the Partial Autocorrelation Function (PACF) and Random Forest (RF) are introduced as feature selection methods to build carbon price prediction models from numerous influencing factors and to reduce the influence of redundant information between features. This significantly improves the model’s prediction performance.
(3) Most previous hybrid models first decompose the data and then predict carbon prices. This study instead predicts carbon prices first and then decomposes the residual series. After the main model has predicted the carbon price, the useful information remaining in the residual sequence is difficult to extract directly, so the CEEMDAN algorithm is used to decompose the residuals into modal components that are easier to extract and a sequence that is more difficult to extract, in order to mine the effective residual information more deeply. According to the experiments, the resulting carbon price predictions are more accurate and practical than those of most previous studies. This predict-then-decompose approach offers a new line of thought for carbon price prediction research and can serve as a reference for future work.

Algorithm introduction

Feature selection

Random forest

After the initial selection of 11 metrics, feature screening is performed next. It can enhance the model’s ability to generalize, reduce the risk of overfitting, reduce the computational complexity of the model, etc. Common feature selection methods are Gray correlation, Pearson correlation coefficient, and random forests (RF). Gray correlation and Pearson correlation coefficient are both linear relationship-based methods, while RF can handle more complex nonlinear relationships. This means RF can select features in a wide range of situations. As a result, in this paper, the RF method is used for screening carbon price primary indicator systems.

Based on the results of the RF method, the primary features are ranked by importance and then selected. Consider a training set of $A$ samples with feature dimension $m$, $\left\{({x}_{1},{y}_{1}),\cdots ,({x}_{A},{y}_{A})\right\}$. For each tree $t$, a bootstrap sample set ${C}_{t}$ of size $A$ is drawn, and a classification and regression tree (CART) ${K}_{t}$ is grown on ${C}_{t}$. At each node, ${m}_{try}=\sqrt{m}$ features are randomly sampled, and the most significant of these ${m}_{try}$ features is selected for node splitting. This is repeated while $t\le ntree$; once the loop exits, the ensemble $G=Uniform\left(\left\{{K}_{t}\right\}\right)$ is generated.
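The two random sampling steps in this procedure can be sketched as follows. This is a minimal illustration only: the helper names `bootstrap_indices` and `candidate_features` are hypothetical, rounding $\sqrt{m}$ down is an assumption, the values of `A` and `ntree` are arbitrary, and tree growing itself is omitted.

```python
import math
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap set C_t)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def candidate_features(n_features, rng):
    """Pick m_try = floor(sqrt(m)) distinct feature indices for one node split.
    Rounding down is an assumption; the text only states m_try = sqrt(m)."""
    m_try = max(1, math.isqrt(n_features))
    return sorted(rng.sample(range(n_features), m_try))

rng = random.Random(0)
A, m, ntree = 100, 11, 5  # m = 11 primary indicators; A and ntree are illustrative

forest_plan = []
for t in range(ntree):
    rows = bootstrap_indices(A, rng)     # rows used to grow tree K_t
    feats = candidate_features(m, rng)   # features tried at the root node of K_t
    forest_plan.append((rows, feats))
```

In a full random forest the feature subset is redrawn at every node, not only at the root; the sketch shows a single draw per tree to keep it short.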

In the calculation of feature importance, the Gini Index is used as a segmentation function to calculate "Gini Importance" as the degree of importance of a feature. This can be expressed as follows:

$$Gini(C)=1-{\sum }_{i=1}^{\left|E\right|}{F}_{i}^{2}$$

(1)

$C$ represents the sample set; ${F}_{i}$ represents the probability that a sample in $C$ belongs to the $i$th class; $\left|E\right|$ is the number of sample classes. When feature $G$ is known, the Gini index of the sample set $C$ is defined as

$$Giniindex(C,G)={\sum }_{h=1}^{H}\frac{\left|{C}^{h}\right|}{\left|C\right|}Gini({C}^{h})$$

(2)

$H$ represents the number of distinct values of feature $G$, i.e., $C$ is divided into $H$ subsets $\left\{{C}^{1},{C}^{2},...,{C}^{H}\right\}$ according to the value of feature $G$, and the samples within each subset share the same value of $G$. The feature with the smallest Gini index after division is considered the optimal feature in the selection process.
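As a small worked illustration of Eqs. (1) and (2), the following sketch computes the Gini value of a label set and the Gini index after splitting on a categorical feature. The function names and the toy labels are assumptions made for this example and do not come from the study's data.

```python
from collections import Counter

def gini(labels):
    """Eq. (1): one minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_index(labels, feature_values):
    """Eq. (2): size-weighted Gini of the subsets induced by a feature."""
    n = len(labels)
    groups = {}
    for label, value in zip(labels, feature_values):
        groups.setdefault(value, []).append(label)
    return sum(len(group) / n * gini(group) for group in groups.values())

labels = ["up", "up", "down", "down"]      # toy class labels
feature_a = ["hi", "hi", "lo", "lo"]       # separates the classes perfectly
feature_b = ["hi", "lo", "hi", "lo"]       # carries no class information

base = gini(labels)                        # 0.5 for two balanced classes
index_a = gini_index(labels, feature_a)    # 0.0, so feature_a is preferred
index_b = gini_index(labels, feature_b)    # 0.5, no improvement over base
```

The feature with the smaller index (`feature_a` here) is the one the selection rule would choose.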

Partial auto-correlation function

PACF is a statistical tool for time series analysis that helps determine the relationship between each observation in a time series and its lagged values. It is used to identify the order of the autoregressive (AR) model of a time series, i.e., how many lags need to be considered in that model. The PACF adjusts the autocorrelation function (ACF) by removing the part already explained by the intermediate lags, so that the remaining part more accurately reflects the direct relationship between an observation and a given lag. $\mathrm{Corr}\left({X}_{t},{X}_{t+v}|{X}_{t+1},\cdots ,{X}_{t+v-1}\right)$ represents the conditional correlation between ${X}_{t}$ and ${X}_{t+v}$ after removing the effects of the intervening variables ${X}_{t+1},\cdots ,{X}_{t+v-1}$, i.e., the partial autocorrelation between ${X}_{t}$ and ${X}_{t+v}$.
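One common way to estimate the partial autocorrelation at lag $v$ is to regress the series on its first $v$ lags and take the coefficient of the last lag. The sketch below does this with ordinary least squares in NumPy; the function name `pacf_ols` and the simulated AR(1) series are assumptions for illustration, and the study may have used a different estimator.

```python
import numpy as np

def pacf_ols(x, max_lag):
    """Estimate PACF at lags 1..max_lag as the last coefficient of an AR(k) OLS fit."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    values = []
    for k in range(1, max_lag + 1):
        target = x[k:]
        # Column j holds the series lagged by j steps, for j = 1..k.
        lags = np.column_stack([x[k - j:len(x) - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
        values.append(coef[-1])
    return np.array(values)

# Simulated AR(1) series: only lag 1 should show a large partial autocorrelation.
rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)
series = np.zeros(5000)
for t in range(1, 5000):
    series[t] = 0.7 * series[t - 1] + noise[t]

pacf_values = pacf_ols(series, max_lag=3)
```

For this simulated series the lag-1 value should be close to the AR coefficient 0.7, while lags 2 and 3 should be close to zero, which is the pattern used to choose the AR order.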

CEEMDAN model

Empirical mode decomposition (EMD) decomposes nonlinear and non-stationary raw data into intrinsic mode functions (IMFs) with various fluctuation scales. However, because of intermittency in the raw data, mode mixing occurs easily, which degrades the decomposition. Wu³⁵ proposed an ensemble empirical mode decomposition (EEMD) method that adds a certain amount of Gaussian white noise to the original data and decomposes it repeatedly. Although this effectively solves the mode mixing problem, residual white noise remains in the IMF components derived by this method, resulting in low reconstruction accuracy. Building on this, Torres³⁶ proposed the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) method, which addresses the significant reconstruction errors of EEMD by introducing adaptive white noise at each stage. Therefore, in this paper, the CEEMDAN method is used to decompose the residual series so that each IMF component and the trend term can be forecast separately.

CEEMDAN can be broken down as follows:

Step 1 As a result of adding a Gaussian white noise sequence to the residual sequence, an updated sequence with noise is obtained:

$$\overline{{y}_{i}}\left(t\right)=y\left(t\right)+\sigma {n}_{i}\left(t\right),\quad i=1,2,\ldots ,N$$

(3)

where $y\left(t\right)$ is the residual sequence, and $\overline{{y}_{i}}\left(t\right)$ is the new sequence with the addition of Gaussian white noise; ${n}_{i}\left(t\right)$ denotes the white noise added to the residual data; σ is the adaptive coefficient.

Step 2 EMD decomposition is performed on the new sequence with white noise added to obtain N modal components, and the first modal component of CEEMDAN is obtained by the overall averaging of the N modal components as follows:

$$im{f}_{1}\left(t\right)=\frac{1}{N}{\sum }_{i=1}^{N}im{f}_{1i}\left(t\right)$$

(4)

At this point, the first residual component ${R}_{1}\left(t\right)$ is

$${R}_{1}\left(t\right)=y\left(t\right)-im{f}_{1}\left(t\right)$$

(5)

Step 3 The adaptive noise term ${\sigma }_{1}{E}_{1}\left({n}_{i}\left(t\right)\right)$ is added to ${R}_{1}\left(t\right)$ to form a new noisy sequence ${R}_{1}\left(t\right)+{\sigma }_{1}{E}_{1}\left({n}_{i}\left(t\right)\right)$, where ${E}_{j}\left(\cdot \right)$ denotes the $j$th eigenmodal component obtained by EMD decomposition. EMD decomposition is then performed on the new sequence and the results are averaged to obtain the second modal component and the residual component as follows:

$$im{f}_{2}\left(t\right)=\frac{1}{N}{\sum }_{i=1}^{N}{E}_{1}\left({R}_{1}\left(t\right)+{\sigma }_{1}{E}_{1}\left({n}_{i}\left(t\right)\right)\right)$$

(6)

$${R}_{2}\left(t\right)={R}_{1}\left(t\right)-im{f}_{2}\left(t\right)$$

(7)

Step 4 Repeat the above three steps to obtain the (j + 1)th modal component and the jth residual component:

$$im{f}_{j+1}\left(t\right)=\frac{1}{N}{\sum }_{i=1}^{N}{E}_{1}\left({R}_{j}\left(t\right)+{\sigma }_{j}{E}_{j}\left({n}_{i}\left(t\right)\right)\right)$$

(8)

$${R}_{j}\left(t\right)={R}_{j-1}\left(t\right)-im{f}_{j}\left(t\right)$$

(9)

Step 5 Repeat the above steps until the residual can no longer be decomposed by EMD. Finally, the original sequence $y(t)$ is decomposed into $J$ eigenmodal components and a trend component ${R}_{es}(t)$.

$$y\left(t\right)={\sum }_{j=1}^{J}im{f}_{j}\left(t\right)+{R}_{es}\left(t\right)$$

(10)

After CEEMDAN has decomposed the residual series, the GWO–XGBOOST model is applied to each eigenfunction component. The final residual forecast is derived by linearly combining the results of each component.
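The bookkeeping of Steps 1 to 5 can be sketched in NumPy as follows. This is not a real EMD implementation: `first_mode` is a hypothetical stand-in for the operator $E_1(\cdot)$ that simply subtracts a moving average, chosen so the sketch runs without extra packages. A single constant `sigma` replaces the adaptive coefficients, and the number of modes is fixed in advance instead of using a stopping criterion. What the sketch does show is the ensemble averaging over noisy copies and the completeness property of Eq. (10).

```python
import numpy as np

def first_mode(x, window=5):
    """Stand-in for E_1(.): the fast part of x, i.e. x minus a moving average.
    NOT real EMD sifting; it only keeps the sketch self-contained."""
    kernel = np.ones(window) / window
    padded = np.pad(x, window // 2, mode="edge")
    return x - np.convolve(padded, kernel, mode="valid")

def kth_mode(x, k):
    """Stand-in for E_k(.): peel off k-1 modes, then return the next one."""
    r = np.array(x, dtype=float)
    for _ in range(k - 1):
        r = r - first_mode(r)
    return first_mode(r)

def ceemdan_sketch(y, n_modes=3, n_trials=20, sigma=0.2, seed=0):
    rng = np.random.default_rng(seed)
    noises = rng.standard_normal((n_trials, y.size))   # n_i(t), i = 1..N
    # Eqs. (3)-(4): average the first mode of the noisy copies of y.
    imf = np.mean([first_mode(y + sigma * n) for n in noises], axis=0)
    imfs, r = [imf], y - imf                           # Eq. (5)
    for j in range(1, n_modes):
        # Eq. (8): add the j-th mode of each noise realisation to the residual.
        imf = np.mean([first_mode(r + sigma * kth_mode(n, j)) for n in noises], axis=0)
        imfs.append(imf)
        r = r - imf                                    # Eq. (9)
    return np.array(imfs), r

t = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t) + t
imfs, trend = ceemdan_sketch(y)
```

Because each residual is defined by subtracting the averaged mode, the components and the final trend always sum back to the input series, whatever mode extractor is plugged in.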

XGBOOST model

Extreme gradient boosting (XGBOOST), developed by Chen et al.³⁷ in 2016, integrates a linear-model solver with the classification and regression tree (CART) learning algorithm. The model combines weak learners with low individual prediction accuracy into an ensemble that predicts more accurately. During training, XGBOOST optimizes the boosting process: each iteration generates a new decision tree that fits the residuals left by the previous iteration, so prediction accuracy and generalization capacity improve through iterative optimization. While traditional gradient boosting decision tree (GBDT) methods use only first-order derivatives, XGBOOST applies a second-order Taylor expansion to the loss function, controls model complexity by introducing regularization terms to avoid overfitting, and employs a more refined evaluation approach when splitting nodes to better capture the nonlinear relationships between features. In recent years, the XGBOOST model has shown superior performance in financial risk control, medical health, natural language processing, and other fields. The model is based on the following mathematical principles:

The tree ensemble model is defined as follows:

$${\widehat{y}}_{i}={\sum }_{m=1}^{M}{f}_{m}({x}_{i}),{f}_{m}\in F$$

(11)

where ${\widehat{y}}_{i}$ is the predicted value; $M$ is the number of decision trees; $F$ is the space of candidate trees; ${x}_{i}$ is the $i$th input feature vector.

XGBOOST’s loss function is as follows:

$$Q={\sum }_{i=1}^{n}l({y}_{i},{\widehat{y}}_{i})+{\sum }_{m=1}^{M}\theta \left({f}_{m}\right)$$

(12)

The first part of the function is the prediction error between the predicted value and the real training value of the XGBOOST model, and the second part represents the complexity of the tree, which is mainly used to control the regularization of the model complexity:

$$\theta ({f}_{m})=\gamma T+\frac{1}{2}\tau {\Vert \omega \Vert }^{2}$$

(13)

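where $T$ is the number of leaf nodes of the tree, $\omega $ is the vector of leaf weights, and $\gamma $ and $\tau $ are penalty factors.

By adding an incremental function ${f}_{t}\left({x}_{i}\right)$ at the $t$th iteration, the value of the loss function in Eq. (12) is minimized. The objective function at the $t$th iteration is then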

$${Q}_{\left(t\right)}={\sum }_{i=1}^{n}l({y}_{i},{\widehat{y}}_{i})+{\sum }_{m=1}^{M}\theta \left({f}_{m}\right)={\sum }_{i=1}^{n}l\left({y}_{i},{{\widehat{y}}_{i}}^{t-1}+{f}_{t}\left({x}_{i}\right)\right)+\theta \left({f}_{t}\right)$$

(14)

A second-order Taylor expansion of Eq. (14) is used to approximate the objective function, and the set of samples in leaf $j$ of the tree is defined as ${I}_{j}=\left\{i\left|q\left({x}_{i}\right)=j\right.\right\}$, where $q$ maps a sample to its leaf index. ${Q}_{\left(t\right)}$ can then be approximated as

$${Q}_{\left(t\right)}\cong \sum_{j=1}^{T}\left[\left({\sum }_{i\in {I}_{j}}{g}_{i}\right){\omega }_{j}+(1/2)\left({\sum }_{i\in {I}_{j}}{h}_{i}+\tau \right){{\omega }_{j}}^{2}\right]+\gamma T$$

(15)

where ${g}_{i}={\partial }_{{{\widehat{y}}_{i}}^{t-1}}l\left({y}_{i},{{\widehat{y}}_{i}}^{t-1}\right)$ is the first-order derivative of the loss function and ${h}_{i}={{\partial }^{2}}_{{{\widehat{y}}_{i}}^{t-1}}l\left({y}_{i},{{\widehat{y}}_{i}}^{t-1}\right)$ is the second-order derivative of the loss function. Defining ${G}_{j}={\sum }_{i\in {I}_{j}}{g}_{i}$ and ${H}_{j}={\sum }_{i\in {I}_{j}}{h}_{i}$, we have:

$${Q}_{\left(t\right)}\cong {\sum }_{j=1}^{T}\left[{G}_{j}{\omega }_{j}+(1/2)\left({H}_{j}+\tau \right){{\omega }_{j}}^{2}\right]+\gamma T$$

(16)

Setting the partial derivative with respect to ${\omega }_{j}$ to zero yields

$${\omega }_{j}=-{G}_{j}/{(H}_{j}+\tau )$$

(17)

Substituting these optimal weights into the objective function, we get

$${Q}_{\left(t\right)}\cong -(1/2){\sum }_{j=1}^{T}{{G}_{j}}^{2}/({H}_{j}+\tau )+\gamma T$$

(18)
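A short numeric illustration of Eqs. (17) and (18) follows. It assumes a squared-error loss $l=\frac{1}{2}{(y-\widehat{y})}^{2}$, for which ${g}_{i}={\widehat{y}}_{i}-{y}_{i}$ and ${h}_{i}=1$; the loss, the toy targets, and the values of $\tau $ and $\gamma $ are assumptions chosen for this example, and the function names are hypothetical.

```python
def leaf_weight(G, H, tau):
    """Eq. (17): optimal weight of one leaf."""
    return -G / (H + tau)

def structure_score(leaves, tau, gamma):
    """Eq. (18): objective value for a tree given (G_j, H_j) of each leaf."""
    return -0.5 * sum(G * G / (H + tau) for G, H in leaves) + gamma * len(leaves)

y_true = [1.0, 1.0, 5.0, 5.0]
y_prev = [0.0, 0.0, 0.0, 0.0]                     # predictions before this tree
g = [p - y for y, p in zip(y_true, y_prev)]       # first derivatives of 0.5*(y-p)^2
h = [1.0] * len(y_true)                           # second derivatives
tau, gamma = 1.0, 0.5

# One leaf holding all samples.
w_single = leaf_weight(sum(g), sum(h), tau)                    # 12 / 5 = 2.4
q_single = structure_score([(sum(g), sum(h))], tau, gamma)     # -14.4 + 0.5

# Two leaves: samples {0, 1} on the left, {2, 3} on the right.
left = (sum(g[:2]), sum(h[:2]))
right = (sum(g[2:]), sum(h[2:]))
q_split = structure_score([left, right], tau, gamma)
```

Here `q_split` is lower than `q_single`, so the split reduces the regularized objective even after paying the extra $\gamma $ for a second leaf; this comparison is the basis for evaluating candidate splits.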

The performance of the XGBOOST model depends heavily on parameter selection during training. There are 23 hyperparameters in the XGBOOST algorithm, divided into general parameters for macroscopic function control, booster parameters for booster detail control, and learning target parameters for training target control. The GWO–XGBOOST combined model takes the three hyperparameters that have a significant impact on XGBOOST performance (learning_rate, n_estimators, and max_depth) as the position vector of the head wolf $\alpha $ in the GWO algorithm. This position is updated over the iterations of the GWO algorithm, and the global optimal position is output as the final parameter set of the XGBOOST model.

GWO model

The grey wolf optimizer (GWO) is a swarm intelligence optimization algorithm proposed by Mirjalili et al.³⁸ in 2014 and inspired by the predatory behavior of grey wolves. The optimization process of the GWO algorithm is analogous to the hunting behavior of a gray wolf pack. The $\alpha $, $\beta $, and $\delta $ wolves, which have the highest social rank in each generation of the population, act as the leaders of the pack. The pack searches for, encircles, and attacks prey to achieve its optimization goal. GWO has strong global convergence ability, is robust, has few parameters to adjust, and is now used for optimization problems in many fields.

Firstly, the mathematical definition of how a wolf pack searches for and surrounds its prey is as follows:

$$A=\left|B\cdot {F}_{p}\left(t\right)-F\left(t\right)\right|$$

(19)

$$F\left(t+1\right)={F}_{p}\left(t\right)-C\cdot A$$

(20)

$$c=2-2D/E$$

(21)

$$C=2c\cdot {r}_{1}-c$$

(22)

$$B=2\cdot {r}_{2}$$

(23)

where ${F}_{p}\left(t\right)$ is the position of the prey at the $t$th iteration; $F\left(t\right)$ is the position of the gray wolf at the $t$th iteration; $A$ is the distance between the gray wolf and the prey; $F\left(t+1\right)$ is the updated position of the gray wolf; $C$ and $B$ are coefficient vectors; $c$ is the convergence factor, whose value decreases linearly from 2 to 0 with the number of iterations; $D$ is the current iteration number and $E$ is the maximum number of iterations; ${r}_{1}$ and ${r}_{2}$ are random numbers in [0,1].

Secondly, the position of the prey is estimated by constantly updating the positions of the three best wolves $\alpha $, $\beta $, and $\delta $. The mathematical definition of the hunting process of the gray wolf pack is

$${A}_{\alpha }=\left|{B}_{1}\cdot {F}_{\alpha }\left(t\right)-F\left(t\right)\right|$$

(24)

$${A}_{\beta }=\left|{B}_{2}\cdot {F}_{\beta }\left(t\right)-F\left(t\right)\right|$$

(25)

$${A}_{\delta }=\left|{B}_{3}\cdot {F}_{\delta }\left(t\right)-F\left(t\right)\right|$$

(26)

$${F}_{1}\left(t+1\right)={F}_{\alpha }\left(t\right)-{C}_{1}\cdot {A}_{\alpha }$$

(27)

$${F}_{2}\left(t+1\right)={F}_{\beta }\left(t\right)-{C}_{2}\cdot {A}_{\beta }$$

(28)

$${F}_{3}\left(t+1\right)={F}_{\delta }\left(t\right)-{C}_{3}\cdot {A}_{\delta }$$

(29)

$$F\left(t+1\right)=({F}_{1}\left(t+1\right)+{F}_{2}\left(t+1\right)+{F}_{3}\left(t+1\right))/3$$

(30)

where ${F}_{\alpha }\left(t\right)$, ${F}_{\beta }\left(t\right)$ and ${F}_{\delta }\left(t\right)$ are the positions of the $\alpha $, $\beta $ and $\delta $ wolves at generation $t$; $F\left(t\right)$ is the position of an individual gray wolf at generation $t$; ${C}_{1}$ and ${B}_{1}$, ${C}_{2}$ and ${B}_{2}$, ${C}_{3}$ and ${B}_{3}$ are the coefficient vectors associated with the $\alpha $, $\beta $ and $\delta $ wolves, respectively; ${F}_{1}\left(t+1\right)$, ${F}_{2}\left(t+1\right)$ and ${F}_{3}\left(t+1\right)$ are the candidate positions of the individual wolf guided by the $\alpha $, $\beta $ and $\delta $ wolves, respectively; $F\left(t+1\right)$ is the position of the wolf in the next generation.
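Eqs. (21) to (30) can be sketched in NumPy as follows, using the same symbols ($A$, $B$, $C$, $c$, $D$) as the text. Several details are assumptions made for this illustration: the function name `gwo_minimize`, clipping positions to the search bounds, keeping the three best positions seen so far as the leaders, and the toy quadratic fitness. In the GWO–XGBOOST setting described above, the position vector would instead hold (learning_rate, n_estimators, max_depth) and the fitness would be a prediction error of the trained XGBOOST model.

```python
import numpy as np

def gwo_minimize(fitness, lower, upper, n_wolves=30, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    F = rng.uniform(lower, upper, size=(n_wolves, lower.size))  # wolf positions F(t)
    leaders = F[:3].copy()                 # placeholders until the first scoring
    leader_scores = np.full(3, np.inf)
    for D in range(n_iter):
        # Keep the three best positions seen so far as alpha, beta, delta.
        pool = np.vstack([leaders, F])
        pool_scores = np.concatenate([leader_scores, [fitness(w) for w in F]])
        top = np.argsort(pool_scores)[:3]
        leaders, leader_scores = pool[top], pool_scores[top]
        c = 2.0 - 2.0 * D / n_iter                     # Eq. (21)
        candidates = []
        for F_lead in leaders:
            C = 2.0 * c * rng.random(F.shape) - c      # Eq. (22)
            B = 2.0 * rng.random(F.shape)              # Eq. (23)
            A = np.abs(B * F_lead - F)                 # Eqs. (24)-(26)
            candidates.append(F_lead - C * A)          # Eqs. (27)-(29)
        F = np.clip(np.mean(candidates, axis=0), lower, upper)  # Eq. (30)
    # Positions produced by the last update are not scored in this sketch.
    return leaders[0], float(leader_scores[0])

# Toy fitness with a known minimum, standing in for a model's validation error.
target = np.array([1.0, -2.0, 0.5])
best_pos, best_val = gwo_minimize(
    lambda w: float(np.sum((w - target) ** 2)),
    lower=[-5.0, -5.0, -5.0],
    upper=[5.0, 5.0, 5.0],
)
```

Integer-valued hyperparameters such as n_estimators and max_depth would need rounding inside the fitness function; that step is left out here.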

GWO–XGBOOST–CEEMDAN model

To improve carbon price prediction, we combine the CEEMDAN, XGBOOST, and GWO models to build the GWO–XGBOOST–CEEMDAN model. The general idea is as follows. First, the GWO–XGBOOST model is established, with the GWO algorithm used to optimize the parameters of the XGBOOST model. Second, the CEEMDAN method is applied to decompose the residual series of the GWO–XGBOOST model, which gives the GWO–XGBOOST–CEEMDAN hybrid model. Finally, the carbon price predictions and the sum of the component-wise residual predictions are added together to obtain the final prediction of the model. Figure 1 illustrates the specific process.

Data description

Data source

Accurate carbon price forecasts support investment decisions and help maintain carbon market stability. There are large differences between China’s carbon trading pilots. The Hubei carbon trading market is the only carbon trading market in central China¹². The Guangdong carbon market was officially launched in 2013, setting five firsts in China’s carbon market trading³⁹. Fujian is the first ecological civilization demonstration zone in China. Its carbon market is aligned with the overall design of the national carbon market, and it is the first pilot to adopt the carbon verification standards and guidelines issued by the state. In particular, its direct data reporting system is fully consistent with the standards of the national system under construction, giving it a high starting point⁴⁰,⁴¹. For these reasons, this paper chooses the Guangdong, Hubei, and Fujian carbon trading markets as research objects. Data on the three carbon markets are collected from the Choice financial terminal and the Wind database. The selected carbon price data account for public holidays, differences in trading hours, and missing values in domestic and international variables. Among these data, the Bohai Sea Power Coal Price Index and Natural Gas Market Quotation are weekly and ten-day data, and Eviews software is used to convert them into daily data. The hybrid model is trained on 80% of the data and tested on the remaining 20%. The carbon price information for the three trading markets is shown in Table 1, and Table 2 presents descriptive statistics for each indicator.

Table 1 Carbon trading market data information.
