Introduction

COVID-19 emerged at the end of 2019 and was declared a pandemic in March 2020 (Feroze1). Through the resulting lockdowns, it has had a major impact on human life and the economy. Pakistan is the 12th most affected country in the world (Khan et al.2).

Mathematical models for analysing infectious disease transmission are now ubiquitous. Such models play a significant role in quantifying possible strategies for infectious disease control. Many compartmental models are available for infectious diseases, ranging from the classical SIR model to more complex proposals (Ndaïrou et al.3).

Several classical probability distributions are available for modelling count data, such as the binomial, Poisson, geometric, and negative binomial distributions, but these models often fit over-dispersed data sets poorly. One way to handle such data is to discretize a continuous model that captures the behaviour of interest. Discretization has attracted much attention in the last few decades because of its applicability and its good fit in count data analysis.

In the past, several discrete distributions have been introduced and studied, such as the discrete Weibull distribution, the discrete beta exponential distribution by Nekoukhou et al.4, the two-parameter discrete Lindley distribution by Hussain et al.5, the discrete Marshall–Olkin inverse Topp–Leone distribution with application to COVID-19 data by Almetwally et al.6, the discrete weighted exponential distribution by Rasekhi et al.7, the exponentiated discrete Lindley distribution by El-Morshedy et al.6, the discrete Burr–Hatke distribution by El-Morshedy et al.8, the discrete Marshall–Olkin Weibull distribution by Opone et al.9 (see also Almetwally et al.10), the discrete Marshall–Olkin alpha power inverse Lomax distribution by Almetwally et al.11, the discrete inverted Topp–Leone distribution by Eldeeb et al.12, the discrete Ramos–Louzada distribution by Eldeeb et al.13, the discrete type-II half logistic exponential distribution by Ahsan-ul-Haq et al.14, the discrete power-Ailamujia distribution by Alghamdi et al.15, the Poisson XLindley distribution by Ahsan-ul-Haq et al.16, the Poisson moment exponential distribution by Ahsan-ul-Haq17, and the discrete moment exponential distribution by Afify et al.18.

The discrete extended odd Weibull exponential distribution, with application to COVID-19 mortality numbers in the Kingdom of Saudi Arabia and Latvia, was introduced by Nagy et al.19. El-Morshedy et al.20 obtained the pmf of a new model as a mixture representation of a geometric model.

All these discrete probability models are introduced using the survival discretization approach. Let a random variable X be associated with a continuous probability distribution having survival function \(S\left(x\right)\). The probability mass function (pmf) of a discrete random variable based on discretization is

$$P\left(X = x\right)= S\left(x\right)- S\left(x+1\right), \qquad x=0,1,2,3,\dots$$
(1)
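As a minimal sketch of Eq. (1), the snippet below discretizes an arbitrary continuous survival function. The helper name `discretize` and the exponential baseline with `rate = 0.7` are illustrative assumptions, not part of the study; the exponential case is chosen only because its discretized form is the familiar geometric distribution, which gives an easy check.

```python
import math

def discretize(survival):
    """Return the pmf P(X = x) = S(x) - S(x + 1) built from a continuous survival function."""
    def pmf(x):
        return survival(x) - survival(x + 1)
    return pmf

# Illustrative baseline (an assumption): exponential survival S(x) = exp(-rate * x).
rate = 0.7
pmf = discretize(lambda x: math.exp(-rate * x))

# The discretized exponential is geometric with success probability p = 1 - exp(-rate).
p = 1.0 - math.exp(-rate)
for x in range(5):
    print(x, pmf(x), p * (1.0 - p) ** x)
```

The two printed columns should agree up to rounding, since \(e^{-rx}-e^{-r(x+1)}=p{(1-p)}^{x}\).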

The primary purpose of this study is to introduce a new flexible probability distribution for modelling over-dispersed data sets, the discrete Marshall–Olkin length biased exponential (DMOLBE) distribution. Its mathematical properties, including simple closed-form expressions for the pmf and cdf, moments, and other characteristics, are derived. The model parameters are estimated by the maximum likelihood approach. To illustrate it as an alternative for over-dispersed data, the DMOLBE distribution is applied to data sets on the number of deaths due to COVID-19. The main features of the DMOLBE model are:

  • The distribution provides several hazard rate shapes, such as decreasing, increasing, or increasing-constant, which sets it apart from many other one- or two-parameter discrete distributions. Because of these hazard rates, the proposed model can be used to model a variety of data sets.

  • It provides a variety of pmf shapes suitable for modelling symmetric, positively skewed, or negatively skewed data that may not be modelled successfully by competing models.

  • A number of statistical and reliability characteristics are derived, such as moments, probability functions, reliability indices, hazard functions, the reverse hazard rate, and the second rate of failure.

  • In comparison with other discrete distributions in the literature, the results from two practical applications show that the DMOLBE distribution fits the supplied data sets satisfactorily.

  • Maximum likelihood and Bayesian estimation methods are considered for estimating the model parameters from observed data.

  • The performance of the resulting estimators is assessed using extensive Monte Carlo simulations and accuracy measures, namely mean squared errors and absolute biases. The results suggest that the parameter estimation approaches are adequate and efficient.

The study is organised as follows. “Methodology” presents the derivation and mathematical characteristics of the discrete Marshall–Olkin length biased exponential distribution. “Parameter estimation” presents maximum likelihood estimation and “Bayesian estimation” presents the Bayesian approach. “Results and discussion” reports an extensive simulation study and the results for all fitted models. Finally, “Conclusion” brings the research to a close.

Methodology

In this section, we introduce a new discrete distribution, derive its statistical properties, and estimate the model parameters using the maximum likelihood approach.

The DMOLBE distribution and its properties

Let X be a random variable following the Marshall–Olkin length biased exponential (MOLBE) distribution presented by Ahsan-ul-Haq et al.21. The probability density function of the MOLBE distribution is:

$$f\,\left(x\right)=\frac{\gamma \frac{x}{{\beta }^{2}}{e}^{-\left(\frac{x}{\beta }\right)}}{{\left[1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right]}^{2}}, x>0,\gamma >0,\beta >0.$$
(2)

The associated survival function is

$$S\,\left(x\right)=\frac{\gamma \left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}.$$
(3)

Using Eqs. (1) and (3), the pmf of the DMOLBE distribution is given by

$$P\left(x\right)=\frac{\gamma \left[\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}- \left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right] }{\left\{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right\} \left\{1-\left(1-\gamma \right)\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}}.$$
(4)

The cdf of the DMOLBE distribution is as follows

$$F\left(x\right)=\frac{ \left\{1-\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)} \right\} }{ \left\{1-\left(1-\gamma \right)\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}}, \qquad x=0,1,2,\dots ,\ \gamma >0,\ \beta >0,$$
(5)

where \(\gamma\) is the shape parameter and \(\beta\) is the scale parameter.
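The pmf in Eq. (4) and the cdf in Eq. (5) can be checked against each other numerically. The following is a small sketch under assumed parameter values (\(\gamma =0.5\), \(\beta =2\) are hypothetical, as are the function names); it confirms that partial sums of the pmf reproduce the cdf.

```python
import math

def lbe_sf(x, beta):
    """Length biased exponential survival function (1 + x/beta) * exp(-x/beta)."""
    return (1.0 + x / beta) * math.exp(-x / beta)

def dmolbe_pmf(x, gamma, beta):
    """pmf of Eq. (4) for x = 0, 1, 2, ..."""
    g0, g1 = lbe_sf(x, beta), lbe_sf(x + 1, beta)
    return gamma * (g0 - g1) / ((1.0 - (1.0 - gamma) * g0) * (1.0 - (1.0 - gamma) * g1))

def dmolbe_cdf(x, gamma, beta):
    """cdf of Eq. (5): P(X <= x)."""
    g1 = lbe_sf(x + 1, beta)
    return (1.0 - g1) / (1.0 - (1.0 - gamma) * g1)

gamma, beta = 0.5, 2.0   # hypothetical parameter values
for x in range(6):
    partial = sum(dmolbe_pmf(k, gamma, beta) for k in range(x + 1))
    print(x, round(dmolbe_pmf(x, gamma, beta), 6), round(partial, 6),
          round(dmolbe_cdf(x, gamma, beta), 6))
```

Setting \(\gamma =1\) removes the Marshall–Olkin tilt, so the pmf reduces to the discretized length biased exponential baseline.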

Figure 1 depicts the behaviour of the pmf of the DMOLBE distribution for different parameter values. The pmf can be decreasing, positively skewed, or symmetric, which demonstrates the versatility of the proposed distribution in dealing with data of varying behaviour.

Figure 1. The pmf plots of the DMOLBE distribution.

Survival and hazard function

The survival function of DMOLBED is as follows

$$S\left(x\right)=\frac{ \gamma \left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)} }{ \left\{1-\left(1-\gamma \right)\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}}.$$
(6)

The hazard function (hrf) of DMOLBE is given as follows

$$h\left(x\right)=\frac{\left[\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}- \left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right] }{\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)} \left\{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right\} }.$$
(7)

Figure 2 shows the behaviour of the hazard function for different parameter values. The hazard rate can be either increasing or decreasing, which illustrates the flexibility of the model.
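The shape of the hazard rate in Eq. (7) can be explored numerically. The sketch below evaluates \(h(x)\) on a short grid and labels the sequence as increasing, decreasing, or non-monotone; the parameter settings and the helper `shape` are illustrative assumptions, not the values used for Figure 2.

```python
import math

def lbe_sf(x, beta):
    """Length biased exponential survival function."""
    return (1.0 + x / beta) * math.exp(-x / beta)

def dmolbe_hazard(x, gamma, beta):
    """Discrete hazard of Eq. (7): h(x) = P(X = x) / P(X > x)."""
    g0, g1 = lbe_sf(x, beta), lbe_sf(x + 1, beta)
    return (g0 - g1) / (g1 * (1.0 - (1.0 - gamma) * g0))

def shape(values, tol=1e-12):
    """Classify a sequence as increasing, decreasing, or non-monotone."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d > -tol for d in diffs):
        return "increasing"
    if all(d < tol for d in diffs):
        return "decreasing"
    return "non-monotone"

for gamma, beta in [(0.1, 1.0), (1.0, 1.0), (5.0, 1.0)]:   # hypothetical settings
    h = [dmolbe_hazard(x, gamma, beta) for x in range(15)]
    print(gamma, beta, shape(h))
```

Because Eq. (7) divides by \(P(X>x)\) rather than \(P(X\ge x)\), the values of \(h(x)\) are not restricted to the interval [0, 1].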

Figure 2. The discrete hrf plots of the DMOLBE distribution.

The second rate of failure

The second rate of failure (SRF) of the DMOLBE distribution is defined in terms of the survival function \(S\left(x\right)\) of Eq. (6) as

$$SRF=\mathrm{log}\left[\frac{S\left(x\right)}{S\left(x+1\right)}\right],$$
$$SRF=\mathrm{log}\left[\frac{\left(\beta +x+1\right){e}^{\left(\frac{1}{\beta }\right)} \left\{1-\left(1-\gamma \right)\left(1+\frac{x+2}{\beta }\right){e}^{-\left(\frac{x+2}{\beta }\right)}\right\} }{ (\beta +x+2) \left\{1-\left(1-\gamma \right)\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}}\right].$$
(8)

Reverse hazard rate

The reverse hazard rate of the DMOLBE distribution is defined as:

$${r}^{*}\left(x\right) = \frac{P\left(x\right)}{F\left(x\right)}=\frac{\gamma \left[\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}- \left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right] }{\left\{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right\} \left\{1-\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}}.$$
(9)

Recurrence formula

$$\frac{P(x+1)}{P(x)}= \frac{\left[\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}- \left(1+\frac{x+2}{\beta }\right){e}^{-\left(\frac{x+2}{\beta }\right)}\right] \left\{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right\}}{ \left\{1-\left(1-\gamma \right)\left(1+\frac{x+2}{\beta }\right){e}^{-\left(\frac{x+2}{\beta }\right)}\right\}\left[\left\{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right\}- \left\{\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}\right]}.$$
(10)
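One use of the recurrence in Eq. (10) is to build a table of probabilities sequentially, starting from \(P\left(0\right)=F\left(0\right)\). The sketch below does this for assumed parameter values (\(\gamma =0.5\), \(\beta =2\)); the function names are illustrative.

```python
import math

def lbe_sf(x, beta):
    """Length biased exponential survival function."""
    return (1.0 + x / beta) * math.exp(-x / beta)

def ratio(x, gamma, beta):
    """P(x + 1) / P(x) from Eq. (10)."""
    g0, g1, g2 = (lbe_sf(x + k, beta) for k in range(3))
    num = (g1 - g2) * (1.0 - (1.0 - gamma) * g0)
    den = (1.0 - (1.0 - gamma) * g2) * (g0 - g1)
    return num / den

def pmf_table(n_terms, gamma, beta):
    """Probabilities P(0), ..., P(n_terms - 1) generated by the recurrence."""
    g1 = lbe_sf(1, beta)
    p = (1.0 - g1) / (1.0 - (1.0 - gamma) * g1)   # P(0) = F(0) from Eq. (5)
    table = [p]
    for x in range(n_terms - 1):
        p *= ratio(x, gamma, beta)
        table.append(p)
    return table

probs = pmf_table(50, gamma=0.5, beta=2.0)   # hypothetical parameter values
print(probs[:5], sum(probs))
```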

Probability generating function and moments

Let X be a discrete random variable following the DMOLBE distribution. Its probability generating function is given as follows:

$${G}_{x}\left(z\right)=\sum_{x=0}^{\infty }{z}^{x}P\left(x\right)=1+\gamma \left(z-1\right)\sum_{x=1}^{\infty }{z}^{x-1} \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}.$$
(11)

Differentiating \({G}_{x}\left(z\right)\) with respect to \(z\) and setting \(z=1\), we obtain the factorial moments as

$${G}_{x}^{\prime}\left(1\right) = \gamma \sum_{x=1}^{\infty }\frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$
$${G}_{x}^{{\prime}{\prime}}\left(1\right) =2\gamma \sum_{x=1}^{\infty }(x-1) \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$
$${G}_{x}^{{\prime}{\prime}{\prime}}\left(1\right) = 3 \gamma \sum_{x=1}^{\infty }\left(x-1\right)\left(x-2\right) \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$
$${G}_{x}^{{\prime}{\prime}{\prime}{\prime}}\left(1\right) = 4 \gamma \sum_{x=1}^{\infty }\left(x-1\right)\left(x-2\right)(x-3) \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$

The factorial moments can be used to compute moments about the origin.

$$\mu ={G}_{x}^{\prime}\left(1\right)=\gamma \sum_{x=1}^{\infty }\frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$
$${\mu }_{2}^{\prime}= {G}_{x}^{{\prime}{\prime}}\left(1\right) + {G}_{x}^{\prime}\left(1\right),$$
$${\mu }_{2}^{\prime}=\gamma \sum_{x=1}^{\infty }(2x-1) \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$
$${\mu }_{3}^{\prime} = {G}_{x}^{{\prime}{\prime}{\prime}}\left(1\right)+ 3{G}_{x}^{{\prime}{\prime}}\left(1\right) + {G}_{x}^{\prime}\left(1\right),$$
$${\mu }_{3}^{\prime}=\gamma \sum_{x=1}^{\infty }\left(3{x}^{2}-3x+1\right) \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$
$${\mu }_{4}^{\prime}= {G}_{x}^{{\prime}{\prime}{\prime}{\prime}}\left(1\right)+ 6{G}_{x}^{{\prime}{\prime}{\prime}}\left(1\right) + 7{G}_{x}^{{\prime}{\prime}}\left(1\right) + {G}_{x}^{\prime}\left(1\right),$$
$${\mu }_{4}^{\prime}=\gamma \sum_{x=1}^{\infty }\left(4{x}^{3}-6{x}^{2}+4x-1\right) \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}},$$

Now variance is

$${\sigma }^{2}={\mu }_{2}^{\prime}-{\left({\mu }_{1}^{\prime}\right)}^{2},$$
$${\sigma }^{2}=\gamma \sum_{x=1}^{\infty }(2x-1) \frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}- {\left(\gamma \sum_{x=1}^{\infty }\frac{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}}\right)}^{2},$$

and the coefficients of skewness (CS) and kurtosis (CK) may be computed as follows

$$CS=\frac{{\mu }_{3}^{\prime}-3{\mu }_{2}^{\prime}\mu +2{\mu }^{3}}{{\left({\sigma }^{2}\right)}^{3/2}},$$
$$CK=\frac{{\mu }_{4}^{\prime}-4{\mu }_{3}^{\prime}\mu +6{\mu }_{2}^{\prime}{\mu }^{2}-3{\mu }^{4}}{{\left({\sigma }^{2}\right)}^{2}}.$$
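The moment series above have no closed form, but they converge quickly and can be evaluated by truncation. The following sketch computes the mean, variance, CS, and CK in this way; the truncation point `x_max` and the parameter values are assumptions chosen for illustration, and `x_max` would need to grow for large \(\beta\).

```python
import math

def mo_sf(x, gamma, beta):
    """Continuous MOLBE survival function S(x) of Eq. (3)."""
    g = (1.0 + x / beta) * math.exp(-x / beta)
    return gamma * g / (1.0 - (1.0 - gamma) * g)

def raw_moments(gamma, beta, x_max=2000):
    """First four raw moments from the series above, truncated at x_max (an assumption)."""
    weights = (lambda x: 1.0,
               lambda x: 2 * x - 1.0,
               lambda x: 3 * x * x - 3 * x + 1.0,
               lambda x: 4 * x ** 3 - 6 * x * x + 4 * x - 1.0)
    return [sum(w(x) * mo_sf(x, gamma, beta) for x in range(1, x_max + 1))
            for w in weights]

def summary(gamma, beta):
    """Mean, variance, coefficient of skewness and coefficient of kurtosis."""
    m1, m2, m3, m4 = raw_moments(gamma, beta)
    var = m2 - m1 ** 2
    cs = (m3 - 3 * m2 * m1 + 2 * m1 ** 3) / var ** 1.5
    ck = (m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4) / var ** 2
    return {"mean": m1, "variance": var, "skewness": cs, "kurtosis": ck}

print(summary(gamma=0.5, beta=2.0))   # hypothetical parameter values
```

The weights \(2x-1\), \(3{x}^{2}-3x+1\), and \(4{x}^{3}-6{x}^{2}+4x-1\) are the differences \({x}^{k}-{(x-1)}^{k}\), so each series agrees with the direct sum \(\sum {x}^{k}P(x)\).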

Table 1 and Figs. 3 and 4 show these moment measures for different parameter values, and Fig. 5 shows the dispersion index.

Table 1 Different measures of moment with different values of parameters.
Figure 3. Plots of mean, variance, skewness and kurtosis of the DMOLBE distribution \(\left(\gamma =0.5\right)\).

Figure 4. Plots of mean, variance, skewness and kurtosis of the DMOLBE distribution \(\left(\beta =0.5\right)\).

Figure 5. Plots of the dispersion index of the DMOLBE distribution.

The corresponding Dispersion Index (DI) is defined as

$$DI=\frac{Variance\,of\,DMOLBED}{Mean\,of\,DMOLBED}$$

The DI indicates whether a distribution is suitable for modelling over- or under-dispersed data sets: \(DI>1\) indicates over-dispersion and \(DI<1\) indicates under-dispersion. The DMOLBE distribution shows over-dispersion when \(\gamma =0.5\) for different values of \(\beta\). Conversely, it shows under-dispersion when \(\beta =0.5\) for different values of \(\gamma\).
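The dispersion index can be evaluated directly from the pmf. The sketch below is a minimal illustration; the parameter pairs and the truncation point `x_max` are assumptions and are not the grid used for Fig. 5.

```python
import math

def dmolbe_pmf(x, gamma, beta):
    """pmf of Eq. (4)."""
    g0 = (1.0 + x / beta) * math.exp(-x / beta)
    g1 = (1.0 + (x + 1) / beta) * math.exp(-(x + 1) / beta)
    return gamma * (g0 - g1) / ((1.0 - (1.0 - gamma) * g0) * (1.0 - (1.0 - gamma) * g1))

def dispersion_index(gamma, beta, x_max=2000):
    """Variance divided by mean, with sums truncated at x_max (an assumption)."""
    probs = [dmolbe_pmf(x, gamma, beta) for x in range(x_max + 1)]
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    return (second - mean ** 2) / mean

for gamma, beta in [(0.5, 1.0), (0.5, 5.0), (2.0, 0.5)]:   # hypothetical settings
    di = dispersion_index(gamma, beta)
    label = "over-dispersed" if di > 1 else "under-dispersed" if di < 1 else "equi-dispersed"
    print(gamma, beta, round(di, 4), label)
```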

Parameter estimation

Let \(x=({x}_{1}, {x}_{2}, {x}_{3}, \dots , {x}_{n} )\) be a random sample of size n from the DMOLBE distribution with probability mass function

$$P\left(x\right)=\frac{\gamma \left[\left\{\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right\}- \left\{\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}\right] }{\left\{1-\left(1-\gamma \right)\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right\} \left\{1-\left(1-\gamma \right)\left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}\right\}}.$$

Then the log-likelihood function is given by:

$$\mathrm{log}L=n\mathrm{log}\left(\upgamma \right)+\sum_{i=1}^{n}\mathrm{log}\left[\left\{\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}\right\}- \left\{\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}\right\}\right]-\sum_{i=1}^{n}\mathrm{log}\left[1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}\right]-\sum_{i=1}^{n}\mathrm{log}\left[1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}\right],$$
(12)

Differentiating Eq. (12) partially with respect to \(\gamma\) and \(\beta\), and using \(\frac{\partial }{\partial \beta }\left[\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}\right]=\frac{{x}^{2}}{{\beta }^{3}}{e}^{-\left(\frac{x}{\beta }\right)}\), gives

$$\frac{\partial \mathrm{log}L}{\partial \gamma }=\frac{n}{\gamma }-\sum_{i=1}^{n}\frac{\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}}-\sum_{i=1}^{n}\frac{\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}},$$
(13)
$$\frac{\partial \mathrm{log}L}{\partial \beta }=\sum_{i=1}^{n}\frac{ \frac{{{x}_{i}}^{2}}{{\beta }^{3}}{e}^{-\left(\frac{{x}_{i}}{\beta }\right)}- \frac{{\left({x}_{i}+1\right)}^{2}}{{\beta }^{3}} {e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}}{\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}- \left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)} }+\sum_{i=1}^{n}\frac{\left(1-\gamma \right) \frac{{{x}_{i}}^{2}}{{\beta }^{3}}{e}^{-\left(\frac{{x}_{i}}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}}+\sum_{i=1}^{n}\frac{\left(1-\gamma \right)\frac{{\left({x}_{i}+1\right)}^{2}}{{\beta }^{3}}{e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}}{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}}.$$
(14)
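A quick way to validate score equations of this kind is to compare analytic derivatives of Eq. (12) with central finite differences. The sketch below does so on made-up counts; it relies on the identity \(\frac{\partial }{\partial \beta }\left[\left(1+\frac{x}{\beta }\right){e}^{-x/\beta }\right]=\frac{{x}^{2}}{{\beta }^{3}}{e}^{-x/\beta }\), and the data, parameter values, and function names are all illustrative assumptions.

```python
import math

def gbar(x, beta):
    """Length biased exponential survival function."""
    return (1.0 + x / beta) * math.exp(-x / beta)

def dgbar(x, beta):
    """Derivative of gbar with respect to beta: x^2 / beta^3 * exp(-x / beta)."""
    return x * x / beta ** 3 * math.exp(-x / beta)

def loglik(gamma, beta, data):
    """Log-likelihood of Eq. (12)."""
    total = len(data) * math.log(gamma)
    for x in data:
        g0, g1 = gbar(x, beta), gbar(x + 1, beta)
        total += math.log(g0 - g1)
        total -= math.log(1.0 - (1.0 - gamma) * g0) + math.log(1.0 - (1.0 - gamma) * g1)
    return total

def score(gamma, beta, data):
    """Analytic partial derivatives of the log-likelihood with respect to gamma and beta."""
    d_gamma, d_beta = len(data) / gamma, 0.0
    for x in data:
        g0, g1 = gbar(x, beta), gbar(x + 1, beta)
        d0, d1 = dgbar(x, beta), dgbar(x + 1, beta)
        den0, den1 = 1.0 - (1.0 - gamma) * g0, 1.0 - (1.0 - gamma) * g1
        d_gamma -= g0 / den0 + g1 / den1
        d_beta += (d0 - d1) / (g0 - g1) + (1.0 - gamma) * (d0 / den0 + d1 / den1)
    return d_gamma, d_beta

data = [0, 1, 1, 2, 3, 5, 8]          # made-up counts, for illustration only
gamma, beta, eps = 0.8, 1.5, 1e-6
numeric = ((loglik(gamma + eps, beta, data) - loglik(gamma - eps, beta, data)) / (2 * eps),
           (loglik(gamma, beta + eps, data) - loglik(gamma, beta - eps, data)) / (2 * eps))
print(score(gamma, beta, data), numeric)
```

The analytic and numerical pairs should agree to several decimal places.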

Since a closed-form solution of the nonlinear system obtained by setting Eqs. (13) and (14) to zero is not available, the system is solved numerically for \(\gamma\) and \(\beta\) using an iterative method such as Newton–Raphson, as implemented in the ‘maxLik’ package in R.
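For readers without R at hand, the numerical maximisation can be sketched with the Python standard library alone. The code below is not the Newton–Raphson routine of ‘maxLik’; it swaps in a simple derivative-free compass search on \((\mathrm{log}\,\gamma , \mathrm{log}\,\beta )\), which is slower but needs no derivatives. The starting values, step sizes, and function names are assumptions; the data are the China counts listed in the Applications section.

```python
import math

def loglik(gamma, beta, data):
    """DMOLBE log-likelihood of Eq. (12); returns -inf where it cannot be evaluated."""
    total = len(data) * math.log(gamma)
    for x in data:
        g0 = (1.0 + x / beta) * math.exp(-x / beta)
        g1 = (1.0 + (x + 1) / beta) * math.exp(-(x + 1) / beta)
        den0, den1 = 1.0 - (1.0 - gamma) * g0, 1.0 - (1.0 - gamma) * g1
        if g0 <= g1 or den0 <= 0.0 or den1 <= 0.0:   # numerical underflow or rounding
            return -math.inf
        total += math.log(g0 - g1) - math.log(den0) - math.log(den1)
    return total

def fit_mle(data, start=(1.0, 1.0), step=0.5, tol=1e-6, max_iter=5000):
    """Compass search on (log gamma, log beta); a stand-in for Newton-Raphson."""
    u, v = math.log(start[0]), math.log(start[1])
    best = loglik(math.exp(u), math.exp(v), data)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for du, dv in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            if abs(u + du) > 50 or abs(v + dv) > 50:   # keep exp() in a safe range
                continue
            cand = loglik(math.exp(u + du), math.exp(v + dv), data)
            if cand > best:
                u, v, best, improved = u + du, v + dv, cand, True
        if not improved:
            step /= 2.0            # shrink the search pattern when no neighbour is better
    return math.exp(u), math.exp(v), best

china_deaths = [8, 16, 15, 24, 26, 26, 38, 43, 46, 45, 57, 64, 65, 73, 73, 86, 89, 97, 108,
                97, 146, 121, 143, 142, 105, 98, 136, 114, 118, 109, 97, 150, 71, 52, 29, 44,
                47, 35, 42, 31, 38, 31, 30, 28, 27, 22, 17, 22, 11, 7, 13, 10, 14, 13, 11, 8,
                3, 7, 6, 9, 7, 4, 6, 5, 3, 5]
gamma_hat, beta_hat, ll_hat = fit_mle(china_deaths)
print(gamma_hat, beta_hat, ll_hat)
```

Working on the log scale keeps both parameters positive without explicit constraints. The estimates from this sketch are not guaranteed to match those reported in the tables, since the optimiser and starting values differ.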

Bayesian estimation

In the Bayesian approach, the parameters are treated as random quantities whose uncertainty is expressed by a joint prior distribution specified before the failure data are observed. The ability of the Bayesian technique to incorporate prior knowledge makes it particularly useful in reliability studies, where the scarcity of data is one of the major problems. The parameters \(\gamma\) and \(\beta\) of the DMOLBE distribution are positive, so independent gamma priors are assumed for them. The joint prior density of \(\gamma\) and \(\beta\) can be expressed as follows:

$$\pi \left(\gamma ,\beta \right)\propto {\gamma }^{{a}_{1}-1}{e}^{-\gamma {b}_{1}}{\beta }^{{a}_{2}-1}{e}^{-\beta {b}_{2}}.$$

The hyper-parameters of the informative priors were elicited following Dey et al.23: the ML estimates of \(\gamma\) and \(\beta\) and their variances, obtained from the inverse of the Fisher information matrix, were equated with the means and variances of the gamma priors. The joint posterior density function of \(\gamma\) and \(\beta\) is derived from the likelihood function of the DMOLBE distribution and the joint prior density:

$$\pi \left(\gamma ,\beta |x\right)\propto {\gamma }^{n+{a}_{1}-1}{e}^{-\gamma {b}_{1}}{\beta }^{{a}_{2}-1}{e}^{-\beta {b}_{2}}\prod_{i=1}^{n}\frac{\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}- \left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)} }{\left\{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}\right\} \left\{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}\right\}}.$$
(15)

Most Bayesian inference procedures have been developed using symmetric loss functions, of which the squared-error loss function is a popular choice. The Bayes estimators of \(\gamma\) and \(\beta\), say \(\widetilde{\gamma }\) and \(\widetilde{\beta }\), based on the squared-error loss function are the posterior means, given by

$$\widetilde{\gamma }={\int }_{0}^{\infty }\gamma {\int }_{0}^{\infty }\pi \left(\gamma ,\beta |x\right) d\beta\,d\gamma , \qquad \widetilde{\beta }={\int }_{0}^{\infty }\beta {\int }_{0}^{\infty }\pi \left(\gamma ,\beta |x\right) d\gamma\,d\beta .$$

These integrals are not available in closed form, so the MCMC technique is employed to approximate them; see Almetwally et al.22.

Two of the most prevalent MCMC methodologies are the Metropolis–Hastings (MH) and Gibbs sampling methods. We employ MH steps within the Gibbs sampler, based on the full conditional posterior densities:

$$\pi \left(\gamma |\beta ,x\right)\propto {\gamma }^{n+{a}_{1}-1}{e}^{-\gamma {b}_{1}}\prod_{i=1}^{n}\frac{\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}- \left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)} }{\left\{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}\right\} \left\{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}\right\}},$$
(16)

and

$$\pi \left(\beta |\gamma ,x\right)\propto {\beta }^{{a}_{2}-1}{e}^{-\beta {b}_{2}}\prod_{i=1}^{n}\frac{\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}- \left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)} }{\left\{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}}{\beta }\right){e}^{-\left(\frac{{x}_{i}}{\beta }\right)}\right\} \left\{1-\left(1-\gamma \right)\left(1+\frac{{x}_{i}+1}{\beta }\right){e}^{-\left(\frac{{x}_{i}+1}{\beta }\right)}\right\}}.$$
(17)
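A Metropolis-within-Gibbs sampler of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions, not the authors' implementation: it uses a log-scale random-walk proposal, hypothetical hyper-parameters \(({a}_{1},{b}_{1},{a}_{2},{b}_{2})=(2,1,2,1)\), made-up counts, and arbitrary chain settings. The posterior means of the retained draws approximate the Bayes estimates under squared-error loss.

```python
import math
import random

def log_post(gamma, beta, data, a1, b1, a2, b2):
    """Log joint posterior up to a constant: gamma priors times the DMOLBE likelihood."""
    if gamma <= 0.0 or beta <= 0.0:
        return -math.inf
    lp = (len(data) + a1 - 1.0) * math.log(gamma) - b1 * gamma
    lp += (a2 - 1.0) * math.log(beta) - b2 * beta
    for x in data:
        g0 = (1.0 + x / beta) * math.exp(-x / beta)
        g1 = (1.0 + (x + 1) / beta) * math.exp(-(x + 1) / beta)
        den0, den1 = 1.0 - (1.0 - gamma) * g0, 1.0 - (1.0 - gamma) * g1
        if g0 <= g1 or den0 <= 0.0 or den1 <= 0.0:   # numerical underflow or rounding
            return -math.inf
        lp += math.log(g0 - g1) - math.log(den0) - math.log(den1)
    return lp

def mh_within_gibbs(data, hyper, n_iter=4000, burn=1000, scale=0.3, seed=1):
    """Update gamma | beta and beta | gamma in turn with one MH step each."""
    rng = random.Random(seed)
    state = [1.0, 1.0]                    # starting values for (gamma, beta), an assumption
    current = log_post(state[0], state[1], data, *hyper)
    draws = []
    for it in range(n_iter):
        for j in (0, 1):                  # j = 0 updates gamma, j = 1 updates beta
            proposal = list(state)
            proposal[j] = state[j] * math.exp(scale * rng.gauss(0.0, 1.0))
            cand = log_post(proposal[0], proposal[1], data, *hyper)
            # log(proposal / state) corrects for the asymmetric log-scale random walk
            log_ratio = cand - current + math.log(proposal[j] / state[j])
            if math.log(1.0 - rng.random()) < log_ratio:
                state, current = proposal, cand
        if it >= burn:
            draws.append(tuple(state))
    return draws

counts = [2, 0, 5, 3, 8, 1, 4, 6, 3, 7, 2, 5]       # made-up counts, for illustration only
draws = mh_within_gibbs(counts, hyper=(2.0, 1.0, 2.0, 1.0))
gamma_tilde = sum(g for g, _ in draws) / len(draws)  # posterior mean of gamma
beta_tilde = sum(b for _, b in draws) / len(draws)   # posterior mean of beta
print(gamma_tilde, beta_tilde)
```

In practice the proposal scale would be tuned and convergence checked with trace plots before the draws are used.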

Results and discussion

In this section, the results from the Monte Carlo simulation and the real-life applications are discussed in detail. All numerical calculations were performed in R.

Simulation study

The following simulation study is carried out to examine the behaviour of the Bayesian and maximum likelihood estimates of the DMOLBE distribution parameters. The procedure is as follows.

  1. Generate \(N=10{,}000\) samples of size \(n=50, 100, 150, 200,\) and 300 from the DMOLBE distribution.

  2. Estimate the parameters \(\gamma\) and \(\beta\) from each generated sample.

  3. Compute the absolute biases (AB) and mean square errors (MSE) using the following equations.

For MLE, with \({\widehat{\gamma }}_{i}\) and \({\widehat{\beta }}_{i}\) denoting the estimates from the ith sample:

$$AB\left(\gamma \right)=\frac{1}{N}\sum_{i=1}^{N}\left|{\widehat{\gamma }}_{i}-\gamma \right|,\qquad AB\left(\beta \right)=\frac{1}{N}\sum_{i=1}^{N}\left|{\widehat{\beta }}_{i}-\beta \right|,$$
$$MSE\left(\gamma \right)=\frac{1}{N}\sum_{i=1}^{N}{\left({\widehat{\gamma }}_{i}-\gamma \right)}^{2},\qquad MSE\left(\beta \right)=\frac{1}{N}\sum_{i=1}^{N}{\left({\widehat{\beta }}_{i}-\beta \right)}^{2}.$$

For Bayesian estimation, with \({\widetilde{\gamma }}_{i}\) and \({\widetilde{\beta }}_{i}\) denoting the estimates from the ith sample:

$$AB\left(\gamma \right)=\frac{1}{N}\sum_{i=1}^{N}\left|{\widetilde{\gamma }}_{i}-\gamma \right|,\qquad AB\left(\beta \right)=\frac{1}{N}\sum_{i=1}^{N}\left|{\widetilde{\beta }}_{i}-\beta \right|,$$
$$MSE\left(\gamma \right)=\frac{1}{N}\sum_{i=1}^{N}{\left({\widetilde{\gamma }}_{i}-\gamma \right)}^{2},\qquad MSE\left(\beta \right)=\frac{1}{N}\sum_{i=1}^{N}{\left({\widetilde{\beta }}_{i}-\beta \right)}^{2}.$$
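The mechanics of such a simulation can be sketched as follows. Samples are drawn by inverse-transform sampling from the cdf in Eq. (5), and AB and MSE are computed over replications. To keep the example short, the estimator plugged in here is the sample mean as an estimator of the distribution mean; in the study itself the ML or Bayes estimates of \(\gamma\) and \(\beta\) would take its place. The parameter values, seed, and the small replication count are assumptions.

```python
import math
import random

def dmolbe_cdf(x, gamma, beta):
    """cdf of Eq. (5)."""
    g1 = (1.0 + (x + 1) / beta) * math.exp(-(x + 1) / beta)
    return (1.0 - g1) / (1.0 - (1.0 - gamma) * g1)

def rdmolbe(n, gamma, beta, rng):
    """Inverse-transform sampling: the smallest x with F(x) >= u."""
    sample = []
    for _ in range(n):
        u, x = rng.random(), 0
        while dmolbe_cdf(x, gamma, beta) < u:
            x += 1
        sample.append(x)
    return sample

def abs_bias(estimates, truth):
    """Average absolute deviation of the estimates from the true value."""
    return sum(abs(e - truth) for e in estimates) / len(estimates)

def mse(estimates, truth):
    """Average squared deviation of the estimates from the true value."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)

rng = random.Random(2020)
gamma, beta, n, N = 0.5, 2.0, 50, 200        # hypothetical settings, N kept small for speed
true_mean = sum(1.0 - dmolbe_cdf(x, gamma, beta) for x in range(2000))   # sum of P(X > x)
estimates = [sum(s) / n for s in (rdmolbe(n, gamma, beta, rng) for _ in range(N))]
print("AB:", abs_bias(estimates, true_mean), "MSE:", mse(estimates, true_mean))
```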

The simulation results are reported in Tables 2 and 3.

Table 2 Simulation results of DMOLBE distribution for different parameter values by MLE estimation.
Table 3 Simulation results of DMOLBE distribution for different parameter values by Bayesian estimation.

The following conclusions are drawn from the simulation results:

  1. The estimated absolute bias decreases towards zero as \(n\) increases, for all combinations of parameters.

  2. The estimated MSE decreases as the sample size increases.

  3. Bayesian estimation performs better than MLE in terms of AB and MSE.

Applications

This section illustrates the advantage of the newly proposed DMOLBE distribution over some commonly used distributions. Its performance is compared with that of the following competing distributions: the discrete Burr XII (DBXII), discrete Bilal (DB), discrete Burr–Hatke (DBH), discrete exponentiated Rayleigh (DER), discrete length biased exponential (DLBE), discrete Pareto (DPr), and Poisson (DP) distributions. The probability mass functions of these distributions are:

Discrete Burr XII distribution

$$P\left(X=x\right)={\beta }^{\mathrm{ln}\left(1+{x}^{\gamma }\right)} - {\beta }^{\mathrm{ln}\left(1+{\left(1+x\right)}^{\gamma }\right)}$$

Discrete exponentiated Rayleigh distribution

$$P\left(X=x\right)={\beta }^{\mathrm{ln}\left(1+{x}^{\gamma }\right)} - {\beta }^{\mathrm{ln}\left(1+{\left(1+x\right)}^{\gamma }\right)}$$

Discrete Pareto distribution

$$P\left(X=x\right)=\mathrm{exp}\left(-\beta \mathrm{log}\left(1+x\right)\right)-\mathrm{exp}\left(-\beta \mathrm{log}\left(2+x\right)\right)$$

Discrete length biased exponential distribution

$$P\left(X=x\right)=\left(1+\frac{x}{\beta }\right){e}^{-\left(\frac{x}{\beta }\right)}- \left(1+\frac{x+1}{\beta }\right){e}^{-\left(\frac{x+1}{\beta }\right)}$$

Discrete Bilal distribution

$$P\left(X=x\right)=2\left({\beta }^{3}-1\right){\beta }^{3x}-3\left({\beta }^{2}-1\right){\beta }^{2x}$$

Discrete Burr–Hatke distribution

$$P\left(X=x\right)=\left(\frac{1}{x+1}-\frac{\beta }{x+2}\right){\beta }^{x}$$

Poisson distribution

$$P\left(X=x\right)=\frac{{e}^{-\beta }{\beta }^{x}}{x!}$$

The model parameters of considered models are estimated using the maximum likelihood method. The performance of all fitted distributions is compared utilizing some criteria, Akaike information criterion (AIC), Bayesian information criterion (BIC), and Kolmogorov–Smirnov (K–S) test with its corresponding p values. All the computations are carried out in R software.
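For reference, the comparison criteria can be computed as in the sketch below, using \(AIC=2k-2\,\mathrm{log}L\) and \(BIC=k\,\mathrm{log}\,n-2\,\mathrm{log}L\), where \(k\) is the number of parameters. The example fits a Poisson model, whose MLE is simply the sample mean, to made-up counts; the K–S statistic is computed as the largest gap between the empirical and fitted cdfs over the integers, and its p value is not computed here. All names and data are illustrative assumptions.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik

def poisson_logpmf(x, lam):
    return -lam + x * math.log(lam) - math.lgamma(x + 1)

def poisson_cdf(x, lam):
    return sum(math.exp(poisson_logpmf(k, lam)) for k in range(x + 1))

def ks_statistic(data, cdf):
    """Largest gap between the empirical cdf and a fitted discrete cdf."""
    n = len(data)
    return max(abs(sum(v <= x for v in data) / n - cdf(x))
               for x in range(max(data) + 1))

counts = [0, 2, 1, 4, 3, 2, 7, 1, 0, 5, 2, 3]       # made-up counts, for illustration only
lam_hat = sum(counts) / len(counts)                 # Poisson MLE is the sample mean
ll = sum(poisson_logpmf(x, lam_hat) for x in counts)
print("AIC:", aic(ll, 1), "BIC:", bic(ll, 1, len(counts)),
      "K-S:", ks_statistic(counts, lambda x: poisson_cdf(x, lam_hat)))
```

Smaller AIC and BIC values indicate a better trade-off between fit and model complexity; any fitted model's maximised log-likelihood can be passed to `aic` and `bic` in the same way.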

Data Set I (death due to coronavirus in China)

The first data set is the number of deaths due to coronavirus in China from 23 January to 28 March 2020. The data set is reported at https://www.worldometers.info/coronavirus/country/china/. The data are: 8, 16, 15, 24, 26, 26, 38, 43, 46, 45, 57, 64, 65, 73, 73, 86, 89, 97, 108, 97, 146, 121, 143, 142, 105, 98, 136, 114, 118, 109, 97, 150, 71, 52, 29, 44, 47, 35, 42, 31, 38, 31, 30, 28, 27, 22, 17, 22, 11, 7, 13, 10, 14, 13, 11, 8, 3, 7, 6, 9, 7, 4, 6, 5, 3 and 5. The MLEs with their corresponding standard errors and goodness-of-fit measures are presented in Table 4.

Table 4 Parameter estimation and goodness-of-fit measures for first data.

Table 4 presents the parameter estimates of the different models for the first data set. The DMOLBE distribution fits the data better than the competing models, since its AIC and BIC values are the smallest. Table 5 compares the ML and Bayesian estimates in terms of their standard errors (SE) for the deaths due to coronavirus in China; based on these results, we conclude that Bayesian estimation is the better estimation method for this data set. Figure 6 shows the estimated cdfs of the fitted distributions and Fig. 7 presents the P–P plots for all competing models; both figures support the results in Table 4. Figure 8 shows that the ML estimates of the DMOLBE parameters for this data set exist and attain the maximum of the log-likelihood. Figure 9 shows the MCMC plots for the DMOLBE parameter estimates, which confirm that the chains converge and that the posterior distributions are approximately normal, in line with the proposal distribution.

Table 5 MLE and Bayesian estimation of DMOLBED parameters for the death due to coronavirus in China.
Figure 6. The estimated CDFs for the death due to coronavirus in China.

Figure 7. The P–P plots for the death due to coronavirus in China.

Figure 8. Existence for the log-likelihood for the death due to coronavirus in China.

Figure 9. MCMC plots of convergence for parameter estimates of DMOLBED for the death due to coronavirus in China.

Data Set II (daily death due to coronavirus in Pakistan)

The second data set is the number of daily deaths due to coronavirus in Pakistan from 18 March to 30 June 2020. The data set is reported at https://www.worldometers.info/coronavirus/country/Pakistan. The data are: 1, 6, 6, 4, 4, 4, 1, 20, 5, 2, 3, 15, 17, 7, 8, 25, 8, 25, 11, 25, 16, 16, 12, 11, 20, 31, 42, 32, 23, 17, 19, 38, 50, 21, 14, 37, 23, 47, 31, 24, 9, 64, 39, 30, 36, 46, 32, 50, 34, 32, 34, 30, 28, 35, 57, 78, 88, 60, 78, 67, 82, 68, 97, 67, 65, 105, 83, 101, 107, 88, 178, 110, 136, 118, 136, 153, 119, 89, 105, 60, 148, 59, 73, 83, 49, 137 and 91.

Table 6 presents the parameter estimates of the different models for the second data set. The DMOLBE distribution again fits the data better than the competing models, since its AIC and BIC values are the smallest. Table 7 compares the ML and Bayesian estimates in terms of their SE; based on these results, we conclude that Bayesian estimation is the better estimation method. Figure 10 shows the estimated cdfs of the fitted distributions and Fig. 11 presents the P–P plots for all competing models; both figures support the results in Table 6. Figure 12 shows that the ML estimates of the DMOLBE parameters for the Pakistan data exist and attain the maximum of the log-likelihood. Figure 13 shows the MCMC plots for the DMOLBE parameter estimates for the Pakistan data, which confirm that the chains converge and that the posterior distributions are approximately normal, in line with the proposal distribution.

Table 6 Parameter estimation and goodness-of-fit measures for second data.
Table 7 MLE and Bayesian estimation of DMOLBED parameters for Coronavirus in Pakistan data.
Figure 10. The estimated CDFs for the death due to coronavirus in Pakistan.

Figure 11. The P–P plots for the death due to coronavirus in Pakistan.

Figure 12. Existence for the log-likelihood of DMOLBED parameters for Coronavirus in Pakistan data.

Figure 13. MCMC plots of convergence for parameter estimates of DMOLBED parameters for Coronavirus in Pakistan data.

Conclusion

This study introduces the DMOLBE distribution, a novel two-parameter discrete probability distribution that may be used in place of well-known distributions. Several of its mathematical properties are derived. The distribution's parameters are estimated by the maximum likelihood and Bayesian methods, with the Bayesian estimates computed by MCMC using the MH algorithm. A simulation study is conducted to evaluate the performance of the estimators of the unknown parameters in terms of AB and MSE, and the ML and Bayesian estimators are compared on this basis. We conclude that the Bayesian approach is superior for estimating the parameters of the DMOLBE distribution. The flexibility of the model is illustrated using two real data sets, on which the proposed model performs better than the existing models it is compared with. Further, estimation for the proposed model could be carried out using transforms. As an extension of this study, we plan to carry out a regression analysis to predict future mortality rates in the countries under consideration.

Future work

Future work in statistical analysis for COVID-19 data holds great potential in advancing our understanding of the pandemic and informing evidence-based decision-making. One key area of focus is the integration of more comprehensive and diverse datasets, including demographic, socioeconomic, and healthcare variables, to explore the multifaceted aspects of COVID-19's impact on different populations. Advanced machine learning techniques can be applied to identify complex relationships and risk factors associated with the spread, severity, and outcomes of the virus. Furthermore, predictive modeling can be enhanced by incorporating real-time data streams and dynamic factors to provide more accurate and timely forecasts, aiding in proactive planning and resource allocation. Longitudinal studies analyzing the long-term effects of the pandemic and assessing the efficacy of interventions over time will provide valuable insights into the sustainability of public health measures. Additionally, ethical considerations and privacy-preserving methodologies should be integrated into future analyses to ensure data security and protect individuals' rights. Overall, future work in statistical analysis for COVID-19 data will continue to play a pivotal role in guiding public health policies, bolstering preparedness for future outbreaks, and ultimately safeguarding global health.