On parameter estimation of the standard omega distribution

The standard omega distribution is defined on the unit interval, which makes it a probabilistic model for observations in the form of rates and percentages. It is, in fact, the unit form of the exponentiated half logistic distribution. In this work, we first give a detailed shape analysis, from which we observe that it is another flexible beta-like distribution: it can be J-shaped, reverse J-shaped, U-shaped or unimodal, and it can show left or right skewness, depending on the values of its shape parameters. Unlike the ordinary beta distribution, it has a simple closed-form distribution function. We then discuss the existence and uniqueness of the maximum likelihood estimators and the Bayesian estimation of the parameters. The existence and uniqueness of the maximum likelihood estimators is a great advantage for practitioners of this model, since it rules out spurious solutions to the likelihood equations. These estimators are compared with the existing ones for the general omega distribution by means of a simulation study. Two real data applications demonstrate its usefulness among other beta-like distributions such as the Kumaraswamy, log-Lindley and Topp–Leone distributions.


Introduction
In many practical situations, we encounter proportion data, which arise from a random variable whose values are expressed as percentages or fractions of a whole. Examples include the acceptance rate, i.e. the percentage of applicants admitted to a college or university in a country; the percentage of votes in an election; the width-to-length ratio of a rectangular manufactured item; the cost-to-charge ratio, i.e. the ratio of the actual cost of care to what the hospital bills for care; the yield of a chemical reaction at a specific temperature; and the percentage of the useful volume of a hydroelectric plant's water reservoir.
Such random variables can take any value on the unit interval, and a suitable probability distribution such as the beta is needed for modelling the observed data.
There are many distributions for modelling data which are restricted to a bounded range. Recently, Dombi et al. [7] introduced a new probability distribution referred to as the omega distribution and discussed its application in reliability theory. It has the following probability density function (PDF) and cumulative distribution function (CDF), respectively:
$$f(x; \alpha, \beta, d) = \alpha \beta x^{\beta - 1} \frac{d^{2\beta}}{d^{2\beta} - x^{2\beta}} \, \omega_d^{(-\alpha, \beta)}(x), \quad 0 < x < d,$$
$$F(x; \alpha, \beta, d) = 1 - \omega_d^{(-\alpha, \beta)}(x), \quad 0 < x < d,$$
where α > 0, β > 0, d > 0, and
$$\omega_d^{(\alpha, \beta)}(x) = \left( \frac{d^{\beta} + x^{\beta}}{d^{\beta} - x^{\beta}} \right)^{\alpha d^{\beta} / 2}$$
is the omega function. The main properties of the omega function can be found in Dombi and Jónás [5]. These formulas are simple in the sense that they contain neither a special function nor an exponential term. These features give the distribution an important advantage in computational procedures such as parameter estimation. The parameters α and β have similar meanings to those of the Weibull distribution, and the parameter d determines the domain of the distribution. The hazard rate function of the distribution can exhibit monotonic, constant and bathtub shapes. Dombi et al. [7] showed that the asymptotic omega hazard rate function is just the Weibull hazard rate function. That is, the asymptotic omega distribution is just the Weibull distribution. As also noted in Dombi and Jónás [6], when d → ∞, the log-likelihood function of the omega distribution coincides with that of the Weibull distribution.
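The Weibull limit described above can be checked numerically. The following is a minimal sketch, assuming the omega CDF has the form F(x) = 1 − ((d^β + x^β)/(d^β − x^β))^(−αd^β/2) on (0, d) and that the limiting Weibull CDF is 1 − exp(−αx^β); the function names are illustrative and do not come from the cited papers.

```python
import math

def omega_cdf(x, alpha, beta, d):
    """CDF of the omega distribution on (0, d), under the form assumed above."""
    r = (x / d) ** beta                      # r = x^beta / d^beta, in (0, 1)
    log_omega = 0.5 * alpha * d ** beta * (math.log1p(r) - math.log1p(-r))
    return 1.0 - math.exp(-log_omega)

def weibull_cdf(x, alpha, beta):
    """Weibull CDF in the parametrization 1 - exp(-alpha * x^beta)."""
    return 1.0 - math.exp(-alpha * x ** beta)

# Absolute gap between the two CDFs at x = 1 for growing domain bounds d.
gaps = [abs(omega_cdf(1.0, 2.0, 1.5, d) - weibull_cdf(1.0, 2.0, 1.5))
        for d in (2.0, 10.0, 100.0)]
```

The list `gaps` should shrink as d grows, which is one way to see why the two log-likelihoods agree in the limit.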
Later, Okorie and Nadarajah [19] derived closed-form expressions for the mean, mode, median, variance, rth raw moment, quantile function, skewness and kurtosis measures of the distribution.
Vasileva [25] investigated the omega probability distribution in the Hausdorff sense and derived a nonlinear equation satisfied by the Hausdorff distance between the CDF of the distribution and the Heaviside function.
Moments of order statistics from the distribution were studied in Alsubie et al. [1]. They also discussed seven methods to estimate the parameters of the distribution, including maximum likelihood (ML). Recently, Özbilen and Genç [20] defined a bivariate extension of the standard form of the distribution via the Marshall-Olkin approach.
In this work, we also concentrate on the standard omega distribution, obtained by taking d = 1. We suggest this form for modelling rates and percentage data appearing in many fields, e.g. economics, biology, etc. This paper places it among the alternatives to the ordinary beta distribution. By doing so, we open the way to research on alternative models such as beta regression models, beta-binomial models, etc.
We estimate the parameters by the ML method. Although several estimation methods for the parameters of the distribution, including ML, are proposed by Alsubie et al. [1], the existence and uniqueness of the ML estimators (MLEs) of the parameters have not been studied in the literature. This property of the MLEs is important in statistical inference. An ML algorithm may fail for a distribution due to the presence of several local maxima, which may lead to a wrong solution of the ML estimating equations. Another problem is obtaining a solution on the boundary of the parameter space. Many works in the literature aim to show the existence and uniqueness of the MLEs of the parameters of different models. Some recent studies include Ghitany et al. [8], Gui [11], Jiang and Gui [15], and Sudsawat and Pal [23].
In the general case, we note that the survival function of the distribution is of the form S(x; α, β) = [K(x; β)]^α, where
$$K(x; \beta) = \left( \frac{d^{\beta} - x^{\beta}}{d^{\beta} + x^{\beta}} \right)^{d^{\beta} / 2}$$
is the parent survival function. That is, the omega distribution belongs to the class of proportional hazard rate models (see, e.g. [21]), which includes several well-known lifetime models such as the exponential, Pareto, Lomax and Burr Type XII distributions.
The paper is organized as follows. In Section 2, the standard omega distribution and its shape analysis are given. In Section 3, the ML estimation is discussed and the observed Fisher information matrix is given. In Section 4, the Bayesian estimation is discussed. In Section 5, a simulation study comparing the proposed methods with the existing ones is performed. In Section 6, two real data fitting applications are given. The paper ends with conclusions. In order to shorten the paper, the proofs of the lemmas and theorems are given in the supplementary file, available online.

Standard omega distribution
When d = 1, the standard omega distribution is obtained. It has the following PDF:
$$f(x; \alpha, \beta) = \frac{\alpha \beta x^{\beta - 1}}{1 - x^{2\beta}} \left( \frac{1 + x^{\beta}}{1 - x^{\beta}} \right)^{-\alpha / 2}, \quad 0 < x < 1, \tag{1}$$
and the CDF
$$F(x; \alpha, \beta) = 1 - \left( \frac{1 + x^{\beta}}{1 - x^{\beta}} \right)^{-\alpha / 2}, \quad 0 < x < 1.$$
Like the ordinary beta, it is a flexible distribution with a bounded domain. It can take different shapes such as J-shaped, reversed J-shaped, U-shaped, unimodal, skewed left and skewed right (Figure 1). However, it has the advantage that its CDF has a simple mathematical form that does not contain any special function. Note that the standard omega distribution is, in fact, a unit exponentiated half logistic distribution, that is, the distribution of Z = e^{−X} when X has the exponentiated half logistic distribution. The latter is another useful probability distribution and has many applications in reliability theory (see, e.g. [14,22]). We also note that some well-known distributions in the literature are obtained by the same transformation. For example, the Kumaraswamy distribution [16] is obtained from the generalized exponential distribution [12], and the log-Lindley distribution [9] is obtained from the generalized Lindley distribution [26].
Henceforward, a random variable following the PDF in (1) will be denoted by SO(α, β).
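Because the CDF has a closed form, it can be inverted by hand, which gives a direct way to simulate from SO(α, β). The sketch below assumes the density f(x) = αβx^(β−1)/(1 − x^(2β)) · ((1 + x^β)/(1 − x^β))^(−α/2) and the CDF F(x) = 1 − ((1 + x^β)/(1 − x^β))^(−α/2) on (0, 1); the function names are hypothetical.

```python
import math
import random

def so_pdf(x, alpha, beta):
    """Density of SO(alpha, beta) at x in (0, 1), under the assumed form."""
    u = x ** beta
    return (alpha * beta * x ** (beta - 1) / (1 - u * u)
            * ((1 - u) / (1 + u)) ** (alpha / 2))

def so_cdf(x, alpha, beta):
    """Distribution function of SO(alpha, beta) at x in (0, 1)."""
    u = x ** beta
    return 1 - ((1 - u) / (1 + u)) ** (alpha / 2)

def so_quantile(p, alpha, beta):
    """Solve F(x) = p for x, with p in (0, 1)."""
    w = (1 - p) ** (2 / alpha)          # w = (1 - x^beta) / (1 + x^beta)
    return ((1 - w) / (1 + w)) ** (1 / beta)

def so_sample(n, alpha, beta, rng):
    """Inverse-CDF sampling: push uniform draws through the quantile."""
    return [so_quantile(rng.random(), alpha, beta) for _ in range(n)]

sample = so_sample(1000, 2.5, 1.5, random.Random(1))
```

Inverse-CDF sampling is used here only because it needs nothing beyond the closed-form CDF; it is one possible generator, not necessarily the one used for the simulation study.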

Shape of the PDF
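In this section, we revisit the shape of the distribution for its standard form. We immediately have the following limit:
$$\lim_{x \to 0^{+}} f(x; \alpha, \beta) = \begin{cases} \infty, & \beta < 1, \\ \alpha, & \beta = 1, \\ 0, & \beta > 1, \end{cases}$$
and by L'Hôpital's rule, we have the following limit:
$$\lim_{x \to 1^{-}} f(x; \alpha, \beta) = \begin{cases} \infty, & \alpha < 2, \\ \beta / 2, & \alpha = 2, \\ 0, & \alpha > 2. \end{cases}$$
As is also noted in [19] for the omega distribution, if (αβ)^2 − 4(β^2 − 1) > 0, then the roots of the equation d/dx ln f(x; α, β) = 0 are given by
$$x_1 = \left( \frac{\alpha \beta - \sqrt{(\alpha \beta)^2 - 4(\beta^2 - 1)}}{2(\beta + 1)} \right)^{1/\beta} \quad \text{and} \quad x_2 = \left( \frac{\alpha \beta + \sqrt{(\alpha \beta)^2 - 4(\beta^2 - 1)}}{2(\beta + 1)} \right)^{1/\beta}.$$
From the analysis above, we can summarize the shape of the PDF as follows:
• β ≤ 1 and α < 2 correspond to a U-shaped distribution with minimum at x_2.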
The shape of the PDF can be seen in Figure 1 for several choices of the parameters. We observe that the PDF can be as flexible as that of the ordinary beta distribution. Thus, one may use the standard omega distribution as an alternative to the beta distribution in data modelling.
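The shape analysis can be reproduced numerically. Setting d/dx ln f(x; α, β) = 0 and substituting y = x^β gives the quadratic (β + 1)y² − αβy + (β − 1) = 0, whose discriminant is (αβ)² − 4(β² − 1). The sketch below, with hypothetical function names and the SO density assumed as f(x) = αβx^(β−1)/(1 − x^(2β)) · ((1 + x^β)/(1 − x^β))^(−α/2), returns the critical points that fall inside (0, 1).

```python
import math

def so_dlogpdf(x, alpha, beta):
    """Derivative of ln f(x; alpha, beta) with respect to x."""
    u = x ** beta
    return (beta - 1) / x + beta * x ** (beta - 1) * (2 * u - alpha) / (1 - u * u)

def so_critical_points(alpha, beta):
    """Interior critical points of the density, in increasing order."""
    disc = (alpha * beta) ** 2 - 4 * (beta ** 2 - 1)
    if disc < 0:
        return []                      # density is monotone on (0, 1)
    points = []
    for sign in (-1, 1):
        y = (alpha * beta + sign * math.sqrt(disc)) / (2 * (beta + 1))
        if 0 < y < 1:                  # keep only roots with x = y^(1/beta) in (0, 1)
            points.append(y ** (1 / beta))
    return points

u_shape = so_critical_points(1.5, 0.75)    # beta < 1, alpha < 2: one minimum
unimodal = so_critical_points(2.5, 2.5)    # beta > 1, alpha > 2: one mode
j_shape = so_critical_points(0.75, 2.5)    # negative discriminant: no critical point
```

The three parameter pairs are taken from the grid used later in the simulation study, so the outputs can be compared with the density shapes listed there.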
The SO distribution is closed under power transformation as it is shown in the following theorem.

Maximum likelihood estimation
In this section, we discuss the ML estimation of the model parameters. We first derive the MLEs of the parameters and then show their existence and uniqueness.
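Let x_1, x_2, ..., x_n be a data set modelled by the SO(α, β) distribution. Then, we have 0 < x_i < 1, i = 1, 2, ..., n, and the log-likelihood function is given by
$$l(\alpha, \beta) = n \ln \alpha + n \ln \beta + (\beta - 1) \sum_{i=1}^{n} \ln x_i - \sum_{i=1}^{n} \ln\left(1 - x_i^{2\beta}\right) - \frac{\alpha}{2} \sum_{i=1}^{n} \ln \frac{1 + x_i^{\beta}}{1 - x_i^{\beta}}.$$
The likelihood equations are given by
$$\frac{\partial l}{\partial \alpha} = \frac{n}{\alpha} - \frac{1}{2} \sum_{i=1}^{n} \ln \frac{1 + x_i^{\beta}}{1 - x_i^{\beta}} = 0 \tag{2}$$
and
$$\frac{\partial l}{\partial \beta} = \frac{n}{\beta} + \sum_{i=1}^{n} \ln x_i + 2 \sum_{i=1}^{n} \frac{x_i^{2\beta} \ln x_i}{1 - x_i^{2\beta}} - \alpha \sum_{i=1}^{n} \frac{x_i^{\beta} \ln x_i}{1 - x_i^{2\beta}} = 0. \tag{3}$$
The MLEs of α and β are obtained by solving Equations (2) and (3) simultaneously. From Equation (2), the MLE of α is given explicitly as a function of β:
$$\hat{\alpha}(\beta) = \frac{2n}{\sum_{i=1}^{n} \ln \dfrac{1 + x_i^{\beta}}{1 - x_i^{\beta}}}.$$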
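Since α can be eliminated explicitly, the two-parameter problem reduces to finding a root of a single function of β. The following is a minimal sketch under stated assumptions: the SO density is taken as f(x) = αβx^(β−1)/(1 − x^(2β)) · ((1 + x^β)/(1 − x^β))^(−α/2), the profile score is assumed to be the β-score with α replaced by its explicit maximizer (this may differ from the paper's G(β) by normalization), and plain bisection on a fixed bracket is our choice of solver. All function names are hypothetical.

```python
import math
import random

def so_loglik(alpha, beta, xs):
    """Log-likelihood of SO(alpha, beta) under the assumed density."""
    total = len(xs) * (math.log(alpha) + math.log(beta))
    for x in xs:
        u = x ** beta
        total += (beta - 1) * math.log(x) - math.log(1 - u * u)
        total -= 0.5 * alpha * math.log((1 + u) / (1 - u))
    return total

def alpha_hat(beta, xs):
    """Explicit maximizer of the log-likelihood in alpha for fixed beta."""
    t = sum(math.log((1 + x ** beta) / (1 - x ** beta)) for x in xs)
    return 2 * len(xs) / t

def profile_score(beta, xs):
    """Score for beta with alpha replaced by alpha_hat(beta)."""
    a = alpha_hat(beta, xs)
    g = len(xs) / beta
    for x in xs:
        u, lx = x ** beta, math.log(x)
        g += lx + (2 * u * u - a * u) * lx / (1 - u * u)
    return g

def so_mle(xs, lo=1e-3, hi=50.0, tol=1e-10):
    """Bisection on the profile score; returns (alpha_hat, beta_hat)."""
    if profile_score(lo, xs) <= 0 or profile_score(hi, xs) >= 0:
        raise ValueError("no sign change of the profile score on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if profile_score(mid, xs) > 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    return alpha_hat(beta, xs), beta

def so_sample(n, alpha, beta, rng):
    """Inverse-CDF sampling from SO(alpha, beta), used here for a demo."""
    out = []
    for _ in range(n):
        w = (1 - rng.random()) ** (2 / alpha)
        out.append(((1 - w) / (1 + w)) ** (1 / beta))
    return out

xs = so_sample(500, 2.5, 1.5, random.Random(2024))
a_hat, b_hat = so_mle(xs)
```

Bisection is slow but cannot jump to a different root, which makes it a convenient way to inspect the sign change of the profile score on simulated data.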

Theorem 3.6:
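We have that
$$\lim_{\beta \to \infty} G(\beta) = \sum_{i=1}^{n} \ln x_i - n \ln x_{(n)},$$
where x_{(n)} denotes the largest observation. Note that the limit $\sum_{i=1}^{n} \ln x_i - n \ln x_{(n)}$ is negative. By Theorem 3.3 and Theorem 3.6, we can say that the function G(β) has at least one root in (0, ∞). This means that $\hat{\beta}$, the MLE of β, exists in the parameter space. Now, we will show that $\hat{\beta}$ is unique. Let Then, after some arrangements, we have and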

Lemma 3.7: The sum S_1 is positive.
Lemma 3.9: The sum S_2 is positive.

Observed Fisher information matrix
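Let θ = (α, β). We give the second-order partial derivatives of the log-likelihood function l to form the observed Fisher information matrix, whose elements $l_{\theta_i \theta_j} = \partial^2 l / \partial \theta_i \partial \theta_j$ are given by
$$l_{\alpha\alpha} = -\frac{n}{\alpha^2}, \qquad l_{\alpha\beta} = -\sum_{i=1}^{n} \frac{x_i^{\beta} \ln x_i}{1 - x_i^{2\beta}},$$
$$l_{\beta\beta} = -\frac{n}{\beta^2} + \sum_{i=1}^{n} \frac{\left[ 4 x_i^{2\beta} - \alpha x_i^{\beta} \left(1 + x_i^{2\beta}\right) \right] (\ln x_i)^2}{\left(1 - x_i^{2\beta}\right)^2}.$$
From standard large-sample theory, $\hat{\theta} = (\hat{\alpha}, \hat{\beta})$ is approximately normally distributed with mean θ and covariance matrix $I^{-1}(\hat{\theta})$, where $I(\hat{\theta})$ is the observed Fisher information matrix, i.e. the negative of the matrix of second-order derivatives evaluated at $\hat{\theta}$. Approximate confidence intervals for the parameters can be found as $\hat{\alpha} \pm z_{\gamma/2} [\widehat{\mathrm{var}}(\hat{\alpha})]^{1/2}$ and $\hat{\beta} \pm z_{\gamma/2} [\widehat{\mathrm{var}}(\hat{\beta})]^{1/2}$, where $z_{\gamma/2}$ is the quantile of order 1 − γ/2 of the standard normal distribution and $\widehat{\mathrm{var}}$ is the respective diagonal element of $I^{-1}(\hat{\theta})$.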
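The interval construction above can be sketched in a few lines. The code below assumes the SO log-likelihood l(α, β) = n ln α + n ln β + (β − 1)Σ ln x_i − Σ ln(1 − x_i^(2β)) − (α/2)Σ ln((1 + x_i^β)/(1 − x_i^β)) and second derivatives obtained from it by direct differentiation; a finite-difference Hessian is included only as a cross-check. The function names and the small data vector are illustrative, not taken from the paper.

```python
import math
from statistics import NormalDist

def so_loglik(alpha, beta, xs):
    """Log-likelihood of SO(alpha, beta) under the assumed density."""
    total = len(xs) * (math.log(alpha) + math.log(beta))
    for x in xs:
        u = x ** beta
        total += (beta - 1) * math.log(x) - math.log(1 - u * u)
        total -= 0.5 * alpha * math.log((1 + u) / (1 - u))
    return total

def observed_information(alpha, beta, xs):
    """Negative Hessian of the log-likelihood at (alpha, beta)."""
    n = len(xs)
    l_aa = -n / alpha ** 2
    l_ab = 0.0
    l_bb = -n / beta ** 2
    for x in xs:
        u, lx = x ** beta, math.log(x)
        den = 1 - u * u
        l_ab -= u * lx / den
        l_bb += (4 * u * u - alpha * u * (1 + u * u)) * lx * lx / den ** 2
    return [[-l_aa, -l_ab], [-l_ab, -l_bb]]

def numerical_hessian(alpha, beta, xs, h=1e-4):
    """Central-difference Hessian of so_loglik, used as a cross-check."""
    f = lambda a, b: so_loglik(a, b, xs)
    f0 = f(alpha, beta)
    d_aa = (f(alpha + h, beta) - 2 * f0 + f(alpha - h, beta)) / h ** 2
    d_bb = (f(alpha, beta + h) - 2 * f0 + f(alpha, beta - h)) / h ** 2
    d_ab = (f(alpha + h, beta + h) - f(alpha + h, beta - h)
            - f(alpha - h, beta + h) + f(alpha - h, beta - h)) / (4 * h ** 2)
    return [[d_aa, d_ab], [d_ab, d_bb]]

def wald_intervals(alpha, beta, xs, gamma=0.10):
    """Approximate 100(1 - gamma)% intervals; (alpha, beta) should be the MLE."""
    (i_aa, i_ab), (_, i_bb) = observed_information(alpha, beta, xs)
    det = i_aa * i_bb - i_ab ** 2
    var_alpha, var_beta = i_bb / det, i_aa / det   # diagonal of the 2x2 inverse
    z = NormalDist().inv_cdf(1 - gamma / 2)
    half_a, half_b = z * math.sqrt(var_alpha), z * math.sqrt(var_beta)
    return (alpha - half_a, alpha + half_a), (beta - half_b, beta + half_b)

data = [0.12, 0.35, 0.41, 0.58, 0.66, 0.73, 0.89]   # made-up values in (0, 1)
info = observed_information(2.0, 1.2, data)
check = numerical_hessian(2.0, 1.2, data)
```

Here `info` and the negative of `check` should agree up to finite-difference error at any interior point, whereas `wald_intervals` is only meaningful when evaluated at the MLE.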

Bayesian estimation
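Let x = (x_1, x_2, ..., x_n) be the vector of observations. In this section, we discuss the Bayesian estimation (see, e.g. [13]) of the parameters. Since both parameters are positive, we may assume independent gamma priors with the following PDFs:
$$\pi_1(\alpha) \propto \alpha^{a_1 - 1} e^{-b_1 \alpha}, \qquad \pi_2(\beta) \propto \beta^{a_2 - 1} e^{-b_2 \beta}, \qquad \alpha, \beta > 0.$$
Then the joint posterior distribution of (α, β) is
$$\pi(\alpha, \beta \mid x) \propto \alpha^{n + a_1 - 1} \beta^{n + a_2 - 1} e^{-b_1 \alpha - b_2 \beta} \prod_{i=1}^{n} \frac{x_i^{\beta - 1}}{1 - x_i^{2\beta}} \left( \frac{1 + x_i^{\beta}}{1 - x_i^{\beta}} \right)^{-\alpha / 2}, \tag{5}$$
ignoring the normalizing constant. The conditional posterior densities of α and β are given by
$$\pi(\alpha \mid \beta, x) \propto \alpha^{n + a_1 - 1} \exp\left\{ -\alpha \left[ b_1 + \frac{1}{2} \sum_{i=1}^{n} \ln \frac{1 + x_i^{\beta}}{1 - x_i^{\beta}} \right] \right\}$$
and
$$\pi(\beta \mid \alpha, x) \propto \beta^{n + a_2 - 1} e^{-b_2 \beta} \prod_{i=1}^{n} \frac{x_i^{\beta}}{1 - x_i^{2\beta}} \left( \frac{1 + x_i^{\beta}}{1 - x_i^{\beta}} \right)^{-\alpha / 2},$$
respectively. Under the squared error loss function, the Bayes estimators of α and β are the posterior means
$$\hat{\alpha}_B = C \int_0^{\infty} \int_0^{\infty} \alpha \, \pi(\alpha, \beta \mid x) \, d\alpha \, d\beta \quad \text{and} \quad \hat{\beta}_B = C \int_0^{\infty} \int_0^{\infty} \beta \, \pi(\alpha, \beta \mid x) \, d\alpha \, d\beta,$$
respectively. Here, C is the normalizing constant ignored in (5). Since these integrals cannot be solved analytically, we resort to numerical methods such as the Metropolis-Hastings-within-Gibbs algorithm (see, e.g. [17]). The algorithm adapted for the standard omega distribution is given in the following:
Step 1: Set $\alpha^{(0)} = \hat{\alpha}$, $\beta^{(0)} = \hat{\beta}$ and t = 1.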
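One way such a sampler could look is sketched below. It rests on several assumptions that are ours: the gamma priors are parametrized by shape a_i and rate b_i; given β, the conditional posterior of α is then Gamma(n + a_1, rate b_1 + ½Σ ln((1 + x_i^β)/(1 − x_i^β))) and can be drawn directly; β is updated by a Metropolis-Hastings step with a multiplicative (log-scale) random-walk proposal of our choosing; M is read as the total number of iterations and H as the burn-in length. The function names and data values are hypothetical.

```python
import math
import random

def log_cond_beta(beta, alpha, xs, a2, b2):
    """Log conditional posterior of beta given alpha, up to a constant."""
    val = (len(xs) + a2 - 1) * math.log(beta) - b2 * beta
    for x in xs:
        u = x ** beta
        val += beta * math.log(x) - math.log(1 - u * u)
        val -= 0.5 * alpha * math.log((1 + u) / (1 - u))
    return val

def mh_within_gibbs(xs, alpha0, beta0, M=10000, H=1000,
                    a1=1.0, b1=1.0, a2=1.0, b2=1.0, step=0.3, seed=0):
    """Return the (alpha, beta) draws kept after discarding H burn-in steps."""
    rng = random.Random(seed)
    n = len(xs)
    alpha, beta = alpha0, beta0
    draws = []
    for t in range(1, M + 1):
        # Gibbs step: alpha | beta, x is Gamma(n + a1, rate = b1 + T / 2).
        T = sum(math.log((1 + x ** beta) / (1 - x ** beta)) for x in xs)
        alpha = rng.gammavariate(n + a1, 1.0 / (b1 + 0.5 * T))
        # MH step: multiplicative random walk on beta (an assumed proposal).
        prop = beta * math.exp(step * rng.gauss(0.0, 1.0))
        log_ratio = (log_cond_beta(prop, alpha, xs, a2, b2)
                     - log_cond_beta(beta, alpha, xs, a2, b2)
                     + math.log(prop) - math.log(beta))  # proposal correction
        if rng.random() < math.exp(min(0.0, log_ratio)):
            beta = prop
        if t > H:
            draws.append((alpha, beta))
    return draws

def posterior_means(draws):
    """Bayes estimates under squared error loss: the posterior means."""
    m = len(draws)
    return sum(a for a, _ in draws) / m, sum(b for _, b in draws) / m

data = [0.12, 0.35, 0.41, 0.58, 0.66, 0.73, 0.89]   # made-up values in (0, 1)
draws = mh_within_gibbs(data, 1.0, 1.0, M=2000, H=500, seed=1)
alpha_B, beta_B = posterior_means(draws)
```

The demo starts the chain at (1, 1) for simplicity, whereas Step 1 above starts from the MLEs; the `step` value would normally be tuned to give a reasonable acceptance rate.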

Simulation study
To examine the performance of the Bayesian estimator and to compare it with the previously proposed estimators, we generate 10,000 samples from SO(α, β) for each combination of parameter values and sample sizes. Here, we choose the parameters as α = 0.75, 1.5, 2.5 and β = 0.75, 1.5, 2.5, and the sample sizes as n = 30, 100. The values of the parameters have been chosen so that we examine the estimation methods under different density shapes. Afterwards, the ML, Bayes (B), least squares (LS), weighted least squares (WLS), maximum product spacing (MPS), percentile (P), Anderson-Darling (AD), and right-tailed Anderson-Darling (RAD) estimation methods were applied to the simulated data. For the Bayesian estimation, we used the procedure given in Section 4 with gamma priors where b_1 = b_2 = 1, and chose different a_1 and a_2 values for different density shapes, as in Table 1. Here, the a_1 and a_2 values were obtained by considering the shape analysis given in Section 2.1. For details about the estimation methods that are not given in this paper, see Alsubie et al. [1]. They also report simulation results for these methods, but they only use parameter values that generate reversed J-shaped or unimodal densities. The means and mean squared errors (MSEs) of all the estimators are given in Tables 2 and 3.
According to Tables 2 and 3, the MSEs of all estimators decrease as n increases, which is in line with consistency. According to the MSE criterion, the Bayes and MPS estimators give similar results, and they give the best results among all estimators. Generally, the Bayes estimator is better at estimating β, and the MPS estimator is better at estimating α. In terms of the mean estimates, the best results were obtained by the LS and P estimators. Here, the Bayes estimator always overestimates the actual parameter values, but it produces better results than the ML estimator, especially for small n. As a note, we observe that the RAD estimator produces poor results when the parameter values produce a J-shaped distribution.

Illustrative examples
To see the performance of the proposed ML and Bayes estimators, we consider two real data sets. We fit not only the SO distribution to the data sets, but also other beta-like distributions for comparison.
We fit the SO distribution as well as the ordinary beta and some other beta-like distributions [9,16,24] to these data and calculate the ML estimates first. The estimation results are given in Table 4. For the SO distribution, the location of the MLE of β can also be seen in the plot of the G(β) function (Figure 2 (left)). According to the AIC (Akaike Information Criterion), the SO distribution gives the best fit among the candidates. We also perform a nonparametric bootstrap Kolmogorov-Smirnov (KS) goodness-of-fit test based on 10,000 replications for the proposed model. It gives the result KS = 0.1846 with p-value 0.7141. This supports the SO fit to the data.
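The two fit measures used above are straightforward to compute once a model has been fitted. The following sketch shows the AIC, 2k − 2l, for a model with k parameters and maximized log-likelihood l, and the KS distance between the empirical CDF and a fitted CDF; the SO CDF is assumed to be F(x) = 1 − ((1 + x^β)/(1 − x^β))^(−α/2). The bootstrap p-value is not reproduced here, and the function names are hypothetical.

```python
import math

def so_cdf(x, alpha, beta):
    """Distribution function of SO(alpha, beta), under the assumed form."""
    u = x ** beta
    return 1 - ((1 - u) / (1 + u)) ** (alpha / 2)

def aic(loglik, k=2):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * loglik

def ks_statistic(xs, cdf):
    """Largest gap between the empirical CDF of xs and the given CDF."""
    s = sorted(xs)
    n = len(s)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(s))

# KS distance of three made-up points from the uniform CDF on (0, 1).
d_uniform = ks_statistic([0.1, 0.5, 0.9], lambda x: x)
# KS distance of the same points from an SO(2, 1.2) CDF.
d_so = ks_statistic([0.1, 0.5, 0.9], lambda x: so_cdf(x, 2.0, 1.2))
```

Smaller AIC values and smaller KS distances both point to a better fit, which is how the candidate distributions are ranked in the tables.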
The 90% asymptotic confidence intervals for the parameters α and β are given, respectively, by (5.982015, 12.50404) and (0.996819, 1.364413). Their lengths are 6.522028 and 0.3675939. On the other hand, we run the Metropolis-Hastings-within-Gibbs algorithm by taking M = 10,000, H = 1000, a_1 = a_2 = b_1 = b_2 = 0 and obtain the Bayes estimates of the parameters as $\hat{\alpha}_B$ = 9.29683 and $\hat{\beta}_B$ = 1.17617. The 90% highest posterior density (HPD) credible intervals of the parameters α and β are given, respectively, by (6.426737, 12.64526) and (1.001534, 1.354367). Their lengths are 6.218522 and 0.352833. We observe that these intervals are narrower than those for the ML method.
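An HPD credible interval can be approximated from posterior draws as the shortest interval that contains the required share of the sorted sample. The sketch below implements this common sample-based approximation; it is our illustration and not necessarily the routine used to obtain the intervals above, and the function name is hypothetical.

```python
import math

def hpd_interval(draws, level=0.90):
    """Shortest interval containing a share `level` of the draws."""
    s = sorted(draws)
    n = len(s)
    k = max(1, math.ceil(level * n - 1e-9))   # number of draws inside the interval
    # Slide a window of k consecutive order statistics and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]

# Five made-up draws: the shortest window holding 4 of them is (0, 3).
interval = hpd_interval([3.0, 0.0, 10.0, 1.0, 2.0], level=0.8)
```

For a unimodal posterior this window converges to the HPD interval as the number of draws grows, and it is never longer than the equal-tailed interval computed from the same draws.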

Consumer price index data
In economics, there are many indicators defined on the interval (0, 1), for instance, the poverty rate, unemployment rate, labour force participation rate, gross domestic product, etc. One of the most important indicators for the economy of a country is the inflation rate. We next consider the consumer price index (CPI) data set of 43 countries reported by OECD [18] on the web page https://data.oecd.org/price/inflation-cpi.htm#indicator-chart. Inflation measured by CPI is defined as the change in the prices of a basket of goods and services that are typically purchased by specific groups of households. The data are given in Table 5 for convenience. The data set consists of monthly indices and contains the latest data available as of March 2022. The skewness and kurtosis measures of the data are 3.7575 and 16.6902, respectively. Thus, the data set is positively skewed and has a high kurtosis. A suitable model defined on (0, 1) is needed for modelling this high kurtosis and positive skewness.
In this case, we fit just three candidate models to the data set, i.e. the SO, beta and Kumaraswamy distributions. The estimation results are given in Table 6. For the SO distribution, the location of the MLE of β can also be seen in the plot of the G(β) function (Figure 2 (right)). According to the AIC value, the SO distribution gives the best fit among the candidates. We also perform a nonparametric bootstrap Kolmogorov-Smirnov goodness-of-fit test based on 10,000 replications for the proposed model. It gives the result KS = 0.2338 with p-value 0.7466. This supports the SO fit to the data.
The 90% asymptotic confidence intervals for the parameters α and β are (7.3596, 18.7459) and (0.9370, 1.3245), respectively. Their lengths are 11.3863 and 0.3875. The Bayes estimates of the parameters are $\hat{\alpha}_B$ = 12.8404 and $\hat{\beta}_B$ = 1.1149. The 90% highest posterior density (HPD) credible intervals of the parameters α and β are given, respectively, by (8.0155, 19.0789) and (0.9285, 1.3065). Their lengths are 11.0635 and 0.3780. We observe that these intervals are slightly narrower than those for the ML method.

Conclusions
We showed that the SO distribution may have many different forms and concluded that it is another flexible beta-like distribution. It has the advantage of having a clear and simple CDF. The ML estimates of the two parameters exist uniquely in the parameter space, which is another advantage of the distribution for its possible practitioners. Alternatively, the SO distribution is also amenable to MCMC sampling methods for obtaining Bayes estimates of the parameters. A simulation study that covers both small and large sample sizes was performed to compare the estimation methods proposed here with the existing ones.
We showed in the examples that percentage data sets with high skewness and kurtosis can be modelled best by the SO distribution among other beta-type ones. We also observed in many applications that when the kurtosis becomes smaller, the SO distribution performs slightly better than the Kumaraswamy distribution according to the AIC criterion. The real data demonstrations show its usefulness in modelling rates and percentages.
From the reliability point of view, the SO model may be a useful alternative. The existence and uniqueness of the ML estimates under different censoring schemes of a sample from the distribution will be studied as further research.

Figure 1. Plots of the PDF of the standard omega distribution.

Figure 2. Plot of the function G(β) and location of the MLE of β for the failure times data (left) and CPI data (right).

Table 1. Actual parameter values and the corresponding density shapes and gamma priors.

Table 2. Means of estimates of the parameters and MSEs (in parentheses) for many estimation methods when n = 30.

Table 3. Means of estimates of the parameters and MSEs (in parentheses) for many estimation methods when n = 100.

Table 4. MLEs and AIC values for the failure time data set.

Table 6. MLEs and AIC values for the CPI data set.