Fitting probability distributions to economic growth: a maximum likelihood approach

ABSTRACT The growth rate of the gross domestic product (GDP) usually exhibits heteroscedasticity, asymmetry and fat tails. In this study three important and significantly heteroscedastic GDP series are examined. A normal (N), a normal-mixture (NM), a normal-asymmetric Laplace (NAL) and a Student's t-asymmetric Laplace (TAL) mixture distribution are compared with respect to their fit to the GDP growth series after removing heteroscedasticity. The parameters of the distributions are estimated by the maximum likelihood method. Based on different accuracy measures, goodness-of-fit tests and plots, we find that for asymmetric, heteroscedastic and highly leptokurtic data the TAL distribution fits better than the alternatives. For asymmetric, heteroscedastic but less leptokurtic data the NM fit is superior. Furthermore, a simulation study has been carried out to obtain standard errors for the estimated parameters. The results of this study might be used in, e.g., density forecasting of GDP growth series or to compare different economies.


Introduction
The gross domestic product (GDP), the market value of officially recognized goods and services that a country produced within a specific time period, has historically been considered as a measure of economic growth.
Economic growth does not follow a smooth pattern over time; rather, it fluctuates in both the short and the long run. GDP growth rates exhibit fat tails (large kurtosis) and heteroscedasticity (see, e.g. [1,4-6,12,33]). Furthermore, GDP growth series are typically asymmetric, which is both expected and empirically confirmed (see, e.g. [3,17,41]). The general conclusion from the above studies is that the growth rate of GDP is heteroscedastic, asymmetric and leptokurtic.
The growth rate of GDP indicates the pace at which a country's economy is growing. Forecasting the whole distribution of future growth, known as density forecasting, therefore requires an accurate density. Fitting the correct density distribution for GDP growth is the primary objective of this paper. Density forecasting is rapidly getting more attention in the fields of financial time series and economics (see, e.g. [9,53]). Heteroscedasticity affects the estimates of parameters. In order to find the correct density distribution it is important to filter the data for heteroscedasticity. For this purpose, we use the filter proposed by Stockhammar and Öller [52]. After the filtering, the series becomes homoscedastic, but asymmetry and fat tails still remain.
The normal-mixture (NM) distribution is widely used in empirical finance and has a long history of application in various fields, including astronomy, biology, economics, finance and engineering; details can be found in [11,38,40,49,55]. The NM distribution can capture leptokurtic, asymmetric and multimodal characteristics of time series data. Gridgeman [15] proved that when the regimes have the same mean, the NM distribution is leptokurtic. The NM distribution has a long history in the modelling of asset returns (see, e.g. [2,7,26,45,46]). The finding that skewness and leptokurtosis can be introduced by setting different parameter values for the components was already used in the nineteenth century by, e.g., Pearson [44] and Newcomb [43]. Kamaruzzaman et al. [23] used two components in the NM distribution for financial time series, and found that the NM distribution captures the leptokurtosis as well as the skewness in the data. The NM distribution could therefore be used to model the growth data.
The asymmetric Laplace (AL) distributions have been applied in analyses of currency exchange rates, stock price changes, interest rates, daily financial market series, economics and marketing data, etc. (see [29-31,34]). We observed that, for the filtered (and unfiltered) growth series, the excess kurtosis implied by AL models is too large. Stockhammar and Öller [51] added Gaussian noise to the AL distribution and introduced the Normal-AL (NAL) distribution. It was partly based on a Schumpeterian theory of economic growth. The NAL distribution is able to capture a wide range of kurtosis and asymmetry.
Here Student's t distributed noise is added to the AL-distribution to account for the excess kurtosis of AL. The AL distribution is combined with Student's t distribution to form the weighted mixed Student's t-AL (TAL) distribution. The TAL distribution can be used for generating a wide range of skewness and kurtosis, which makes the model very flexible.
A mixture distribution is suitable for data that are divided into natural groups. An introduction to mixture distributions, as well as further detail on the theory, parameter estimation methods and applications, can be found in [11,14,35,38-40,49,55]. The mixture distribution parameters are estimated using the maximum likelihood (ML) method. The rest of the paper is organized as follows. Section 2 and Section 3 deal with data analysis and preparation. In Section 4 a model discussion is given along with the proposed model. Section 5 holds the estimation set-up with maximum likelihood estimates (MLE) and a distributional accuracy comparison. Section 6 concludes the paper.

The data
Quarterly and seasonally adjusted GDP series of three countries, the US (1947-2012), the UK (1955-2012) and Canada (CA), are studied in this article. The data have been taken from the websites of the Bureau of Economic Analysis (www.bea.gov), UK National Statistics (www.statistics.gov.uk) and Statistics Canada (www.statcan.gc.ca), respectively. Long series are required for accurate estimation of the N, NM, NAL and TAL parameters. The above series are the most important and longest quarterly GDP series available. The first logarithmic differences of the series and their frequency distributions are presented in Figure 1. Moreover, a Kernel density estimate (defined in the note at the end of the paper) and an N distribution with the mean and variance of the first logarithmic differences of the series are shown.
The first difference of the logarithmic (Diff log) GDP series appears to be leptokurtic. This is also confirmed in Table 1. The excess kurtosis exceeds zero. The results of ARCH-LM test for heteroscedasticity and Augmented Dickey Fuller test for stationarity are also presented in Table 1.
The skewness seems to be non-zero in the UK and CA series. High kurtosis appears in all series, as the excess kurtosis exceeds zero in all cases. The ARCH-LM test rejects the null hypothesis of homoscedasticity in all series with a p-value of .00. The consequence of heteroscedasticity is that observations are weighted unequally, and thus the parameter estimates become inefficient. The Augmented Dickey-Fuller test also rejects the null hypothesis of a unit root in the Diff log GDP series with a p-value of .00. Most time series models require stationarity, and the heteroscedasticity must be removed in order to compare the distributions of the data. In order to make a fair comparison between the frequency distributions and the various probability distributions of the three series, the filter proposed by Stockhammar and Öller [52] is used. This filter enables us to work with mean and variance stationary time series.
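The descriptive statistics discussed above can be sketched in a few lines. This is only an illustration, not the code behind Table 1: the function names and the toy GDP levels are hypothetical, and the moments use the simple 1/n convention, which may differ from the estimators used in the paper.

```python
import math

def diff_log(levels):
    """First logarithmic differences of a series of positive levels."""
    return [math.log(b) - math.log(a) for a, b in zip(levels, levels[1:])]

def skewness_and_excess_kurtosis(x):
    """Moment-based skewness and excess kurtosis (1/n convention, an assumption)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical quarterly GDP levels, for illustration only.
gdp = [100.0, 101.2, 102.1, 101.5, 103.0, 104.4, 104.1, 105.9]
growth = diff_log(gdp)
skew, excess_kurtosis = skewness_and_excess_kurtosis(growth)
```

A positive excess kurtosis from such a calculation corresponds to the fat tails reported for the Diff log series; the normal distribution gives zero.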

Data preparation
The Hodrick and Prescott [18] (HP) filter is a popular tool which decomposes a given macroeconomic time series into a non-stationary growth component and a stationary cyclical component. Let x t be a seasonally adjusted time series, and let the decomposition of x t into an unobserved trend component, g t , and an unobserved cyclical component, c t , at time t be
$$x_t = g_t + c_t.$$
The HP filter is defined as the trend component that solves the following minimization problem:
$$g_{\min} = \arg\min_{\{g_t\}} \left[ \sum_{t} (x_t - g_t)^2 + \gamma \sum_{t} \left(\Delta^2 g_t\right)^2 \right], \qquad (1)$$
where $\Delta^2 g_t = g_t - 2g_{t-1} + g_{t-2}$ and g min is the HP filter. The first sum in (1) accounts for the accuracy of the estimation, while the second sum represents the smoothness of the trend. The second sum consists of the squared second differences of the trend component g t at time t. The smoothness parameter γ, which penalizes the variability in the growth component series, is a positive number. The larger the value of γ, the smoother is the solution series, and vice versa. Hodrick and Prescott recommended a value of γ = 1600 for quarterly data.
Stockhammar and Öller [52] proposed a new filter for removing the heteroscedasticity from the data by the use of the HP filter. They used the HP filter in order to smooth moving standard deviations. The same method has been used in this study. Let z t be the filtered series given by their filter (2), where t = max[k − η, l − ν], max[k − η + 1, l − ν + 1], . . . with k and l odd numbers as the window lengths in the numerator and denominator, respectively, η = (k − 1)/2, ν = (l − 1)/2, and i = a, b indexing the two detrending operations (2a) and (2b). Note that for η = (n − 1)/2, the term $\sum_{\tau=t-\eta}^{t+\eta} y_\tau / k$ equals the sample mean of y. In operation (2b), y τ is delayed one period. When k = 1, and hence η = 0, operation (2b) is used. Subtracting the one-period-delayed term is equivalent to the second-order difference operation Δ 2 y t , where Δy t = y t − y t−1 and y t is the logarithmic series at time t. The transformations in (2) are generalized by raising z (i) t to the power of d, which is not necessarily an integer. The best choice of η depends on the properties of the series studied.
Stockhammar and Öller [51] used k = l = 15, d = 1 and γ = 1600 for the UK, US and G7 GDP series. The same filter settings are used here for all GDP series.
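The HP trend from the minimization problem (1) can be sketched as follows. The first-order condition of (1) is the linear system (I + γ K'K) g = x, where K is the second-difference matrix. The code below is a minimal illustration under that formulation, using a dense Gaussian elimination for clarity; the function names are hypothetical, and it covers only the HP step, not the full heteroscedasticity filter (2).

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def hp_trend(x, gamma=1600.0):
    """Trend g minimizing sum (x_t - g_t)^2 + gamma * sum (second difference of g_t)^2."""
    T = len(x)
    # Start from the identity matrix and add gamma * K'K term by term.
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    d = (1.0, -2.0, 1.0)  # second-difference weights on three consecutive trend values
    for t in range(T - 2):
        for a in range(3):
            for b in range(3):
                A[t + a][t + b] += gamma * d[a] * d[b]
    return solve_linear(A, list(x))
```

Two properties follow from this formulation and serve as sanity checks: a perfectly linear series is returned unchanged (its second differences are zero), and the implied cyclical component x t − g t sums to zero.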
Heteroscedasticity is removed from the data in the US and UK filtered series, whereas in the CA series the null hypothesis of homoscedasticity is not rejected at α = 0.05. The US series is more leptokurtic than the other series. All the filtered GDP series are negatively skewed, see Table 2. Figure 2 shows the Diff log US, UK and CA series after the heteroscedasticity filtering (2). Table 3 shows that the mean (μ) and the standard deviation (σ) are stable in the filtered series. Except for skewness in the US series, the estimates of skewness (τ 1 ) and excess kurtosis (k 1 ) are more stable in the filtered series. According to Stockhammar and Öller [52], the proposed filter preserves the dynamics of the time series without distorting white noise.
The unfiltered series in Figure 1 do not seem to be normal. Table 4 shows that the filter brings them closer to normality.
According to e.g. Dyer [10], Thadewald and Büning [54] and Razali and Wah [47] the power of normality tests is generally low, especially in small samples. Note that the χ 2 , AD and CVM statistics for the CA series reject the null hypothesis of normality. At least for the CA series it seems meaningful to study whether there are other distributions that fit the data better. Considering the low power of the tests, we will try the same for the US and the UK series. The N distribution remains an alternative hypothesis.

Table 4. Normality tests for the unfiltered (y t ) and filtered (z t ) series.
Series    AD    SW    KS    JB    χ 2   CVM   SF
y t ,US   ***   ***   ***   ***   ***   ***   ***
z t ,US   -     -     -     -     -     -     -
y t ,UK   ***   ***   ***   ***   ***   ***   ***
z t ,UK   -     -     -     -     -     -     -
y t ,CA   ***   ***   ***   ***   ***   ***   ***
z t ,CA   *     -     -     -     **    *     -
Notes: *, ** and *** represent significance at the 10%, 5% and 1% levels, respectively, for the null hypothesis of normality; - indicates no rejection at these levels. Seven widely used normality tests are reported, where AD, SW, KS, JB, χ 2 , CVM and SF are the Anderson-Darling, Shapiro-Wilk, Kolmogorov-Smirnov, Jarque-Bera, Pearson chi-square, Cramér-von Mises and Shapiro-Francia tests, respectively. Different measures are considered in these tests, and they can therefore lead to different conclusions.

Models for GDP growth
A mixture distribution is a probability density function (PDF) of the form
$$f(z) = \sum_{k=1}^{K} \lambda_k f_k(z).$$
Here, K is the number of components in the mixture distribution and λ k is the mixing weight of component k, with λ k ≥ 0 and the weights summing to one.
A non-negligible risk is involved when the distribution changes over time in long time series. The data might have passed through a number of different regimes, not completely eliminated by filter (2). Every such regime can follow a different distribution. The filtered US and UK GDP show a small hump in the right tail while the filtered CA shows it in the left tail in Figure 2, which may indicate that the data are characterized by at least two regimes. Given the relatively few observations, the number of possible regimes taken into account is here restricted to two. Moreover, the homoscedasticity test was unable to detect non-constancy of variances, which makes it hard to detect regimes with different variances.
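A weighted sum of component densities, as in the mixture form described above, can be evaluated as in the following sketch. The function names and the two-regime parameter values are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math

def normal_pdf(z, mu, sigma):
    """Density of N(mu, sigma^2) at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(z, weights, components):
    """Weighted sum of component densities; weights must be non-negative and sum to one."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing weights must be non-negative and sum to one")
    return sum(w * f(z) for w, f in zip(weights, components))

# Hypothetical two-regime example (K = 2); values are illustrative only.
regimes = [lambda z: normal_pdf(z, -1.0, 2.0), lambda z: normal_pdf(z, 0.5, 0.8)]
weights = [0.3, 0.7]
density_at_zero = mixture_pdf(0.0, weights, regimes)
```

Because each component integrates to one and the weights sum to one, the mixture is itself a proper density; different means and variances across the two regimes are what generate skewness and excess kurtosis.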
The PDF of the NM distribution is given in Appendix 1. It is possible to introduce excess kurtosis and asymmetry in the NM distribution by introducing different means and variances for the regimes. In empirical finance, NM distributions are widely used. Wirjanto and Xu [56] provided a selected review of recent developments and applications of the NM distribution in empirical finance. The NM distribution is able to capture leptokurtic, skewed and multimodal characteristics, and is flexible enough to accommodate various forms of continuous distribution in time series data. Kamaruzzaman et al. [23] found that the NM distribution captured the leptokurtosis as well as the skewness in the data, and they proposed a two-component NM distribution for financial time series. The NM distribution is suitable to accommodate certain discontinuities in stock returns such as the weekend effect, the turn-of-the-month effect and the January effect, see Klar and Meintanis [25].
The Laplace (L) distribution is also called the double exponential distribution. The L distribution is the distribution of differences between two independent variates with identical exponential distributions. The PDF of the L distribution is defined in Appendix 1. The L distribution has been used in many fields like electronics, engineering, finance, etc. [27]. The L distribution is symmetric around its mean (μ) with variance 2φ 2 and excess kurtosis k = 3. The L distribution has fatter tails than the normal distribution. It has, however, no separate shape parameter, which makes it rather inflexible. In particular, the excess kurtosis is restricted to the constant value 3, no matter what the kurtosis in the data. Table 2 shows that the excess kurtosis of the L distribution is too large for the filtered growth series in this study (k = 0.262 for the US, k = 0.046 for the UK and k = 0.000 for the CA). Clearly, the data cannot be explained by the L distribution alone. It is, however, possible to modify the L distribution by allowing it to have a second stochastic component. This means that its empirical counterpart is buried in Gaussian noise. We therefore combine the L distribution with a normal distribution using a weight parameter w. This mixture was introduced by Kanji [24] to model wind shear data.
The PDF of the Normal-Laplace (NL) mixture distribution is specified in Appendix 1. In the NL distribution the N and L components have the same mean. Jones and McLachlan [21] generalized the NL distribution and demonstrated that this may lead to an even better fit. Haas et al. [16] used an NL mixture in modelling and predicting financial risk based on 25 daily stock return series. The characteristics of the NL density are shown in Figure 3.
The L and NL mixture distributions in Figure 3 do not account for potential skewness in the data. McGill [37] proposed a suitable skewed generalization of the L distribution, an asymmetric Laplace (AL 1 ) distribution. The PDF of the AL 1 distribution is given in Appendix 1. The ML estimate of μ in AL 1 is the median. For ψ > φ this distribution is negatively skewed, and for ψ < φ it is positively skewed. The L distribution is the special case of AL 1 with φ = ψ. In AL 1 , ψ is the parameter of shocks weaker than the median and φ that of shocks stronger than the median.
In the last few decades, various forms and applications of AL distributions have appeared in the literature [29]. Kozubowski and Podgorski [30,31] used the AL distribution for modelling interest rates and currency exchange rates. Linden [34] demonstrated highly significant ψ and φ using the AL 1 distribution to model the returns of 20 stocks. A three-parameter AL distribution was fitted to flood data by Yu and Zhang [58]. Jayakumar and Kuttykrishnan [20] developed autoregressive models with the AL distribution for time series data. Julia and Vives-Rego [22] used the AL distribution in the field of microbiology to fit flow cytometric scatter data. Kozubowski and Nadarajah [29] reviewed 16 known variations of the Laplace distribution. They provided the basic mathematical properties, including the moments and ML estimators for each particular case, and discussed the areas of application with references.
An advantage of the AL distribution is that it, contrary to the L distribution, does not treat kurtosis as fixed. Interestingly, the AL distribution is at least as leptokurtic as the L distribution: its excess kurtosis varies between 3 (the value for the L distribution) and 6 (the value for the exponential distribution). Secondly, the AL 1 distribution is skewed (for φ ≠ ψ), which is another advantage. The flexibility of AL distributions thus comes from the ability to vary both asymmetry and kurtosis.
Because of the large leptokurtosis of the AL 1 distribution, Stockhammar and Öller [51] added Gaussian noise and were the first to use this mixture distribution on macroeconomic time series data. The basic assumption is that each shock is an independent drawing from either an N or an AL distribution. The probability density distribution of the filtered growth series (z t ) is given in Appendix 1 by a weighted sum of N and AL 1 random shocks. Note that, as in [21,24,51], equal medians and unequal variances are assumed for the components. This simplifies distributional comparisons, especially in the next paragraph when the Student's t distribution is introduced in the mixture.
The mixed normal-asymmetric Laplace-1 (NAL 1 ) distribution has a jump discontinuity at μ when φ ≠ ψ, see Figure 4. Looking at the smoothed empirical distributions in Figure 2, the discontinuity seems counterintuitive. However, the histograms in Figure 2 lend some support to a jump close to μ.
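The jump at μ can be made concrete with a small sketch. Since Appendix 1 is not reproduced here, the code assumes a two-piece exponential form for AL 1 with scale ψ below μ, scale φ above μ and half of the probability mass on each side, so that μ is the median. The weight w on the normal component and all function names are likewise assumptions for illustration.

```python
import math

def al1_pdf(z, mu, psi, phi):
    """Assumed AL 1 density: scale psi below mu, scale phi above mu, mass 1/2 on each side."""
    if z <= mu:
        return math.exp((z - mu) / psi) / (2.0 * psi)
    return math.exp(-(z - mu) / phi) / (2.0 * phi)

def normal_pdf(z, mu, sigma):
    """Density of N(mu, sigma^2) at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def nal1_pdf(z, w, mu, sigma, psi, phi):
    """Assumed weighted mixture: w * N(mu, sigma^2) + (1 - w) * AL 1(mu, psi, phi)."""
    return w * normal_pdf(z, mu, sigma) + (1.0 - w) * al1_pdf(z, mu, psi, phi)

def nal1_jump(w, psi, phi):
    """Right limit minus left limit of the mixture density at mu."""
    return (1.0 - w) * (1.0 / (2.0 * phi) - 1.0 / (2.0 * psi))
```

Under these assumptions the jump vanishes exactly when φ = ψ, and its size shrinks as the weight on the continuous normal component grows.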
The PDF of the Student's t distribution with location parameter μ, scale parameter σ and shape parameter ν (degrees of freedom) is given in Appendix 1. We introduce a new mixture by combining the Student's t distribution with AL 1 to decrease the excess kurtosis in AL 1 . To the best of the authors' knowledge, this distribution has not been used before for macroeconomic time series data.
Student's t distributions are symmetric, uni-modal, bell-shaped and leptokurtic distributions. The shape parameter determines the fatness of the tails; the excess kurtosis decreases as the degrees of freedom increase. We assume that each shock is an independent drawing from either a Student's t or an AL 1 distribution. The probability density distribution of the filtered growth series (z t ) is described in Appendix 1 by a weighted sum of Student's t and AL 1 random shocks.
In the mixed Student's t-Asymmetric Laplace-1 (TAL 1 ) distribution, as before, equal medians but unequal variances are assumed for the components. It has a jump discontinuity at μ when φ ≠ ψ, see Figure 4.
Stockhammar and Öller [51] used the convoluted version suggested by Reed and Jorgensen [48] for the mixture of N and AL 1 distributions. Instead of using the AL 1 parameterization in TAL 1 , they used the AL 2 distribution specified in Appendix 1.
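The role of the shape parameter can be illustrated with the standard location-scale Student's t density and its excess kurtosis, 6/(ν − 4) for ν > 4. The sketch below uses that textbook form, which may differ in notation from Appendix 1; the function names are hypothetical.

```python
import math

def student_t_pdf(z, mu, sigma, nu):
    """Location-scale Student's t density with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    u = (z - mu) / sigma
    return math.exp(log_norm - 0.5 * (nu + 1.0) * math.log1p(u * u / nu))

def student_t_excess_kurtosis(nu):
    """Excess kurtosis 6 / (nu - 4); this formula requires nu > 4."""
    if nu <= 4:
        raise ValueError("the formula requires nu > 4")
    return 6.0 / (nu - 4.0)
```

For example, ν = 5 gives an excess kurtosis of 6 and ν = 10 gives 1, so moderate degrees of freedom yield much less excess kurtosis than the Laplace value of 3.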
We have used the AL 2 distribution to make the weighted mixture of AL 2 with normal and Student's t distribution. We assume that each shock is an independent drawing from either the N or AL 2 distribution. The probability density distribution of weighted sum of N and AL 2 (NAL 2 ) is specified in Appendix 1. Similarly, we assume that each shock is an independent drawing from either the Student's t or AL 2 distribution. The probability density distribution of the filtered growth series (z t ) can then be described by a weighted sum of Student's t and AL 2 (TAL 2 ) as given in Appendix 1. Figure 5 shows a couple of examples of the NAL 2 and TAL 2 distributions.

Estimation and assessment of distributional accuracy
In this section, we use all six distributions in order to find out which one best fits the data. The parameters of all the distributions are estimated by the ML method. The ML estimates are obtained by numerical maximization of the log-likelihood of each distribution under parametric constraints.
For numerical maximization of the log-likelihood, we have used the Nelder and Mead [42] method. In practice the performance of the Nelder and Mead algorithm is generally good, see Wright [57] and Lagarias et al. [32].
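The estimation step can be sketched as follows for a two-component normal mixture. This is an illustration, not the authors' implementation: it uses a basic textbook Nelder-Mead simplex search written out in plain Python, enforces the parameter constraints by reparameterization (a logistic transform for the weight and logarithms for the standard deviations, which is an assumption), and fits simulated rather than GDP data.

```python
import math
import random

def nelder_mead(f, x0, step=0.1, tol=1e-10, max_iter=500):
    """Minimize f with a basic Nelder-Mead search (reflect, expand, contract, shrink)."""
    n = len(x0)
    simplex = [list(x0)] + [[v + (step if j == i else 0.0) for j, v in enumerate(x0)]
                            for i in range(n)]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink every vertex halfway towards the best one
                simplex = [best] + [[b + 0.5 * (v - b) for b, v in zip(best, p)]
                                    for p in simplex[1:]]
    return min(simplex, key=f)

def nm_neg_loglik(theta, data):
    """Negative log-likelihood of a two-component normal mixture.
    theta = (a, mu1, s1, mu2, s2), weight = 1/(1+exp(-a)), sigma_k = exp(s_k)."""
    a, mu1, s1, mu2, s2 = theta
    try:
        w = 1.0 / (1.0 + math.exp(-a))
        sd1, sd2 = math.exp(s1), math.exp(s2)
        c = math.sqrt(2.0 * math.pi)
        total = 0.0
        for z in data:
            d1 = math.exp(-0.5 * ((z - mu1) / sd1) ** 2) / (sd1 * c)
            d2 = math.exp(-0.5 * ((z - mu2) / sd2) ** 2) / (sd2 * c)
            total -= math.log(w * d1 + (1.0 - w) * d2)
        return total
    except (OverflowError, ZeroDivisionError, ValueError):
        return math.inf  # treat numerically invalid parameters as infinitely bad

# Simulated data from a hypothetical two-regime process (not GDP data).
random.seed(1)
data = [random.gauss(-1.0, 0.5) if random.random() < 0.3 else random.gauss(1.0, 1.0)
        for _ in range(200)]
start = [0.0, -0.5, 0.0, 0.5, 0.0]
fit = nelder_mead(lambda th: nm_neg_loglik(th, data), start)
```

Minimizing the negative log-likelihood is equivalent to maximizing the log-likelihood. Repeating such a fit on many samples simulated from the fitted model is one way to obtain simulation-based standard errors, although the exact simulation design used in the paper is not specified here.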
We numerically maximize the log-likelihood of the NM distribution and perform a simulation study to obtain the ML estimates and standard errors of the parameters, which are given in Table 5.
In the log-likelihood functions of the NAL 1 and TAL 1 distributions, Equations (4) and (5), I is the indicator function and the ML estimate of μ is the median of the AL 1 distribution. The ML estimates and standard errors of the parameters of the NAL 1 and TAL 1 distributions are obtained by numerical maximization of these log-likelihood functions and a simulation study. Table 6 shows that the Gaussian noise component dominates. In the UK series the estimate of ψ is much smaller than that of φ, which indicates that shocks weaker than the median have a smaller spread than above-median shocks. Together with a mean growth larger than zero, this ensures long-term economic growth. Table 7 shows that the Student's t noise component dominates for the US, UK and CA GDP series.
In the log-likelihood functions l(θ) of the NAL 2 and TAL 2 distributions, I is again the indicator function. The ML estimates and standard errors of the parameters of the NAL 2 and TAL 2 distributions are given in Tables 8 and 9. They are obtained by a simulation study and numerical maximization of the log-likelihood functions. Table 8 shows that the Gaussian noise component dominates in the US and UK series, whereas the AL 2 noise component dominates in the CA series. Table 9 shows that the Student's t component contributes more than the AL 2 part.
All accuracy measures, RMSE, MdAPE, sMdAPE and MASE, are defined in Appendix 2. For the RMSE, one thousand equidistant points on the horizontal axis are taken within the range of the data. Hence there are more points in regions where the densities run almost parallel to the x-axis, which gives these regions more weight. The sum in the expression of the RMSE is taken over the ordinates of these points. For the US data the peak to the left of the median significantly affects the RMSE. A lower value of the RMSE indicates a better fit. This scale-dependent measure is more sensitive to outliers.
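The grid-based RMSE described above can be sketched as follows. Appendix 2 is not reproduced here, so the code assumes the usual definition, the square root of the mean squared difference between the fitted density and a reference density (for instance a Kernel estimate) over m equidistant points; the function name is hypothetical.

```python
import math

def density_rmse(fitted_pdf, reference_pdf, lower, upper, m=1000):
    """RMSE between two densities over m equidistant points on [lower, upper]."""
    step = (upper - lower) / (m - 1)
    grid = [lower + i * step for i in range(m)]
    squared = sum((fitted_pdf(z) - reference_pdf(z)) ** 2 for z in grid)
    return math.sqrt(squared / m)
```

Here lower and upper would be the minimum and maximum of the filtered series, so that the grid spans the range of the data.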
Because of the advantage of being scale independent, percentage error measures are widely used to compare forecasting performance. However, these measures also have some disadvantages. They are undefined at f (z i ) = 0, and for values of f K (z i ) close to zero they have an extremely skewed distribution. The MdAPE measure is preferable to its close relative, the mean absolute percentage error (MAPE), because of this asymmetry, but both MAPE and MdAPE have the disadvantage that they put a heavier penalty on positive errors than on negative errors. This is the reason Makridakis [36] advocated so-called 'symmetric' measures. One of these is the sMdAPE, which is described in Appendix 2. Another commonly used measure is the MASE defined in Appendix 2. Hyndman and Koehler [19] showed that this measure is less sensitive to outliers and performs better for small samples than other measures. It is widely applicable and easily interpretable. They suggested that the MASE was the best available measure of forecast accuracy. All the above five measures are reported in Table 10. For the US series, the TAL 2 distribution using the parameter values in Table 9 is superior to the N, NM, NAL 1 , NAL 2 and TAL 1 distributions according to each measure. The TAL 2 fit is on average 12.0%, 13.4%, 37.2%, 6.9% and 37.4% better compared with the N, NM, NAL 1 , NAL 2 and TAL 1 distributions, respectively. Using the estimated parameters in Table 5, the NM distribution is superior to the other distributions for the UK GDP series according to all measures except the RMSE. The NM distribution shows on average a 12.9%, 27.7%, 20.1%, 32.2% and 17.0% better fit compared with the benchmark N distribution, NAL 1 , NAL 2 , TAL 1 and TAL 2 , respectively, for the UK GDP series. Finally, for the CA GDP series the NM shows on average a 50.6%, 47.9%, 53.9%, 48.4% and 51.7% improvement compared with the benchmark N distribution, NAL 1 , NAL 2 , TAL 1 and TAL 2 , respectively.
According to this numerical comparison, the US GDP series could be viewed as a sample from a TAL 2 distribution, whereas the UK and CA GDP series could be viewed as samples from the NM distribution, with parameter estimates in Tables 9 and 5, respectively.
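The two median-based percentage measures can be sketched as below. Because Appendix 2 is not shown, the code assumes the definitions of Hyndman and Koehler [19]: MdAPE is the median of 100|r − f|/|r| and sMdAPE is the median of 200|r − f|/(r + f), with r the reference value and f the fitted value. Applying them to density ordinates, and the function names, are assumptions for illustration.

```python
import statistics

def mdape(reference, fitted):
    """Median absolute percentage error, relative to the reference values."""
    return statistics.median(100.0 * abs(r - f) / abs(r)
                             for r, f in zip(reference, fitted))

def smdape(reference, fitted):
    """'Symmetric' median absolute percentage error (positive inputs assumed)."""
    return statistics.median(200.0 * abs(r - f) / (r + f)
                             for r, f in zip(reference, fitted))
```

Both functions fail with a division by zero when a reference ordinate is exactly zero (for sMdAPE, when reference and fitted are both zero), which mirrors the undefined cases mentioned in the text.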
Kernel estimation and goodness-of-fit tests are usually based on subjective choices both of function and of bandwidth. It is well established that tests based on either of these approaches have low power. We used the KS, AD, CVM, V, U 2 and χ 2 tests to evaluate how likely it was that the observed sample could have been generated from the distribution in question for the US, UK and CA GDP series. The χ 2 test is sensitive to the subjective choice of the number of bins and does not have much power. The KS, AD, CVM, V and U 2 goodness-of-fit tests are based on the empirical distribution function (EDF) and are often referred to as EDF tests. EDF tests are more powerful than the χ 2 goodness-of-fit test, see D'Agostino [8], Kotz and Nadarajah [28] and Famoye [13]. The AD and CVM are the most powerful tests among the EDF tests; see Kotz and Nadarajah [28] and Famoye [13].
All goodness-of-fit test statistics are defined in Appendix 3. The KS test statistic is defined as the maximum value of the absolute difference between the empirical CDF of the data and the theoretical CDF of the distribution. The AD test is a modification of the CVM test. This test gives more weight to the tails than the KS test. The V test is closely related to the KS test. This test is invariant under cyclic transformations of the independent variable and is as sensitive in the tails as at the median. Table 11 reports the p-values of the KS, AD, CVM, V, U 2 and χ 2 tests when testing the null hypotheses H 0,1 : y * ∼ N, H 0,2 : y * ∼ NM, H 0,3 : y * ∼ NAL 1 , H 0,4 : y * ∼ NAL 2 , H 0,5 : y * ∼ TAL 1 and H 0,6 : y * ∼ TAL 2 for the US, UK and CA series.
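The KS statistic as described above, the largest absolute gap between the empirical CDF and a hypothesized CDF, can be computed as in this sketch. The function names are hypothetical, only the statistic is shown (not its p-value), and the normal CDF serves merely as an example of a hypothesized distribution.

```python
import math

def normal_cdf(z, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2) at z."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """Maximum absolute difference between the EDF of the sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        F = cdf(x)
        # The EDF jumps from (i - 1)/n to i/n at x, so both gaps are checked.
        d = max(d, i / n - F, F - (i - 1) / n)
    return d
```

Any of the fitted distributions can be tested this way by passing its CDF with the estimated parameters plugged in, keeping in mind that standard critical values assume parameters that were not estimated from the same sample.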
The result presented in Table 11 clearly shows that, considering p-values of all the goodness-of-fit tests for the US GDP series, the TAL 2 fits better compared to other distributions, whereas NAL 2 has second best fit. For the UK GDP series, NM fits the data better compared to other distributions, except for CVM and U 2 test according to which TAL 2 fits the data best. Finally, for CA series NM fits the data better compared to other distributions according to all goodness-of-fit test except the χ 2 test.
Figure 6 shows that for the US GDP series the TAL 2 density is closer to the Kernel density than the other distributions, and that in the Q-Q plot the points are closer to the 45° line y = x. This confirms the results in Tables 10 and 11 that the TAL 2 distribution fits the US GDP data better than the other distributions. Figure 7 shows that the NM density is closer to the Kernel density than the other distributions. In the Q-Q plot the theoretical quantiles from NM, using the estimated parameters in Table 5, are close to the line y = x, indicating that NM fits the data better than the other distributions. This confirms the results shown in Tables 10 and 11.
In Figure 8, the NM density is closer to the Kernel density than the other distributions. The theoretical quantiles of the NM, using the ML estimates from Table 5, are closer to the reference line in the Q-Q plot, which indicates that the NM fits the data better. This supports the results in Tables 10 and 11.

Conclusions
The growth rate of GDP in the US, UK and Canada was found to exhibit heteroscedasticity, leptokurtosis (fat tails) and skewness (asymmetry around the mean). Because of this, the standard assumption of normality is not valid. In this paper we try to shed light on the true distribution of GDP growth. Heteroscedasticity was removed prior to the distributional comparison by using the filter proposed by Stockhammar and Öller [52].
The Laplace distribution and the asymmetric Laplace distribution are unable to explain the asymmetries and slight leptokurtic shape of the filtered series. A mixed Student's t-Asymmetric Laplace-2 (TAL 2 ) distribution is introduced. For the US GDP, which is more skewed and leptokurtic than the other series studied, the TAL 2 -distribution is shown to better describe the density distribution of growth than the N, NM, NAL 1 , NAL 2 , TAL 1 and L distributions. In the TAL 2 distribution, the Student's t distribution component was dominant. For the UK and Canada GDP series, where data were skewed but only slightly leptokurtic, the NM distribution showed better fit.
The TAL 2 implies a breakdown of the shocks into AL 2 and Student's t components, and NM implies a breakdown into two normally distributed components. The six parameters of TAL 2 and the five parameters of NM are able to describe the mean, variance, skewness and kurtosis of the data. The ML estimates of the parameters of the distributions were obtained by maximization of the log-likelihood using the Nelder and Mead method.
Because of the close distributional fit, the TAL 2 and NM distributions are suitable choices for density forecasting and should prove more correct than the Normal distribution. These distributions could also prove useful in density forecasting of any heteroscedastic, asymmetric and leptokurtic time series, not only GDP growth series.

The Kernel estimate is defined as
$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} k\!\left(\frac{x - x_i}{h}\right),$$
where k(·) is the Kernel function and h is the bandwidth parameter. In this study, we have used the Gaussian Kernel and the Silverman [50] Rule of Thumb bandwidth
$$\hat{h} = \left(\frac{4\hat{\sigma}^5}{3n}\right)^{1/5} \approx 1.059\,\hat{\sigma}\, n^{-1/5},$$
which is considered to be optimal when data are close to normal, as is the case here.
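A Gaussian Kernel estimate with the rule-of-thumb bandwidth can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) as the estimate of σ is an assumption.

```python
import math
import statistics

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth (4 * sigma^5 / (3 n))^(1/5)."""
    n = len(data)
    sigma = statistics.stdev(data)  # sample standard deviation (assumed estimator)
    return (4.0 * sigma ** 5 / (3.0 * n)) ** 0.2

def gaussian_kde(x, data, h):
    """Kernel density estimate at x with a Gaussian Kernel and bandwidth h."""
    n = len(data)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
    return total / (n * h * math.sqrt(2.0 * math.pi))
```

Evaluating gaussian_kde on a grid over the range of the filtered series gives the reference density against which fitted distributions can be compared.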