Likelihood ratio gradient estimation for Meixner distribution and Lévy processes

We address the problem of gradient estimation with respect to four characterizing parameters of the Meixner distribution and Lévy process. With the help of the explicit marginal probability density function, the likelihood ratio method is directly applicable, while unbiased estimators may contain infinite random series in their score function. We quantify the estimator bias arising when the infinite series is truncated to finite term. We further propose a substantially simple exact simulation method for the Meixner distribution, based on acceptance-rejection sampling and the Esscher density transform. Numerical results are presented in the context of financial Greeks to illustrate the effectiveness of our formulas along with bias estimates.

uniform random variables and evaluating the Meixner probability density function, without sampling directly from the density function.
The rest of this paper is organized as follows. Section 2 is devoted to a review of basic facts on the Meixner distribution and Lévy processes, which we use throughout this paper. In Sect. 3.1, we derive unbiased estimators for parameter sensitivity indices on the Meixner distribution and Lévy process. The derivation of our formulas entails rather lengthy proofs of somewhat routine nature. To avoid overloading the paper, we omit nonessential details in some instances. In Sect. 3.2, a simple exact simulation method for the Meixner distribution is proposed for Monte Carlo estimation purposes. Some numerical results are presented in Sect. 4 in the context of Monte Carlo estimation of financial Greeks to illustrate the effectiveness of our formulas along with bias bounds.

Preliminaries
Let us begin this preliminary section with the notations which will be used throughout the paper. We denote by R d the d-dimensional Euclidean space with the norm · , and |·| for one-dimension. We use the notation i := √ −1 and let N denote the collection of positive integers. We denote by the transpose. We fix ( , F, P) as our underlying probability space. We denote the Gamma function by (z) := +∞ 0 x z−1 e −x dx, where z is a complex number with a positive real part. We denote by ∼ the asymptotic equivalence up to constant.
This Lévy process is called the Meixner process. With the aid of (2.1), it holds that for each σ > 0 and t > 0, One of the remarkable properties of the Meixner process is its asymptotic behaviors with respect to observation time, just like normal inverse Gaussian processes, tempered stable processes of Rosiński (2007) and layered stable processes of Houdré and Kawai (2007). On the one hand, over short time intervals, it approximates a stable process; as h ↓ 0, a scaled Meixner process tends to a standard Cauchy (Lévy) process, where the convergence is in the weak sense of random processes in the space of càdlàg functions from [0, +∞) into R equipped with the Skorohod topology. In a long time frame, on the other hand, it is close to a Brownian motion; as h ↑ +∞, another scaled Meixner process approaches to the standard Brownian motion. These stable-type and Gaussian-type behaviors have long been considered to be very appealing in various applications, for example, to model turbulence in statistical physics, asset price dynamics in mathematical finance, network dynamics in transportation management and computer science, and animal movement patterns in mathematical biology. (Those can be proved rigorously in a similar manner to Houdré and Kawai (2007), Rosiński (2007)).

Main results
Throughout this section, we let X be a random variable such that L(X ) ∼ Meixner(α, β, δ, μ) and let be a measurable function mapping from R → R such that where E θ indicates the expectation taken under the Meixner(α, β, δ, μ) distribution. For convenience, we use the notation It is important that no regularity of the function is imposed, as we apply the likelihood ratio method.

Estimation formulas and error analysis
For each n ∈ N, define (3.5) Note that the condition (3.1) ensures that for each n ∈ N, the expectations (3.2)-(3.5) are all well defined. The following provides unbiased estimators for sensitivity index with bias estimates.
Theorem 3.1 It holds that for each n ∈ N, In particular, it holds that Proof Throughout the proof, we write x := (x − μ)/α. (Note that x depends on the parameters α and μ although we suppress those for ease of notation). The log-likeli-hood function is given by Recall that for z ∈ C with Re(z) > 0, the definition of the digamma function where z ∈ C with Re(z) > 0 and where γ := lim n↑+∞ ( n k=1 k −1 − ln n) ≈ 0.5772 the Euler-Mascheroni constant, and that for each a > 0 and b ∈ R, Since the density function f (x; θ) takes only real values, it holds that for each k = 0, 1, 2, With the help of those properties, a tedious yet straightforward computation yields where the second equality holds by the fact that f (x; θ) > 0 over R and the interchange of the derivative and the integration can be justified as follows. It holds by the Taylor theorem that for h ∈ R 4 such that h = 1 and a fixed constant ε > 0 satisfying θ + εh ∈ , where the last inequality holds by the condition (3.1) and the linear growth of the supremum in variable x. This justifies the interchange by the dominated convergence theorem.
Finally, upper and lower bounds for the infinite sums in (3.6) can be derived by an elementary calculus as follows. Observe that for each x ∈ R and n ∈ N, By evaluating the integral, we get In a similar manner, we get Note that for each n ∈ N, all the bounds are uniformly bounded in x over R. The proof is complete.
If the function is non-negative, more precise bias bounds can be derived.
Corollary 3.2 Under the setting of Theorem 3.1, if is non-negative, then it holds that for each n ∈ N, The above formulas can directly be applied to the so-called Greeks in mathematical finance. Let {X t : t ≥ 0} be a Meixner process such that L(X 1 ) = Meixner(α, β, δ, μ). Fix S 0 > 0, r ≥ 0, σ > 0 and T > 0. Define an asset price dynamics {S t : t ≥ 0} driven by this Meixner process by with {e −rt S t : t ≥ 0} being a martingale with respect to the natural filtration generated by {X t : t ≥ 0}. In mathematical finance, the sensitivity indices below are called delta, vega, theta and rho, respectively.

Corollary 3.3 Let Z be a random variable with
Proof With the help of the relation (2.4), we get where the random variable Z is defined as before. The rest is straightforward.

A simple simulation method
Given estimation formulas, such as (3.2)-(3.5), it remains to generate a sequence of iid random variables with common Meixner(α, β, δ, μ) distribution. The most straightforward approach seems to be the inversion method with the use of the explicit density function (2.2) (See, for example, Imai and Tan Imai and Tan (2009) for a numerical inversion method of the Meixner distribution in connection with the quasi-Monte Carlo method). An exact simulation method based on acceptance-rejection sampling is proposed in Grigoletto and Privasi Grigoletto and Provasi (2009). This acceptancerejection sampling method employs the Johnson translation system and requires some amount of initial numerical computations. Apart from general parameter settings, it holds that when (α, δ) = (1, 1/2) is fixed, This identity in law is very useful for simulation, where G 1 and G 2 are independent gamma random variables, respectively, with parameters (1/2 + β/(2π), 1) and (1/2 − β/(2π), 1) (See Yano et al. (2009)). Here, fixing δ = 1/2 is way too restrictive since in principle, this parameter corresponds to the time of the associated Lévy process (In contrast, fixing α = 1 is not at all, due to the scaling property (2.4)).
In this subsection, we propose a simple exact simulation method for arbitrary Meixner distributions. Our method consists of two components of acceptance-rejection sampling and the Esscher density transform. Let us begin with the acceptance-rejection sampling method. It holds by the well known result of Devroye (1981) and that Let U 1 and U 2 be iid uniform random variables on (−1, +1) and define V := √ C 2 (θ )/C 1 (θ )U 1 /U 2 . The random variable V admits a probability density (C 3 (θ )) −1 q(x; θ). Based on the above facts, we can employ a simulation method based on acceptance-rejection sampling as follows.
Step 3. If C 1 (θ )U < f (V ; θ), then exit with X (θ ) ← V . Otherwise, go to Step 1. This is an exact simulation algorithm, that is, L(X (θ )) = Meixner(α, β, δ, μ). The acceptance rate at Step 1 of Algorithm 1 is C 3 (θ ) −1 , while the expected number of times Step 1 is executed is thus C 3 (θ ). (See Kawai and Masuda (2011b) for an application of this algorithm to the tempered stable distribution). It however seems difficult to find the quantities C 1 (θ ) and C 2 (θ ) in closed form for arbitrary θ . The simulation method we propose avoids the initial numerical integration for the integrals (3.13).
In what follows, we fix α and δ arbitrarily, and use the notation θ (β,μ) := (α, β, δ, μ) for convenience. The following results play an essential role. In short, when the distribution is symmetric, that is β = 0, the above constants are more tractable.
Using those results, the algorithm is reformulated as follows. Observe that where q(x; θ (0,0) ) is defined in (3.12). As for Algorithm 1, the quantity C 5 (θ (0,0) ) −1 acts as the acceptance rate of our algorithm. Note that the constant C 5 (θ (0,0) ) depends only on δ. Define where U 1 and U 2 are again independent uniform random variables on (−1, +1). The algorithm is as follows.
The acceptance rate at Step 1 of Algorithm 2 is given by C 5 (θ (0,0) ) −1 and thus the expected number of times Step 1 is executed is C 5 (θ (0,0) ). It is discussed in Devroye (1981) that the quantity C 2 (θ ) in (3.13) is minimized when the random variable to be simulated is set centered. In fact, this applies implicitly to our case (β = 0) as well; the upper bound given in (3.14) is minimal with respect to μ. Clearly, fixing μ = 0 is not a restriction at all as the original location μ can easily be summed back afterwards. To illustrate how Algorithm 2 works, we provide Fig. 1. The key quantity is the acceptance rate C 5 (θ (0,0) ) −1 illustrated in the left figure. The acceptance rate takes the global maxima of 2 at δ = 1/2, while it tends monotonically to 0 as either δ ↓ 0 or δ ↑ +∞ (Recall that this acceptance-rejection sampling method is not quite of use when δ = 1/2 due to the convenient identity (3.11)). Next, as drawn in the center and the right, the distance between the true f (x; θ (0,0) ) and the (non-standardized) proposal p(x; θ (0,0) ) roughly tells the efficiency of the algorithm. Loosely speaking, the algorithm is efficient when those two are close.

Numerical illustrations
We provide illustrative numerical results to support our theoretical analysis. Rather than presenting exhaustive Monte Carlo experiments, we first focus on estimation of the delta given in (3.9) with an indicator function (x) = 1(e x > K ) for some K > 0, which corresponds to a digital option in the context of mathematical finance. We take this example first for more precise comparison purpose, as the sensitivity with respect to the initial state S 0 reduces to the closed form where θ = [θ 1 , θ 2 , θ 3 , θ 4 ] is given by (3.8). In contrast, we have obtained the estimation formula n (1(e x > K ); θ), due to Theorem 3.1 (i). Since (x) = 1(e x > K ) is non-negative, we can employ the bias bounds given in Corollary 3.2, that is, In other words, it holds that Note that C + n > 0, C − n < 0 and lim n↑+∞ C + n = lim n↑+∞ C − n = 0. Throughout this section, we fix the model parameters (α, β, δ) = (0.3977, −1.4940, 0.3462) and (σ, r, T, S 0 , K ) = (1, 0.05, 1.0, 100, 85). We draw in Fig. 2 Monte Carlo convergence in estimation of e −r T S 0 (4) n (1(e x > K ); θ) with n = 3, 5, 10. The quantities C + n and C − n are estimated by Monte Carlo simulation as well. All the Monte Carlo simulations are performed with Algorithm 2 and the Esscher transform (3.16). It can be observed that as n increases, the limiting value e −r T S 0 (4) n (1(e x > K ); θ) approaches to the true value e −r T S 0 f (ln K ; θ). As is often the case, the larger truncation n requires a more amount of computation in return for a smaller bias. It seems that the theoretical bounds are sufficiently tight at early stage of the truncation, where computational load is still reasonable.
Next, we test the formulas in Corollary 3.3 and draw Monte Carlo convergence in Fig. 3 for the standard European call option, more relevant in financial applications, under the same parameter setting (We do not consider the sensitivity (∂/∂r ) with respect to the risk-free rate, as it does not involve any of our results). In our formulation, this corresponds to (x) = max(e x − K , 0) =: (e x − K ) + . Observe that e x (1 ∨ |x|) f (x; θ ) ∼ x 2δ exp 1 − π +β α x , as x ↑ +∞, 0, as x ↓ −∞, where we have used the semi-heavy tails (2.3). Under our parameter setting, those tails are obviously integrable. Unlike the case of the digital option, it is difficult to derive explicit formulas for the sensitivities here. We thus estimate the true value through the standard central finite difference method, for instance, with some small positive ε. As the max function has a non-differentiable point and the perturbation ε is strictly positive, the above right hand side is an approximation (with some bias) of the left hand side. Here, we fix ε = 0.001. As earlier, the payoff function (x) is non-negative, we employ the bias bounds of Corollary 3.2 (In these instances, the upper bounds are very close to, or even exactly same as, the corresponding true values). Let us emphasize that our formula for (∂/∂ T ) is of particular interest, as the finite difference method for theta (∂/∂ T ) is significantly expensive in the sense that the two random variables X T +ε and X T −ε have to be generated separately, while the parameters S 0 and σ are purely exogenous.