A new method for estimating Sharpe ratio function via local maximum likelihood

The Sharpe ratio function is a commonly used risk/return measure in financial econometrics. To estimate this function, most existing methods take a two-step procedure that first estimates the mean and volatility functions separately and then applies the plug-in method. In this paper, we propose a direct method via local maximum likelihood to simultaneously estimate the Sharpe ratio function and the negative log-volatility function as well as their derivatives. We establish the joint limiting distribution of the proposed estimators, and moreover extend the proposed method to estimate the multivariate Sharpe ratio function. We also evaluate the numerical performance of the proposed estimators through simulation studies, and compare them with existing methods. Finally, we apply the proposed method to the three-month US Treasury bill data and show that it captures a well-known covariate-dependent effect on the Sharpe ratio.


Introduction
In financial analysis, the Sharpe ratio is one of the most popular measures of the risk-adjusted return, defined as the difference between the return of an investment and the risk-free return, divided by the volatility (or standard deviation) of the investment. The Sharpe ratio is now commonly used as a gold standard to compare different assets or trading strategies, where the one with a higher Sharpe ratio provides a better return for the same risk. Moreover, it has also been extended to many other contexts, including performance attribution, tests of market efficiency, and risk management [18,21]. However, a static Sharpe ratio with a constant standard deviation may oversimplify the risk due to serial correlation or the phases of the business cycle [18]. This motivates the covariate-dependent Sharpe ratio, also referred to as the Sharpe ratio function, which can provide extra evidence on the fundamental economics underlying the economy and asset pricing; see [16,17,19,25].
Model (1), the heteroscedastic non-parametric regression model $Y = \mu(X) + \sigma(X)\varepsilon$, is closely related to the following continuous-time diffusion process model:
$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t, \tag{2}$$
where $\mu(\cdot)$ and $\sigma(\cdot)$ are the drift and diffusion functions of the process $\{X_t\}$, respectively, and $W_t$ is the standard Brownian motion that is used to model the stochastic behavior of economic variables, including, for example, interest rates, exchange rates, and stock prices; see [1,2,23]. By employing a discrete-time approximation to the continuous-time process, model (2) reduces to model (1). In Section 7, we analyze the three-month US Treasury bill data from the secondary market. The three-month US Treasury bill rate is the yield received for investing in a government-issued treasury security that has a maturity of three months. The three-month Treasury yield sits on the shorter end of the yield curve and is important when looking at the overall US economy. The secondary market rates are annualized using a 360-day year of bank interest and quoted on a discount basis. Specifically, the rates are calculated as unweighted averages of closing bid rates quoted by at least five dealers in the secondary market. The rates are posted on a bank discount basis, but are converted into continuously compounded yields prior to analysis. Andersen and Lund [2] applied model (2) to analyze this dataset from 5 January 1962 to 31 March 1995, where $\mu$ and $\sigma$ are, respectively, the instantaneous expected rate of return and the volatility. We focus on estimating the covariate-dependent Sharpe ratio $f = \mu/\sigma$ in Section 7.
Suppose that $\{(Y_i, X_i) : i = 1, \ldots, n\}$ are independent and identically distributed (i.i.d.) data from model (1). To estimate the Sharpe ratio function $f(x)$, most existing methods take a two-step procedure that first estimates the mean function $\mu(x)$ and volatility function $\sigma^2(x)$ separately and then applies the plug-in method. Nevertheless, such a two-step procedure is often less efficient since two smoothing parameters have to be involved for estimating $\mu(x)$ and $\sigma^2(x)$ individually. In addition, when one is also interested in estimating the first- or higher-order derivative of $f(x)$, the indirect methods will often be difficult to implement. Recently, [16] proposed a direct maximum likelihood estimation with a roughness penalty for the Sharpe ratio function based on a parameterization of the likelihood in terms of $f(x) = \mu(x)/\sigma(x)$ and the inverse volatility function $1/\sigma(x)$ when the random error $\varepsilon$ is normally distributed. Moreover, by reparameterizing the volatility function as $g(x) = -\log\{\sigma(x)\}$, [17] proposed to estimate $f(x)$ and $g(x)$ iteratively based on the local linear method.
In this paper, we propose a general framework using the local maximum likelihood to jointly estimate $f(x)$ and $g(x)$ as well as their derivatives. Compared to [17], our new framework has several advantages. First, our joint estimation only needs one bandwidth for simultaneously estimating $f$ and $g$. Second, our proposed method can also be used to estimate the derivatives of the Sharpe ratio function, and it applies to non-Gaussian distributions of $\varepsilon$ as well. Third, by establishing the joint limiting distribution for the proposed estimators, our new framework can also optimally estimate the conditional mean function $\mu(x)$ and the conditional variance function $\sigma^2(x)$. In particular, the leading terms in the asymptotic bias and variance of our new estimator for $\sigma^2(x)$ will be the same as those in [30].
The rest of the paper is organized as follows. Section 2 describes the estimation procedure of $f(x)$ and $g(x)$ or their derivatives by combining the maximum likelihood estimation and the local polynomial smoothing. Section 3 establishes the joint limiting distribution for the local maximum likelihood estimators induced by the normally distributed random errors. Section 4 proposes new estimators for $\mu(x)$ and $\sigma^2(x)$ and derives their joint limiting distribution. Section 5 extends the proposed method to estimate the multivariate Sharpe ratio function. Section 6 conducts extensive simulations to assess the finite sample performance of the proposed estimators, and also discusses the rule of thumb and the leave-one-out cross-validation for the bandwidth selection. Section 7 analyzes the three-month US Treasury bill data using our new method. Section 8 concludes the paper, and the technical results, including the proofs of theorems, are given in the online supplemental material.

Estimation method
In this section, we propose to estimate the Sharpe ratio function $f(x)$ and the negative log-volatility function $g(x)$ simultaneously, where
$$f(x) = \mu(x)/\sigma(x) \quad\text{and}\quad g(x) = -\log\{\sigma(x)\}. \tag{3}$$
Note that the logarithm transformation in $g(x)$ removes the positivity constraint on the volatility function, as also adopted by [17,30,32]. Moreover, unlike most existing literature, we do not impose the normality assumption on the random errors, which is more realistic.
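For simplicity, let the density function of $\varepsilon$ be $\psi(\cdot)$. Accordingly, the density function of $\mu_\varepsilon + \sigma_\varepsilon\varepsilon$ is $\sigma_\varepsilon^{-1}\psi\{(y - \mu_\varepsilon)/\sigma_\varepsilon\}$ for any $\mu_\varepsilon \in \mathbb{R}$ and $\sigma_\varepsilon > 0$, by which the conditional density function of $Y$ given $X = x$ is
$$\frac{1}{\sigma(x)}\,\psi\!\left\{\frac{y - \mu(x)}{\sigma(x)}\right\} = \exp\{g(x)\}\,\psi\big[y\exp\{g(x)\} - f(x)\big],$$
where the right-hand side is reparameterized with $f$ and $g$ defined in (3). Moreover, the log-likelihood function of the sample $\{(Y_i, X_i) : i = 1, \ldots, n\}$ is
$$\sum_{i=1}^n \Big( g(X_i) + \log\psi\big[Y_i\exp\{g(X_i)\} - f(X_i)\big] \Big). \tag{4}$$
For each given point $x$, we use the local polynomial smoothing [10] to estimate the $\nu$th derivatives $f^{(\nu)}(x)$ and $g^{(\nu)}(x)$. Suppose that the $(p+1)$th derivatives of $f$ and $g$ exist at point $x$, where $p \ge \nu$. Then for any $X_i$ in a neighborhood of $x$, we can approximate $f(X_i)$ and $g(X_i)$ locally by
$$f(X_i) \approx \sum_{j=0}^p \beta_j (X_i - x)^j = \mathbf{X}_i^\top\beta \quad\text{and}\quad g(X_i) \approx \sum_{j=0}^p \gamma_j (X_i - x)^j = \mathbf{X}_i^\top\gamma,$$
where $\beta_j = f^{(j)}(x)/j!$, $\gamma_j = g^{(j)}(x)/j!$, and $\mathbf{X}_i = (1, X_i - x, \ldots, (X_i - x)^p)^\top$. Combining these approximations with Equation (4) leads to a local maximum likelihood problem that minimizes
$$\ell_n(\beta, \gamma) = -\sum_{i=1}^n \Big[ \mathbf{X}_i^\top\gamma + \log\psi\big\{Y_i\exp(\mathbf{X}_i^\top\gamma) - \mathbf{X}_i^\top\beta\big\} \Big] K_h(X_i - x), \tag{5}$$
with $h > 0$ as the bandwidth, and $K_h(\cdot) = K(\cdot/h)/h$ for a kernel function $K$. Let also $(\hat\beta, \hat\gamma) = \arg\min_{\beta,\gamma \in \mathbb{R}^{p+1}} \ell_n(\beta, \gamma)$. It then yields the local polynomial estimators of $f^{(\nu)}(x)$ and $g^{(\nu)}(x)$ as
$$\hat f^{(\nu)}(x) = \nu!\, e_{\nu+1}^\top\hat\beta \quad\text{and}\quad \hat g^{(\nu)}(x) = \nu!\, e_{\nu+1}^\top\hat\gamma,$$
where $e_{\nu+1}$ is the standard unit vector with 1 in the $(\nu+1)$th component and 0 elsewhere. For illustration, we also provide two examples with the error distribution given, and derive their negative local log-likelihood functions.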
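Example 2.1: When $\varepsilon$ follows the standard normal distribution, by ignoring additive constants, we have
$$\ell_n(\beta, \gamma) = \sum_{i=1}^n \Big[ \tfrac{1}{2}\big\{Y_i\exp(\mathbf{X}_i^\top\gamma) - \mathbf{X}_i^\top\beta\big\}^2 - \mathbf{X}_i^\top\gamma \Big] K_h(X_i - x), \tag{6}$$
with $\mathbf{X}_i = (1, X_i - x, \ldots, (X_i - x)^p)^\top$ and $K_h(\cdot) = K(\cdot/h)/h$.
Example 2.2: When $\varepsilon$ follows the Laplace distribution, i.e. $\psi(\varepsilon) = 2^{-1/2}\exp(-\sqrt{2}|\varepsilon|)$, by ignoring additive constants we have
$$\ell_n(\beta, \gamma) = \sum_{i=1}^n \Big[ \sqrt{2}\,\big|Y_i\exp(\mathbf{X}_i^\top\gamma) - \mathbf{X}_i^\top\beta\big| - \mathbf{X}_i^\top\gamma \Big] K_h(X_i - x). \tag{7}$$
In the remainder of the paper, for simplicity, we will only consider the estimates of $f^{(\nu)}(x)$ and $g^{(\nu)}(x)$ obtained by minimizing the negative local log-likelihood (6), i.e. the case where $\varepsilon$ is normally distributed. Future research is needed for the joint estimation that minimizes (5) for a general error distribution or (7) for the Laplace error distribution.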
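To make the normal-error case of Example 2.1 concrete, the following is a minimal Python sketch of the local linear fit ($p = 1$) at a single point $x$. It is an illustration under stated assumptions rather than the implementation used in the paper: the function names `local_linear_sharpe` and `solve2`, the Epanechnikov kernel, and the alternating scheme (an exact weighted least squares update for $\beta$ followed by a damped quasi-Newton step for $\gamma$) are all choices made here for brevity.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def solve2(a11, a12, a22, b1, b2):
    """Solve the symmetric system [[a11, a12], [a12, a22]] z = (b1, b2)."""
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

def local_linear_sharpe(x, X, Y, h, max_iter=500, tol=1e-14):
    """Minimize the normal negative local log-likelihood with p = 1 at x.

    Hypothetical helper: needs at least two distinct X values inside the
    kernel window, otherwise the 2x2 systems are singular.
    """
    # v = (X_i - x) / h rescales the slope so both coefficients are O(1).
    pts = [((Xi - x) / h, Yi, epanechnikov((Xi - x) / h) / h)
           for Xi, Yi in zip(X, Y) if abs(Xi - x) < h]

    def objective(b, c):
        return sum(w * (0.5 * (y * math.exp(c[0] + c[1] * v) - b[0] - b[1] * v) ** 2
                        - (c[0] + c[1] * v)) for v, y, w in pts)

    def beta_step(c):
        # For fixed gamma the objective is weighted least squares of
        # Z_i = Y_i exp(g_i) on (1, v_i), so beta has a closed form.
        s0 = s1 = s2 = t0 = t1 = 0.0
        for v, y, w in pts:
            z = y * math.exp(c[0] + c[1] * v)
            s0 += w; s1 += w * v; s2 += w * v * v
            t0 += w * z; t1 += w * v * z
        return solve2(s0, s1, s2, t0, t1)

    c = (0.0, 0.0)
    b = beta_step(c)
    val = objective(b, c)
    for _ in range(max_iter):
        # Gradient in gamma and a positive-definite curvature surrogate.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for v, y, w in pts:
            z = y * math.exp(c[0] + c[1] * v)
            r = z - b[0] - b[1] * v
            d = w * (r * z - 1.0)
            k = w * (z * z + 1.0)
            g0 += d; g1 += d * v
            h00 += k; h01 += k * v; h11 += k * v * v
        d0, d1 = solve2(h00, h01, h11, -g0, -g1)
        step = 1.0
        while step > 1e-10:  # backtracking: accept only a decrease
            c_new = (c[0] + step * d0, c[1] + step * d1)
            b_new = beta_step(c_new)
            new_val = objective(b_new, c_new)
            if new_val < val:
                break
            step *= 0.5
        else:
            break  # no improving step was found
        improved = val - new_val
        b, c, val = b_new, c_new, new_val
        if improved < tol:
            break
    # Intercepts estimate f(x) and g(x); slopes / h estimate f'(x) and g'(x).
    return {"f": b[0], "g": c[0], "df": b[1] / h, "dg": c[1] / h}
```

For instance, `local_linear_sharpe(0.5, X, Y, 0.2)["f"]` gives the fitted Sharpe ratio at $x = 0.5$ with bandwidth $0.2$. A general-purpose optimizer over $(\beta, \gamma)$ would serve equally well; the alternating scheme is used only to keep the sketch dependency-free.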
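For the special case of $\nu = p = 0$, one can apply the local constant smoothing to estimate the Sharpe ratio function, and the resulting estimator is $\hat f_{NW}(x) = \hat\beta_0$, where
$$(\hat\beta_0, \hat\gamma_0) = \arg\min_{\beta_0, \gamma_0 \in \mathbb{R}} \sum_{i=1}^n \Big\{ \tfrac{1}{2}\big(Y_i e^{\gamma_0} - \beta_0\big)^2 - \gamma_0 \Big\} K_h(X_i - x).$$
By simple calculations, the above optimization problem yields an explicit solution as $\hat f_{NW}(x) = \hat\mu_{NW}(x)/\hat\sigma_{NW}(x)$, where
$$\hat\mu_{NW}(x) = \frac{\sum_{i=1}^n K_h(X_i - x)\,Y_i}{\sum_{i=1}^n K_h(X_i - x)} \quad\text{and}\quad \hat\sigma^2_{NW}(x) = \frac{\sum_{i=1}^n K_h(X_i - x)\,\{Y_i - \hat\mu_{NW}(x)\}^2}{\sum_{i=1}^n K_h(X_i - x)}.$$
Note that $\hat\mu_{NW}(x)$ is the Nadaraya-Watson estimator of $\mu(x)$, and $\hat\sigma^2_{NW}(x)$ is the estimator of $\sigma^2(x)$ proposed by [14]. This shows that the direct estimator $\hat f_{NW}(x)$ is a special two-step estimator in which the estimates of $\mu(x)$ and $\sigma^2(x)$ use a common bandwidth and a common kernel function.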
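The closed form of the local constant case can be sketched in a few lines of Python. This is a minimal illustration, assuming an Epanechnikov kernel; the function name `sharpe_nw` is hypothetical. The estimate is the kernel-weighted mean divided by the kernel-weighted standard deviation, both computed with the same bandwidth.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def sharpe_nw(x, X, Y, h):
    """Local constant Sharpe ratio estimate mu_NW(x) / sigma_NW(x).

    Returns NaN when no observation falls in the kernel window or the
    weighted variance is not positive.
    """
    weights = [epanechnikov((Xi - x) / h) for Xi in X]
    total = sum(weights)
    if total <= 0.0:
        return math.nan
    # Nadaraya-Watson mean and kernel-weighted variance, common bandwidth h.
    mu = sum(w * y for w, y in zip(weights, Y)) / total
    var = sum(w * (y - mu) ** 2 for w, y in zip(weights, Y)) / total
    if var <= 0.0:
        return math.nan
    return mu / math.sqrt(var)
```

Because the normalizing factor $1/h$ in $K_h$ cancels in both ratios, the sketch uses the unscaled kernel weights.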

Asymptotic properties
In this section, we derive the joint limiting distribution of $\hat f^{(\nu)}(x)$ and $\hat g^{(\nu)}(x)$ obtained by minimizing the negative local log-likelihood (6). For ease of presentation, we first introduce some notation. Let $\phi(\cdot)$ be the marginal density function of the covariate $X$. Let where The proof of Theorem 3.1 is given in the Supplemental Material. When $p - \nu$ is an even number, Theorem 3.1 still holds. But in this case, we have $e_{\nu+1}^\top S^{-1} c_p = 0$ owing to the symmetry of the kernel $K$, so that a term of order $O(h^{p+2-\nu})$ will appear in the asymptotic bias. For details, see Theorem 3.1 in [10]. As for choosing the order $p$ of the polynomial, [10] further demonstrated that a polynomial with an odd $p - \nu$ is better than one with an even $p - \nu$. In view of this, we will focus only on an odd $p - \nu$ in the remainder of the paper, in particular on $p = \nu + 1$. Moreover, Condition (C4) guarantees that the proposed estimation and Theorem 3.1 hold regardless of whether the random error is normally distributed. In the special case when $\varepsilon \sim N(0, 1)$, $\Sigma(x)$ can be simplified as In practice, it is often of particular interest to estimate the Sharpe ratio function $f(x)$. Specifically, for $\nu = 0$, we can apply the local linear fitting with $p = 1$. Let the local linear estimators of $f(x)$ and $g(x)$ be $\hat f(x)$ and $\hat g(x)$, respectively. Then by Theorem 3.1, we have the following corollary.

Corollary 3.2:
Under Conditions (C1)-(C4) with $p = 1$, if $nh^7 \to 0$ and $nh \to \infty$, then for any interior point $x$ from the support of $\phi(\cdot)$, we have Corollary 3.2 provides the joint limiting distribution of $\{\hat f(x), \hat g(x)\}$, which exhibits the correlation between the estimated Sharpe ratio and volatility functions. In contrast, Theorem 1 in [17] only provides the marginal limiting distributions for the two estimators. Moreover, the asymptotic variances of $\hat f(x)$ and $\hat g(x)$ are also different from those in [17], mainly because $\hat f(x)$ and $\hat g(x)$ share the same bandwidth rather than using two different bandwidths as in [17].
Note also that the asymptotic bias and variance of $\hat f^{(\nu)}(x)$ do not depend on $g(x)$ and its derivatives. This implies that $g(x)$ can be regarded as a nuisance function when estimating $f(x)$ and its derivatives. Define the equivalent kernel by $K^*_\nu(u) = e_{\nu+1}^\top S^{-1}(1, u, \ldots, u^p)^\top K(u)$. Note that $(-1)^\nu K^*_\nu$ is a kernel of orders $(\nu, p+1)$ as defined by [12]. It can be readily shown that $e_{\nu+1}^\top S^{-1} c_p = \int u^{p+1} K^*_\nu(u)\,du$ and $e_{\nu+1}^\top S^{-1} S^* S^{-1} e_{\nu+1} = \int K^{*2}_\nu(u)\,du$. Also, by Theorem 3.1, the asymptotic mean squared error (AMSE) of where .
Moreover, one may also consider a constant bandwidth by minimizing the asymptotic mean integrated squared error AMISE( dx with a weight function $w \ge 0$. Accordingly, it yields the optimal constant bandwidth as Finally, as claimed in [24], the proposed local polynomial estimator $\hat f^{(\nu)}(x)$ with bandwidth $h = h_{opt}(x)$ or $h_{opt}$ enjoys the optimal rate of convergence $n^{-(p+1-\nu)/(2p+3)}$.
In other words, our proposed estimator for the higher-order derivative may be unreliable or unstable due to the slow convergence rate, especially when $n$ is small. In view of this, we will apply the regression bootstrap [13] to construct the point-wise confidence band for $f(x)$ in Section 7. Also, as mentioned in Section 8, an interesting future direction is to construct a simultaneous confidence band for $f(x)$.
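The effect of the derivative order on the rate $n^{-(p+1-\nu)/(2p+3)}$ can be tabulated directly. The short Python sketch below is only an arithmetic illustration; the helper names `rate_exponent` and `sample_size_ratio` are hypothetical.

```python
from fractions import Fraction

def rate_exponent(p, nu):
    """Exponent r in the optimal rate n^{-r}, with r = (p + 1 - nu) / (2p + 3)."""
    if nu > p:
        raise ValueError("the polynomial order p must be at least nu")
    return Fraction(p + 1 - nu, 2 * p + 3)

def sample_size_ratio(p, nu, factor=2.0):
    """Factor by which n must grow for the rate n^{-r} to shrink by `factor`."""
    return factor ** (1.0 / float(rate_exponent(p, nu)))
```

With the recommended choice $p = \nu + 1$, the exponent is $2/5$ for the function itself ($\nu = 0$), $2/7$ for the first derivative, and $2/9$ for the second derivative, so each additional derivative order requires a noticeably larger sample to reach the same accuracy.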

Simultaneous estimation of $\mu(\cdot)$ and $\sigma^2(\cdot)$
As a by-product, the proposed estimation in Section 2 can also be used to estimate the conditional mean function $\mu(x)$ and the conditional variance function $\sigma^2(x)$, although they are not the main focus of this paper. Specifically, noting that $\mu(x) = f(x)\exp\{-g(x)\}$ and $\sigma(x) = \exp\{-g(x)\}$, we can estimate them by
$$\hat\mu(x) = \hat f(x)\exp\{-\hat g(x)\} \quad\text{and}\quad \hat\sigma^2(x) = \exp\{-2\hat g(x)\}.$$
The following theorem establishes the joint limiting distribution of $\hat\mu(x)$ and $\hat\sigma^2(x)$.
Theorem 4.1: Assume that the second derivatives of $\mu(x)$ and $\sigma^2(x)$ exist and are continuous in a neighborhood of an interior point $x$ from the support of $\phi(\cdot)$. Then under Conditions (C2)-(C4), if $nh^7 \to 0$ and $nh \to \infty$, we have where .
The proof of Theorem 4.1 is given in the Supplemental Material. In the special case when $E(\varepsilon^3) = 0$, e.g. when the error density is symmetric about zero, Theorem 4.1 shows that $\hat\mu(x)$ and $\hat\sigma^2(x)$ are asymptotically independent. Theorem 4.1 also provides the asymptotic biases and variances of $\hat\mu(x)$ and $\hat\sigma^2(x)$. Specifically, for $\hat\mu(x)$, the leading term in the asymptotic variance is which is the same as that of the local linear estimator for $\mu(x)$ in [10]. For $\hat\sigma^2(x)$, we compare it with the residual-based estimator of $\sigma^2(x)$ proposed by [11]. From Theorem 4.1, the leading terms in the asymptotic bias and variance are, respectively, $\mathrm{bias}\{\hat\sigma^2(x)\} : -\sigma^2(x)g''(x)\mu_2 h^2$ and $\mathrm{Var}\{\hat\sigma^2(x)\}$ : 1 nh which are the same as those in [30]. Let also $\hat\mu_{LL}(x)$ be the local linear estimator of $\mu(x)$ with a bandwidth $h_1 > 0$ and $\hat r_i = \{Y_i - \hat\mu_{LL}(X_i)\}^2$. Then by [11], the estimator of $\sigma^2(x)$ is given as $\hat\sigma^2_{FY}(x) = \hat\alpha_0$, where
$$(\hat\alpha_0, \hat\alpha_1) = \arg\min_{\alpha_0, \alpha_1} \sum_{i=1}^n \{\hat r_i - \alpha_0 - \alpha_1(X_i - x)\}^2 K_{h_2}(X_i - x), \tag{11}$$
with a second bandwidth $h_2 > 0$. Moreover, it also follows from [11] that which shows that the two estimators $\hat\sigma^2(x)$ and $\hat\sigma^2_{FY}(x)$ have exactly the same asymptotic variance.
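For comparison purposes, the residual-based estimator of [11] described above (a local linear fit of the squared residuals from a first local linear fit of the mean) can be sketched as follows. This is a minimal illustration, not the reference implementation: the function names `local_linear` and `sigma2_fan_yao`, the Epanechnikov kernel, and the symbol `h2` for the second bandwidth are assumptions made here.

```python
def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def local_linear(x, X, Z, h):
    """Kernel-weighted least squares line for Z around x.

    Returns (a0, a1): a0 estimates the regression function at x and a1 its
    first derivative. Needs two distinct X values in the kernel window.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for Xi, Zi in zip(X, Z):
        u = Xi - x
        w = epanechnikov(u / h)
        s0 += w; s1 += w * u; s2 += w * u * u
        t0 += w * Zi; t1 += w * u * Zi
    det = s0 * s2 - s1 * s1
    return ((s2 * t0 - s1 * t1) / det, (s0 * t1 - s1 * t0) / det)

def sigma2_fan_yao(x, X, Y, h1, h2):
    """Two-step residual-based estimate of sigma^2(x) and of its derivative."""
    # Step 1: squared residuals from a local linear fit of the mean (bandwidth h1).
    resid2 = [(Yi - local_linear(Xi, X, Y, h1)[0]) ** 2 for Xi, Yi in zip(X, Y)]
    # Step 2: local linear fit of the squared residuals (bandwidth h2).
    return local_linear(x, X, resid2, h2)
```

The first returned value plays the role of $\hat\alpha_0$ and the second that of $\hat\alpha_1$. Unlike the exponential reparameterization $\exp\{-2\hat g(x)\}$, nothing in this two-step construction forces the first value to be positive.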

Multiple covariate case
This section extends the proposed method in Section 2 to the case of multiple covariates. Let $X_i = (X_{i1}, \ldots, X_{id})^\top$ be the $d$-dimensional covariate vector with the marginal density function $\phi$. We consider the multivariate non-parametric regression model
$$Y_i = \mu(X_i) + \sigma(X_i)\varepsilon_i, \tag{12}$$
where $\varepsilon_i$ are i.i.d. random variables with zero mean and unit variance and are independent of $X_i$. For model (12), the existing literature has mainly focused on the estimation of $\mu(\cdot)$ and/or $\sigma^2(\cdot)$. To name a few, [7] proposed a tensor-product polynomial spline estimator for $\mu(\cdot)$ and showed that it achieves the optimal rate of convergence as defined in [24]; [20] proposed a local linear kernel-weighted least squares estimator for $\mu(\cdot)$; and [6] established a minimax rate of convergence for estimating $\sigma^2(\cdot)$ and showed that it can be achieved by a difference-based estimator.
In what follows, we estimate the multivariate Sharpe ratio function $f(x) = \mu(x)/\sigma(x)$ based on the sample data $\{(Y_i, X_i) : i = 1, \ldots, n\}$. Let $K$ be a $d$-variate non-negative kernel function, and define $K_B(u) = |B|^{-1}K(B^{-1}u)$, where $B$ is a non-singular $d \times d$ bandwidth matrix and $|B|$ represents its determinant. For simplicity, one can take, for example, a diagonal bandwidth matrix $B = \mathrm{diag}\{h_1, \ldots, h_d\}$ with $h_i > 0$, $i = 1, \ldots, d$. We further use the local linear smoothing to estimate $f(x)$ and the nuisance function $g(x) = \log\{1/\sigma(x)\}$ simultaneously. When $\varepsilon_i$ follows the standard normal distribution, by (6) we can define the local linear estimators of $f(x)$ and $g(x)$ as $\hat f(x) = \hat a_0$ and $\hat g(x) = \hat b_0$, respectively, where
$$(\hat a_0, \hat a_1, \hat b_0, \hat b_1) = \arg\min_{a_0, b_0 \in \mathbb{R};\, a_1, b_1 \in \mathbb{R}^d} \sum_{i=1}^n \Big[ \tfrac{1}{2}\big\{Y_i\exp\{b_0 + b_1^\top(X_i - x)\} - a_0 - a_1^\top(X_i - x)\big\}^2 - b_0 - b_1^\top(X_i - x) \Big] K_B(X_i - x).$$
As a potential application, [25] considered the estimation of the conditional Sharpe ratio for the market returns. In their analysis, both the mean and volatility functions of stock market returns were modeled as functions of four predetermined financial variables, including the Baa-Aaa spread, the commercial paper-Treasury spread, the one-year Treasury yield, and the dividend yield. The collected data were monthly and covered the period from April 1953 to November 2010. For more details, see also Section 3.1 of [25]. Lastly, to establish the joint limiting distribution of the estimators $\hat f(x)$ and $\hat g(x)$, we need the following regularity conditions.
where $H_f(x)$ and $H_g(x)$ are the Hessian matrices of $f$ and $g$ evaluated at $x$, respectively, $\nu_0(K) = \int K^2(u)\,du$, and .
The proof of Theorem 5.1 is given in the Supplemental Material. When the covariate is univariate with $d = 1$, Theorem 5.1 reduces to Corollary 3.2. We note, however, that due to the curse of dimensionality, estimating the multivariate Sharpe ratio function may require a large sample size, especially when $d$ is large.
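As a small illustration of the multivariate kernel weights with a diagonal bandwidth matrix $B = \mathrm{diag}\{h_1, \ldots, h_d\}$, the Python sketch below evaluates $|B|^{-1}K(B^{-1}u)$ for a product Epanechnikov kernel. The product form of $K$ and the function name `product_kernel_weight` are assumptions made here; the paper only requires a compactly supported multivariate density satisfying its moment conditions.

```python
def epanechnikov(u):
    """Univariate Epanechnikov kernel K(u) = 0.75 (1 - u^2) on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def product_kernel_weight(u, bandwidths):
    """Evaluate |B|^{-1} K(B^{-1} u) for B = diag(bandwidths).

    K is taken to be the product of univariate Epanechnikov kernels, so the
    weight factorizes across the d coordinates of u.
    """
    if len(u) != len(bandwidths):
        raise ValueError("u and bandwidths must have the same length")
    weight = 1.0
    for uj, hj in zip(u, bandwidths):
        weight *= epanechnikov(uj / hj) / hj
    return weight
```

The number of observations receiving a positive weight shrinks roughly like the product $h_1 \cdots h_d$, which is one way to see the curse of dimensionality noted above.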

Bandwidth selection
Bandwidth selection is an important issue in local polynomial modeling to balance the trade-off between the estimation bias and variance. For the bandwidth $h$ in $\hat f^{(\nu)}(x)$, we follow the same rule of thumb (ROT) as recommended by [10,30]. The idea is to substitute estimates for the unknown quantities in the expression of the asymptotically optimal constant bandwidth (9). Specifically, we first fit two polynomials of order $p + 3$ globally for $f(x)$ and $g(x)$ using the log-likelihood function (4), which yield the fitted curves as Taking $\varepsilon \sim N(0, 1)$ and $w(x) = \phi(x)w_0(x)$ for some specific function $w_0$, the two terms $\int \Sigma_{11}(x)w(x)/\phi(x)\,dx$ and $\int \{f^{(p+1)}(x)\}^2 w(x)\,dx$ can be estimated by $\int \{0.5\check f^2(x) + 1\}w_0(x)\,dx$ and $n^{-1}\sum_{i=1}^n \{\check f^{(p+1)}(X_i)\}^2 w_0(X_i)$, respectively. Finally, by (9) we have the ROT bandwidth as .
When estimating the Sharpe ratio function $f(x)$, an alternative method for selecting the bandwidth is to apply the leave-one-out cross-validation (LOOCV) as in [17]. Specifically, the LOOCV bandwidth is given as ) are the estimators of $f$ and $g$ based on the bandwidth $h$ with the $i$th observation excluded. However, the LOOCV cannot be used to select the bandwidth for the derivatives of $f(x)$. In what follows, we compare the impact of the two bandwidths on the estimation of the Sharpe ratio function.
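The leave-one-out idea can be sketched in Python as follows. Two simplifications are assumptions of this sketch and not taken from the paper or from [17]: the criterion is the average normal negative log-likelihood of each held-out observation, and the leave-one-out fits are local constant rather than local linear. The function names are hypothetical.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def loo_local_constant(i, X, Y, h):
    """Local constant estimates (f, g) at X[i] without the i-th observation."""
    pairs = [(epanechnikov((Xj - X[i]) / h), Yj)
             for j, (Xj, Yj) in enumerate(zip(X, Y)) if j != i]
    total = sum(w for w, _ in pairs)
    if total <= 0.0:
        return None  # no neighbour inside the kernel window
    mu = sum(w * y for w, y in pairs) / total
    var = sum(w * (y - mu) ** 2 for w, y in pairs) / total
    if var <= 0.0:
        return None
    # f = mu / sigma and g = -log(sigma) = -0.5 log(var)
    return mu / math.sqrt(var), -0.5 * math.log(var)

def loocv_score(X, Y, h):
    """Average held-out normal negative log-likelihood (assumed criterion)."""
    score = 0.0
    for i, Yi in enumerate(Y):
        fit = loo_local_constant(i, X, Y, h)
        if fit is None:
            return math.inf  # bandwidth too small to predict some point
        f, g = fit
        score += 0.5 * (Yi * math.exp(g) - f) ** 2 - g
    return score / len(Y)

def select_bandwidth(X, Y, grid):
    """Return the candidate bandwidth in `grid` with the smallest score."""
    return min(grid, key=lambda h: loocv_score(X, Y, h))
```

A call such as `select_bandwidth(X, Y, [0.1, 0.2, 0.4])` returns the grid value with the smallest score; bandwidths so small that some observation has no neighbours are assigned an infinite score and are never selected.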

Simulations for the Sharpe ratio function
In this section, we conduct simulation studies to investigate the finite sample performance of the proposed estimator for the Sharpe ratio function and compare it with existing methods. For a fair comparison, we consider the same two models as studied by [11,16,28]. To evaluate the estimation accuracy of different estimators, we also define the root integrated squared error (RISE) of $\hat f(x)$ as
Example 6.1: Following [11,16], we simulate 100 random samples of size $n$ from the model where $\varepsilon \sim N(0, 1)$. Accordingly, the Sharpe ratio function is We also consider four different sample sizes, $n = 100$, 200, 350, or 500, and four different values of the coefficient, $a = 0.5$, 1, 2, or 4. For each setting, we estimate $f(x)$ by the local linear fitting as described in Section 2 with the Epanechnikov kernel and the ROT or LOOCV bandwidth proposed in Section 6.1. We then compare the new estimator with the residual-based estimator [11], the difference-based estimator [3, r = 1], and the joint estimator [16]. For the latter three, we use the R and MATLAB codes available at https://github.com/won-j/joint_estim provided by [16]. The mean and standard deviation of RISEs for the five estimators are summarized in Table 1. It is worth noting that there are some discarded values (NaNs) in computing the RISEs for the residual- and difference-based methods, since these two methods do not impose the non-negativity constraint on the estimated variance functions. From Table 1, for each combination of $a$ and an estimator of $f(x)$, the mean and standard deviation of RISE decrease as $n$ increases. For each combination of $n$ and an estimator of $f(x)$, the mean and standard deviation of RISE increase as $a$ increases. When $a = 0.5$ or $a = 1$, the two new estimators perform similarly regardless of which bandwidth is used. When $a = 2$ or $a = 4$, the estimator with the ROT bandwidth performs better than that with the LOOCV bandwidth.
It is also evident that the new and joint estimators perform better than the residual- and difference-based estimators in all settings. In addition, our new estimator significantly outperforms the joint estimator when $a$ is large and $n$ is small, whereas for other settings the two estimators are comparable. To conclude, our proposed estimator is competitive with the other three estimators and thus can be recommended for practical use.
Example 6.2: Following [16,28], we simulate 100 random samples of size $n$ from the model where $X \sim \mathrm{Unif}[0, 1]$ and $\varepsilon \sim N(0, 1)$. Thus, the Sharpe ratio function is given by $f(x) = 0.75\sin(b\pi x)/\sqrt{(x - 0.5)^2 + 0.5}$. We also consider four sample sizes $n = 50$, 100, 200, 500, and four different values of $b$: 0, 4, 10, and 20. As $b$ increases, the Sharpe ratio function $f(x)$ gets rougher and thus estimating it becomes more and more difficult. For each setting, we then compare the proposed estimator with the three existing ones as in Example 6.1. Table 2 presents the mean and standard deviation of RISEs for the five estimators. For each combination of $b$ and an estimator of $f(x)$, the mean and standard deviation of RISE decrease as $n$ increases.
For each combination of $n$ and an estimator of $f(x)$, the mean and standard deviation of RISE increase as $b$ increases, which confirms that estimating $f(x)$ becomes harder as $b$ increases.
The two new estimators perform similarly regardless of which bandwidth is used. Note also that the new and joint estimators perform better than the residual- and difference-based estimators in all settings except for the combinations of $n = 50$ and $b = 4$, 10, or 40. Moreover, our new estimator is always the best when $b = 0$.
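The accuracy measure used in these comparisons can be approximated numerically. The Python sketch below assumes that the RISE is the square root of the integrated squared difference between the estimate and the truth over the covariate range, evaluated by the trapezoidal rule on an equally spaced grid; the grid size, the integration range, and the function names are assumptions of this sketch. The true function of Example 6.2 is coded with a square root in the denominator, i.e. as the mean function divided by the volatility.

```python
import math

def sharpe_example_62(x, b):
    """Sharpe ratio of Example 6.2: 0.75 sin(b pi x) / sqrt((x - 0.5)^2 + 0.5)."""
    return 0.75 * math.sin(b * math.pi * x) / math.sqrt((x - 0.5) ** 2 + 0.5)

def rise(f_hat, f_true, lower, upper, n_grid=401):
    """Root integrated squared error of f_hat over [lower, upper].

    The integral of (f_hat - f_true)^2 is approximated by the trapezoidal
    rule on n_grid equally spaced points.
    """
    step = (upper - lower) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        x = lower + k * step
        squared_error = (f_hat(x) - f_true(x)) ** 2
        # End points get half weight under the trapezoidal rule.
        total += squared_error * (0.5 if k in (0, n_grid - 1) else 1.0)
    return math.sqrt(total * step)
```

As a sanity check, an estimate that is off by a constant $c$ everywhere on $[0, 1]$ has a RISE of $|c|$.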

Simulations for the first-order derivative
In this section, we investigate the finite sample performance of the proposed estimator for the derivatives of $f(x)$. We focus only on the estimation of the first-order derivative $f'(x)$, since estimating the second- or higher-order derivatives is more complex. Specifically, we estimate $f'(x)$ by minimizing (6) with $p = 2$, the Epanechnikov kernel, and the ROT bandwidth. To the best of our knowledge, there is no literature on estimating the first derivative of the Sharpe ratio function. Note that $f'(x) = \{\mu'(x)\sigma(x) - \mu(x)\sigma'(x)\}/\sigma^2(x)$. To compare with our new estimator $\hat f^{(1)}(x)$, we consider the indirect estimator , where $\hat\sigma_{FY}(x) = \sqrt{\hat\sigma^2_{FY}(x)}$, $\hat\sigma'_{FY}(x) = (\hat\sigma^2_{FY})'(x)/\{2\hat\sigma_{FY}(x)\}$, $\hat\mu'_{LL}(x)$ is the local linear estimator of $\mu'(x)$, and $(\hat\sigma^2_{FY})'(x) = \hat\alpha_1$ is the local linear estimator of $d\sigma^2(x)/dx$ defined in (11). To evaluate the estimation accuracy of the estimators, we also consider the root integrated squared error (RISE) of $\hat f^{(1)}(x)$ as We estimate $f'(x)$ in Example 6.1 with $a = 0.5$ or 1 and in Example 6.2 with $b = 0$ or 4 from Section 6.2. It is worth noting that there are some discarded values (NaNs) in computing the RISEs for the indirect method. Tables 3 and 4 present the mean and standard deviation of RISEs for the two estimators in Examples 6.1 and 6.2. From Table 3, for each combination of $a$ and an estimator of $f'(x)$, the mean and standard deviation of RISE decrease as $n$ increases. The same phenomena are also observed in Table 4. Moreover, our new estimator performs better than the indirect estimator in all settings in Tables 3 and 4 except for $(b, n) = (4, 500)$.
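The indirect route via the quotient rule can be sketched in Python as follows. This is a minimal illustration with a hypothetical function name; it takes plug-in values of $\mu(x)$, $\mu'(x)$, $\sigma^2(x)$, and $d\sigma^2(x)/dx$ as inputs and uses the chain rule $\sigma'(x) = \{\sigma^2(x)\}'/\{2\sigma(x)\}$. Returning NaN for a non-positive variance estimate is an assumption about how such cases are discarded.

```python
import math

def indirect_sharpe_derivative(mu, dmu, var, dvar):
    """Plug-in estimate of f'(x) = {mu' sigma - mu sigma'} / sigma^2.

    mu, dmu   : estimates of mu(x) and mu'(x)
    var, dvar : estimates of sigma^2(x) and d sigma^2(x) / dx
    """
    if var <= 0.0:
        # The square root is undefined, so the value has to be discarded.
        return math.nan
    sigma = math.sqrt(var)
    dsigma = dvar / (2.0 * sigma)  # chain rule for sigma = sqrt(sigma^2)
    return (dmu * sigma - mu * dsigma) / var
```

For example, with $\mu(x) = x^2$ and $\sigma^2(x) = 1$ at $x = 3$, the call `indirect_sharpe_derivative(9.0, 6.0, 1.0, 0.0)` returns `6.0`, which equals $f'(3)$ for $f(x) = x^2$.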

Application to treasury bill data
We now apply the proposed method to analyze three datasets from the three-month US Treasury bill (T-bill) in the secondary market. The first dataset consists of 1735 weekly observations from 5 January 1962 to 31 March 1995, which has been previously analyzed by [2,11,16], among others. The time series data, denoted by $\{z_t\}$, are presented in the left panel of Figure 1. Following [11,16], we first fit a fifth-order autoregressive model to $\{z_t\}$, which yields the AR(5) model as The residuals $Y_t$ are then plotted against $X_t = z_{t-1}$ in the right panel of Figure 1. We further consider a model that is a discrete-time approximation to the continuous-time diffusion process model (2), e.g. [2,16]:
$$Y_t = \mu(X_t) + \sigma(X_t)\varepsilon_t, \tag{13}$$
where $E(\varepsilon_t \mid X_t) = 0$ and $\mathrm{Var}(\varepsilon_t \mid X_t) = 1$. Finally, we apply the proposed method with the ROT bandwidth to model (13) to estimate the Sharpe ratio function $f(x) = \mu(x)/\sigma(x)$.
For comparison, we also estimate the Sharpe ratio function by three existing methods, and plot the estimated functions in Figure 2 with their associated 95% point-wise confidence bands using the regression bootstrap [13]. It is evident that our new estimator and its bootstrap confidence band are more stable than the others. In particular, our new estimator captures the well-known empirical evidence that low-priced assets outperform high-priced ones, since Figure 2 shows a larger Sharpe ratio for the former than for the latter.
The second dataset, presented in the left panel of Figure 3 and denoted by $\{z_t\}$, consists of 247 monthly observations from 1 January 2000 to 1 July 2020. Following [17], after fitting an AR(4) model and regressing the residuals $Y_t$ against $X_t = z_{t-1}$, we obtain the model where $E(\varepsilon_t \mid X_t) = 0$ and $\mathrm{Var}(\varepsilon_t \mid X_t) = 1$. The residuals $Y_t$ are plotted against $X_t$ in the right panel of Figure 3. To compare the two bandwidths in Section 6.1, we apply the proposed method with the ROT and LOOCV bandwidths to estimate the Sharpe ratio function, which yields $\check h_{ROT} = 1.218$ and $\check h_{CV} = 1.36$. Figure 4 displays the two estimated Sharpe ratio functions and the associated 95% point-wise confidence bands. From the fitted curves, the two bandwidths perform similarly. We can observe that the Sharpe ratio function for the T-bill has a nonlinear trend, which also shows that the low-priced assets perform better than the high-priced ones.
In order to evaluate the performance of the proposed method for a much shorter time period, we select the yields of the three-month US T-bill data from 1 January 2015 to 1 July 2020. This dataset consists of 67 monthly observations and is presented in the left panel of Figure 5, where $z_t$ denotes the time series of the yields. Following [17], after fitting an AR(2) model and regressing the residuals $Y_t$ against $X_t = z_{t-1}$, we obtain the model $z_t - 1.2198z_{t-1} + 0.2301z_{t-2} = Y_t = \mu(X_t) + \sigma(X_t)\varepsilon_t$, where $E(\varepsilon_t \mid X_t) = 0$ and $\mathrm{Var}(\varepsilon_t \mid X_t) = 1$. The residuals $Y_t$ are plotted against $X_t$ in the right panel of Figure 5. Figure 6 plots the estimated Sharpe ratio function with the LOOCV bandwidth and its 95% point-wise confidence band. The bandwidth selected by LOOCV is 1.33. Figure 6 also indicates that the low-priced assets perform better than the high-priced ones.

Discussion
In this paper, we propose a direct method via local maximum likelihood for estimating the Sharpe ratio function or its derivatives in the heteroscedastic non-parametric model. We further establish the asymptotic normality of the proposed estimator under some regularity conditions, and show that it is competitive with existing methods using simulated data and real market data.
There are several interesting extensions of this work. Firstly, it is of interest to investigate the asymptotic properties and numerical performance of the estimators obtained by minimizing the general negative log-likelihood (5) or (7). Secondly, note that there have been extensive studies on the construction of the simultaneous confidence band for $\mu(x)$ and/or $\sigma^2(x)$; see, for example, [4,5,27], among others. Inspired by this, another interesting but challenging problem is to construct a simultaneous confidence band for the Sharpe ratio function. Lastly, our new method for estimating the multivariate Sharpe ratio function in Section 5 may suffer from the curse of dimensionality. To overcome this problem, it might be necessary to impose structural assumptions on the multivariate Sharpe ratio function, e.g. single index modeling. A deeper and more detailed investigation of these issues warrants further study.

(B1) All second-order partial derivatives of $f$ and $g$ are continuous in the neighborhood of $x$, and the density function $\phi$ is differentiable and positive in the neighborhood of $x$.
(B2) The kernel function $K$ is a compactly supported multivariate density function such that $\int K(u)\,du = 1$, $\int uK(u)\,du = 0$, and $\int uu^\top K(u)\,du = \mu_2(K)I_d$, where $\mu_2(K) > 0$ and $I_d$ is a $d \times d$ identity matrix.
(B3) The sequence of bandwidth matrices $B$ is such that each entry of $BB^\top$ tends to zero, $n|B| \to \infty$, and $n|B|\{\mathrm{tr}(BB^\top)\}^3 \to 0$ as $n \to \infty$ with $B$ remaining non-singular, where $\mathrm{tr}(B)$ stands for the trace of the matrix $B$.
Theorem 5.1: Let $x$ be an interior point from the support of $\phi(\cdot)$. Under Conditions (B1)-(B3) and (C4), we have n|B|

Figure 1. The three-month US Treasury Bill data from 5 January 1962 to 31 March 1995. Left: the raw data. Right: the residuals after an AR(5) fit plotted against $X_t$.

Figure 2. The estimated Sharpe ratio functions (solid lines) from the three-month US Treasury Bill data by the four methods: the new, residual-based, difference-based, and joint estimators. The dashed lines represent the associated 95% point-wise bootstrap confidence bands.

Figure 3. The three-month US Treasury Bill data from 1 January 2000 to 1 July 2020. Left: the raw data. Right: the residuals after an AR(4) fit plotted against $X_t$.

Figure 4. The estimated Sharpe ratio functions (solid lines) from the three-month US Treasury Bill data by the new estimator with the ROT (left) or LOOCV (right) bandwidth. The dashed lines represent the associated 95% point-wise bootstrap confidence bands.

Figure 5. The three-month US Treasury Bill data from 1 January 2015 to 1 July 2020. Left: the raw data. Right: the residuals after an AR(2) fit plotted against $X_t$.

Figure 6. The estimated Sharpe ratio function (solid line) from the three-month US Treasury Bill data by the new estimator. The dashed lines represent the associated 95% point-wise bootstrap confidence band.

Table 1. The mean and standard deviation (in parentheses) of RISEs for the five estimators of $f(x)$ in Example 6.1.

Table 2. The mean and standard deviation (in parentheses) of RISEs for the five estimators of $f(x)$ in Example 6.2.

Table 3. The mean and standard deviation (in parentheses) of RISEs for the two estimators of $f'(x)$ in Example 6.1.

Table 4. The mean and standard deviation (in parentheses) of RISEs for the two estimators of $f'(x)$ in Example 6.2.