Testing instantaneous causality in the presence of non-constant unconditional variance

The problem of testing instantaneous causality between variables with time-varying unconditional variance is investigated. It is shown that the classical tests based on the assumption of stationary processes must be avoided in this non-standard framework. More precisely, we show that the standard test does not control the type I error, while the tests with White (1980) and Heteroscedasticity and Autocorrelation Consistent (HAC) corrections can suffer from a severe loss of power when the variance is not constant. Consequently a modified test based on a bootstrap procedure is proposed. The relevance of the modified test is illustrated through a simulation study. The tests considered in this paper are also compared by investigating the instantaneous causality relations between US macroeconomic variables.


Introduction
The concept of causality defined by Granger (1969) is widely used to analyze cause-and-effect relationships between macroeconomic and financial variables (see e.g. Sims (1972), Ashenfelter and Card (1982), Hamilton (1983), Lee (1992), Hiemstra and Jones (1994), Renault and Werker (2005), Gelper and Croux (2007)). Granger causality has also been studied in other areas: neuroscience (see e.g. Brovelli et al. (2004), Seth (2008)), gene networks (Fujita et al. (2009)), geophysics (Reichel, Thejll and Lassen (2001)) and sociology (Deane and Gutmann (2003)) are some application domains among others. Causality relationships are often analyzed by taking into account only the past values of the studied variables. In many situations, however, the prediction of the unobserved current variables X_2t can be improved by including the available current information on the variables X_1t. In such a case the instantaneous causality relation between X_1t and X_2t is investigated (see Lütkepohl (2005, p. 42)).
In the framework of stationary VAR processes, instantaneous causality is usually tested by means of Wald tests for zero restrictions on the innovations covariance matrix. The standard tools available in commonly used software (see Lütkepohl and Krätzig (2004)) are based on the assumption of i.i.d. Gaussian innovations. When the error process is assumed i.i.d. but non-Gaussian, the weight matrix of the test statistic has to be corrected by using a White-type covariance matrix (see White (1980)). In some cases, models which produce nonlinear stationary processes, such as GARCH or all-pass models, are considered for the error terms (see e.g. Bauwens, Laurent and Rombouts (2006) or Andrews, Davis and Breidt (2006)). These models allow some dependence in the innovations but still suppose that the unconditional variance of the innovations process is constant. In order to obtain a standard asymptotic distribution of the Wald test statistic in these situations, Heteroscedasticity and Autocorrelation Consistent (HAC) corrections can be used (see Den Haan and Levin (1997) for HAC estimation).
Nevertheless many applied papers have questioned the assumption of a constant unconditional variance. For instance Sensier and van Dijk (2004) found that most of the 214 U.S. macroeconomic variables they investigated exhibit a break in their unconditional variance. Ramey and Vine (2006) highlighted a declining variance of U.S. automobile production and sales. McConnell and Perez-Quiros (2000) documented a break in the variance of U.S. GDP growth and pointed out that neglecting a non-constant variance can be misleading for data analysis. It emerges from these studies that processes with non-constant unconditional variance are a common feature in practice. All these observations led us to consider instantaneous causality relationships where the unconditional variance of the structural innovations changes over time.
Numerous tools for time series analysis in the presence of non-constant variance have been proposed in the literature. For instance Tsay (1988), Horváth, Kokoszka and Zhang (2006) or Sanso, Arago and Carrion (2004) proposed tests for detecting unconditional variance changes in several situations. Kokoszka and Leipus (2000) and Dahlhaus and Rao (2006) studied ARCH processes with non-constant unconditional variance. Robinson (1987), Hansen (1995), Francq and Gautier (2004) or Xu and Phillips (2008), among other references, investigated univariate linear models allowing for a non-constant variance. Stărică (2003) considered a deterministic non-constant specification for the unconditional variance of stock returns, and noted that such an approach can perform as well as the stationary GARCH(1,1) model. Kim and Park (2010) studied cointegrated systems with non-constant variance. Bai (2000) and Qu and Perron (2007), among others, investigated the estimation of multivariate models with time-varying variance. Aue, Hörmann, Horváth and Reimherr (2009) proposed a test procedure for detecting variance breaks in multivariate time series.
In this paper we focus on testing zero restrictions on the time-varying variance structure. We highlight that the standard Wald test for instantaneous causality implemented in commonly used software does not provide suitable critical values when the variance structure is time-varying. It is also established that the tests based on White or HAC corrections of the Wald statistic can suffer from a severe loss of power in certain important situations. More precisely, these tests may be unable to detect important alternatives such as periodic changes, or cases where the covariance structure is close to zero so that its sign is likely to change. Noting that the previous tests are not intended to handle data with non-constant unconditional variance, a new approach for testing instantaneous causality that takes the non-stationary unconditional variance into account is proposed in this paper. It is however found that the asymptotic distribution of the modified statistic is non-standard, involving the unknown non-constant variance structure in a functional form. When the asymptotic distribution is non-standard, the wild bootstrap method is widely used in the literature for the analysis of time series possibly displaying (unconditional) heteroscedasticity or dependence (see e.g. Kilian (2004), Horowitz, Lobato, Nankervis and Savin (2006) or Inoue and Kilian (2002)). Therefore a wild bootstrap procedure is provided for testing zero restrictions on the non-constant variance structure. It is established through theoretical and empirical results that the modified test is preferable to the tests based on the spurious assumption of a constant unconditional variance.
The plan of the paper is as follows. In the next section we introduce VAR models with non-constant variance. In Section 3 the problem of testing for instantaneous causality between subvectors of a VAR process with non-constant variance is discussed. The asymptotic properties of the tests based on the assumption of a constant unconditional variance are presented. It emerges from this part that such tests should be avoided in our non-standard framework. As a consequence, a test based on a wild bootstrap procedure taking the non-constant variance into consideration is built. The finite sample properties of the tests are investigated in Section 4 by Monte Carlo experiments. We also consider US macroeconomic data to illustrate our findings. In Section 5 we conclude.

Vector autoregressive model with non-constant variance
Consider the following VAR model:

X_t = A_01 X_{t-1} + ... + A_0p X_{t-p} + u_t,   u_t = H_t ε_t,   (2.1)

where X_t ∈ R^d and it is assumed that X_{-p+1}, ..., X_0, X_1, ..., X_T are observed. The d × d matrices A_0i are such that det A(z) ≠ 0 for all |z| ≤ 1, where A(z) := I_d - A_01 z - ... - A_0p z^p. The process should formally be written as a triangular array, but the double subscript is suppressed for notational simplicity. In the following assumption we give the structure of the variance by using the rescaling approach of Dahlhaus (1997). Here F_t corresponds to the σ-field generated by {ε_k : k ≤ t}, and ||.||_r is such that ||x||_r := (E ||x||^r)^{1/r} for a random variable x, with ||.|| the Euclidean norm.

Assumption A1: (i) The H_t's are lower triangular nonsingular matrices with positive diagonal elements such that H_t = G(t/T), where the components g_kl(.) of the matrix-valued function G(.) are measurable deterministic and piecewise Lipschitz continuous on (0, 1]; the variance structure is Σ(r) := G(r)G(r)'. (ii) The process (ε_t) is α-mixing and such that E(ε_t | F_{t-1}) = 0, E(ε_t ε_t' | F_{t-1}) = I_d and sup_t ||ε_t||_{4µ} < ∞ for some µ > 1.
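To make the rescaled-variance mechanism u_t = G(t/T) ε_t concrete, the following sketch simulates innovations whose variance changes smoothly, breaks abruptly, and has a time-varying covariance. The particular functions inside G are illustrative assumptions, not those used in the paper:

```python
import numpy as np

def G(r):
    """Lower-triangular G(r) with positive diagonal; Sigma(r) = G(r) G(r)'.
    Illustrative components: smooth change, time-varying cross term, abrupt break."""
    g11 = 1.0 + 0.5 * r
    g21 = 0.3 * np.cos(np.pi * r)
    g22 = 1.0 + (r > 0.5)          # break in the second variance at r = 1/2
    return np.array([[g11, 0.0], [g21, g22]])

def simulate_innovations(T, rng):
    """u_t = G(t/T) eps_t with eps_t iid N(0, I_2), so E(u_t u_t') = Sigma(t/T)."""
    eps = rng.standard_normal((T, 2))
    return np.stack([G((t + 1) / T) @ eps[t] for t in range(T)])

rng = np.random.default_rng(0)
u = simulate_innovations(20000, rng)
# local sample covariance around r = 0.25 should track Sigma(0.25)
window = u[4000:6000]
Sigma_hat = window.T @ window / window.shape[0]
Sigma_025 = G(0.25) @ G(0.25).T
print(np.round(Sigma_hat, 2))
print(np.round(Sigma_025, 2))
```

A local average of u_t u_t' over a short window thus recovers Σ(r) pointwise, which is the intuition behind the rescaling device of Dahlhaus (1997).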
If we suppose that the process (ε_t) is Gaussian and that the functions g_kl(.) are constant, we retrieve the standard case. Nevertheless, when the unconditional variance is time-varying it can be expected that the tools developed in the stationary framework are not valid or suffer from drawbacks, since the tests for instantaneous causality are directly based on the variance structure. The piecewise Lipschitz condition allows for abrupt breaks as well as smooth changes in the unconditional variance. In particular the variance may have a periodic behaviour. The framework given by our assumption is similar to that of numerous papers in the literature and encompasses the case of a piecewise constant variance structure (see Pesaran and Timmermann (2004), Hamori and Tokihisa (1997) or Xu and Phillips (2008) and references therein). However, since we assumed that E(ε_t ε_t' | F_{t-1}) = I_d, the error terms cannot display GARCH effects (for instance second order correlation).
Therefore the tools proposed in this paper should preferably be used for relatively low frequency variables for which it is commonly admitted that there are no second order dynamics (for instance monthly, quarterly or annual macroeconomic data; see Section 4.2 below). In such a situation, adding a multivariate GARCH structure to our model as in Hafner and Linton (2010) can be viewed as overly elaborate. The tests proposed in  can be used to check that there are no second order dynamics within the data. In the framework of A1 we are interested in testing zero restrictions on the variance structure Σ(r).
Now rewrite model (2.1) as X_t = A_0 X̃_{t-1} + u_t, with X̃_{t-1} := (X_{t-1}', ..., X_{t-p}')' and A_0 := [A_01 : ... : A_0p]. Let X_1t and X_2t be the subvectors of X_t := (X_1t', X_2t')' with respective dimensions d_1 and d_2, and let Σ_t^{12} be the d_1 × d_2-dimensional upper-right block of Σ_t := E(u_t u_t').
Our goal is to determine whether there exists an instantaneous causality relation between X_1t and X_2t. The next lemma gives some preliminary results and requires some additional notation. Let û_t := (û_1t', û_2t')', θ̂_t := û_2t ⊗ û_1t and v̂_t := vec(û_1t û_2t' - Σ_t^{12}), and let H_t := (H_1t', H_2t')' and G(r) := (G_1(r)', G_2(r)')' be in line with the partition of X_t. We denote by [z] the integer part of a real number z, by ⇒ the convergence in distribution and by → the convergence in probability. Using the results (2.3) and (2.4) below, we shall discuss the test for instantaneous causality between X_1t and X_2t assuming spuriously that the unconditional variance is constant, and we shall propose a new test adapted to our framework in the next section. Nevertheless some remarks on the result (2.3) must be made.
Remark 2.1. Let Σ^{11}(r) and Σ^{22}(r) denote the diagonal blocks of Σ(r), in line with the partition of X_t. If we suppose that Σ^{12}(r) = 0 for all r ∈ (0, 1], which corresponds to the case of no instantaneous causality relation between X_1t and X_2t (see the null hypothesis H_0 below), it follows that v̂_t = θ̂_t = û_2t ⊗ û_1t. In such a situation, from Lemmas 5.1, 5.2, 5.3 and the proof of Lemma 2.1, it is clear that Ω̂_w consistently estimates Ω. In particular, when the process (u_t) is assumed Gaussian with non-constant variance, the expression of Ω simplifies to ∫_0^1 Σ^{22}(r) ⊗ Σ^{11}(r) dr by using (5.4).
If the variance is constant, that is Σ(r) = Σ_u for all r ∈ (0, 1], we retrieve the standard case under the hypothesis H̃_0 below. In such a case the expression of Ω simplifies to Σ_u^{22} ⊗ Σ_u^{11}. Under the strong assumption of an i.i.d. Gaussian error process it can be shown that (2.6) holds.

Testing for instantaneous causality
In the sequel we follow the notations of Lütkepohl (2005). Denote by X_2t(1|{X_k : k < t}) the optimal one-step linear predictor of X_2t at date t - 1, based on the past of the process (X_t). Similarly we define the one-step linear predictor X_2t(1|{X_k : k < t} ∪ {X_1t}) based on the past of (X_t) and the present of (X_1t). It is said that there is no instantaneous linear causality between (X_1t) and (X_2t) if these two predictors coincide. In the case of a non-constant variance following assumption A1 and, more particularly, because we assumed that the H_t's are lower triangular nonsingular matrices with positive diagonal elements, it can be shown that there is no instantaneous causality between X_1t and X_2t if and only if the Σ_t^{12}'s are all equal to zero, following arguments similar to those in Lütkepohl (2005, pp. 46-47). Consequently, in our non-standard framework the following pair of hypotheses has to be tested:

H_0: Σ^{12}(r) = 0 for all r ∈ (0, 1]   against   H_1: Σ^{12}(r) ≠ 0 on a set of positive measure of (0, 1].

Now if we consider the case where the variance is assumed constant, Σ_t = Σ_u for all t, it is well known that there is no instantaneous causality between X_2t and X_1t if and only if Σ_u^{12} = 0, with obvious notation. Therefore the following pair of hypotheses is tested under standard assumptions:

H̃_0: Σ_u^{12} = 0   against   H̃_1: Σ_u^{12} ≠ 0.

The block Σ_u^{12} is usually estimated by T^{-1} Σ_{t=1}^T û_1t û_2t', which converges in probability to ∫_0^1 Σ^{12}(r) dr under A1. Hence such a hypothesis test does not take the time-varying variance into account, in the sense that it can only be interpreted as a global zero restriction test on the covariance structure, i.e. testing ∫_0^1 Σ^{12}(r) dr = 0 against the alternative ∫_0^1 Σ^{12}(r) dr ≠ 0. Thus H̃_0 and H̃_1 are inappropriate for testing instantaneous causality in our non-standard framework.
It is interesting to point out that H_0 is a particular case of H̃_0, i.e. H_0 ⊂ H̃_0, since H̃_0 corresponds to ∫_0^1 Σ^{12}(r) dr = 0 and H_0 clearly implies this restriction. On the other hand, since ∫_0^1 Σ^{12}(r) dr = 0 does not imply that Σ^{12}(r) = 0 for all r, the intersection H_1 ∩ H̃_0 is non-empty. It is shown in the next part that the case H_1 ∩ H̃_0 entails a loss of power for tests built on the assumption of a constant unconditional variance of the innovations.

Tests based on the assumption of constant error variance
In this section the consequences of a non-constant variance for the instantaneous causality tests based on the spurious assumption of a stationary process are analyzed. Let δ_T := T^{-1/2} Σ_{t=1}^T θ̂_t, where we recall that θ̂_t = û_2t ⊗ û_1t. The standard test statistic is given by S_st := δ_T' (Σ̂_u^{22} ⊗ Σ̂_u^{11})^{-1} δ_T. If the practitioner (spuriously) assumes that the error process is i.i.d. but non-Gaussian, u_1t and u_2t could be dependent and the following statistic with White-type correction should be used: S_w := δ_T' Ω̂_w^{-1} δ_T, where the weight matrix Ω̂_w is defined in (2.5). Recall that Ω̂_w is a consistent estimator of Ω under H_0, and this is clear from the proof of Lemma 2.1. Finally the practitioner may again (spuriously) suppose that the error process is stationary and that the observed heteroscedasticity is a consequence of the presence of nonlinearities. Note however that the assumed heteroscedasticity is then only conditional, while the unconditional variance is still constant. This kind of situation can arise if we (spuriously) assume that the innovations process is driven by a GARCH model or any other model displaying nonlinearities, such as models driven by hidden Markov chains or all-pass models (see Amendola and Francq (2009)).
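A minimal sketch of the White-corrected statistic S_w for a zero-covariance restriction follows. The outer-product weight matrix used here is the natural estimator of Ω under the null; the paper's exact Ω̂_w in (2.5) may differ in finite-sample details:

```python
import math
import numpy as np

def white_wald_stat(u1, u2):
    """White-type Wald statistic for testing a zero covariance block between
    residual subvectors u1 (T x d1) and u2 (T x d2).  Sketch: the weight
    matrix is the outer-product estimator of Omega computed under the null."""
    T = u1.shape[0]
    theta = np.einsum('ti,tj->tij', u2, u1).reshape(T, -1)  # theta_t = u2_t (x) u1_t
    delta = theta.sum(axis=0) / np.sqrt(T)                  # delta_T
    omega = theta.T @ theta / T                             # estimator of Omega
    return float(delta @ np.linalg.solve(omega, delta))     # ~ chi2(d1*d2) under H0

# bivariate example with d1 = d2 = 1, so S_w is chi2(1) under the null
rng = np.random.default_rng(1)
u1 = rng.standard_normal((500, 1))
u2 = rng.standard_normal((500, 1))
s_w = white_wald_stat(u1, u2)
p_value = math.erfc(math.sqrt(s_w / 2))  # chi2(1) survival function
```

Under independent residuals the p-value is roughly uniform; feeding the same series twice (perfect instantaneous correlation) makes the statistic diverge with T.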
For the VARHAC correction, the process ϑ_t = u_2t ⊗ u_1t is regressed on its first m lags. Introduce ẑ_{m,t}, the residuals of such a regression, and Ω̂_h, the corresponding VARHAC weight matrix. The order m can be chosen by using an information criterion. The following statistic involving the VARHAC-type weight matrix may be used: S_h := δ_T' Ω̂_h^{-1} δ_T. Since we assumed that the autoregressive order p is well adjusted (or known), the process ϑ_t = u_2t ⊗ u_1t is uncorrelated and it can be shown that the Â_{m,k}'s converge to zero in probability. Therefore Ω̂_h → Ω under H_0 and under H_1, so that Ω̂_h and Ω̂_w are asymptotically equivalent in the framework of A1.
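The VARHAC idea can be sketched as follows (Den Haan-Levin style prewhitening); the fixed lag order and the exact normalizations here are illustrative assumptions, since in practice m would be picked by an information criterion:

```python
import numpy as np

def varhac_weight(theta, m=1):
    """VARHAC weight matrix sketch: fit a VAR(m) to the k-dimensional series
    theta_t, then form Omega_h = B Sigma_z B' with B = (I - sum_k A_k)^{-1},
    where Sigma_z is the covariance of the VAR residuals z_t."""
    T, k = theta.shape
    Y = theta[m:]
    X = np.hstack([theta[m - j:T - j] for j in range(1, m + 1)])  # lagged regressors
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)   # stacked transposed A_1, ..., A_m
    Z = Y - X @ A                                # prewhitened residuals z_t
    Sigma_z = Z.T @ Z / Z.shape[0]
    A_sum = sum(A[(j - 1) * k:j * k].T for j in range(1, m + 1))
    B = np.linalg.inv(np.eye(k) - A_sum)
    return B @ Sigma_z @ B.T

# when theta_t is serially uncorrelated (as under a well-fitted VAR), the A_k
# vanish asymptotically and Omega_h reduces to the outer-product estimator
rng = np.random.default_rng(2)
theta = rng.standard_normal((4000, 2))
Om = varhac_weight(theta, m=1)
```

This illustrates the remark above: with uncorrelated ϑ_t the prewhitening step is asymptotically inactive, so Ω̂_h and Ω̂_w estimate the same limit.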
This is not surprising since second order dynamics are in fact excluded in A1.
In this part the asymptotic properties of the above statistics are investigated.
The asymptotic behavior of the statistics in our non-standard framework is first established under H_0. The results are direct consequences of (2.3); the Z_j's are independent N(0,1) variables, and λ_1, ..., λ_{d_1 d_2} are the eigenvalues of a matrix determined by Ω and the limit of the spurious weight matrix. Next consider a fixed alternative such that ∫_0^1 Σ^{12}(r) dr ≠ 0. It follows that the S_st, S_w and S_h statistics grow to infinity as fast as T. Therefore we can expect that the tests based on the assumption of constant variance will detect a possible instantaneous causality for large enough sample sizes when H_1 ∩ H̃_1 holds.
The abilities of the W_st, W_w and W_h tests to detect the case ∫_0^1 Σ^{12}(r) dr ≠ 0 are compared using the approximate Bahadur slope approach (Bahadur (1960)).
For the test based on the S_st statistic, define q_st(x) := - log P_0(S_st > x) for any x > 0, where P_0 stands for the probability under the limit distribution of S_st when Σ^{12}(r) = 0 for all r.
For a fixed alternative the approximate Bahadur slopes of the three tests can then be compared. Finally we study the ability of the W_st, W_w and W_h tests to detect instantaneous causality when H_1 ∩ H̃_0 holds, that is Σ^{12}(r) ≠ 0 on a set of positive measure but ∫_0^1 Σ^{12}(r) dr = 0.
In this case Ω̂_h = O_p(1) and Ω̂_w = O_p(1) from (3.1) and (3.2), and we also have δ_T = O_p(1). When such an eventuality is considered, it is clear that the tests based on the assumption of stationary errors may suffer from a severe loss of power. This is a consequence of the fact that such tests are not intended to take a time-varying variance into account. The case H_1 ∩ H̃_0 can arise in the important situation where Σ^{12}(r) ≠ 0 but is close to zero, so that Σ^{12}(r) may have a changing sign. Even when at least one of the components of Σ^{12}(r) is far from zero, we can have ∫_0^1 Σ^{12}(r) dr = 0, as for instance in some cases where the variance structure is periodic. This can be seen by considering the bivariate case and taking Σ^{12}(r) = c cos(πr) or Σ^{12}(r) = c 1_{[0,1/2]}(r) - c 1_{(1/2,1]}(r) with c ∈ R, c ≠ 0. Therefore the tests based on the spurious assumption of a constant unconditional variance of the errors must be avoided.
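The two periodic examples can be checked numerically: both covariance paths are nonzero almost everywhere yet integrate to zero over (0, 1], which is exactly why a global test of ∫_0^1 Σ^{12}(r) dr = 0 cannot detect them:

```python
import numpy as np

c = 0.5
r = (np.arange(100000) + 0.5) / 100000           # midpoint grid on (0, 1)
cos_path = c * np.cos(np.pi * r)                 # Sigma12(r) = c cos(pi r)
step_path = np.where(r <= 0.5, c, -c)            # c on (0, 1/2], -c on (1/2, 1]
print(cos_path.mean(), step_path.mean())         # both integrals vanish
print(np.abs(cos_path).mean())                   # yet |Sigma12| integrates to 2c/pi
```

The averages approximate the integrals by a midpoint rule; only the integral of |Σ^{12}(r)| reveals that the covariance is in fact far from zero on most of (0, 1].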
In summary, it is found that the tests based on the White and VARHAC corrections control the type I error well for large enough samples, unlike the W_st test. In addition, in the case of a non-constant unconditional variance and when H_1 ∩ H̃_1 holds, the W_h and W_w tests have better power properties than the W_st test. Therefore the W_h and W_w tests should be preferred to the W_st test when the unconditional variance is time-varying. However, it is also found that the tests based on the assumption of constant variance may suffer from a severe loss of power in the important cases where ∫_0^1 Σ^{12}(r) dr = 0 (or ∫_0^1 Σ^{12}(r) dr ≈ 0). A bootstrap test circumventing this power problem in the case H_1 ∩ H̃_0 is proposed in the next part.

A bootstrap test taking the non-constant variance into account
Define δ_s := T^{-1/2} Σ_{t=1}^{[Ts]} θ̂_t with s ∈ [0, 1] and consider the statistic S_b built on the trajectory (δ_s)_{s∈[0,1]}. Under H_0 and from (2.4) we obtain the limit (3.5), where the covariance matrix involves M := E(ε_t ε_t' ⊗ ε_t ε_t'), and the Continuous Mapping Theorem applies. Under the alternative, write the decomposition (3.6): T^{-1} Σ_{t=1}^{[Ts]} θ̂_t = T^{-1} Σ_{t=1}^{[Ts]} (θ̂_t - vec(Σ_t^{12})) + T^{-1} Σ_{t=1}^{[Ts]} vec(Σ_t^{12}). The first term on the right-hand side of (3.6) converges to zero in probability, while T^{-1} Σ_{t=1}^{[Ts]} vec(Σ_t^{12}) = ∫_0^s vec(Σ^{12}(r)) dr + o(1) and sup_{s∈[0,1]} ||∫_0^s vec(Σ^{12}(r)) dr||_2^2 = C > 0. Hence in such a situation S_b grows to infinity as fast as T. From (3.5) we see that the asymptotic distribution of S_b under the null H_0 is non-standard and depends on the unknown variance structure and the fourth order cumulants of the process (ε_t) in a functional form. Thus the statistic S_b cannot directly be used to build a test, and we consider a wild bootstrap procedure to provide reliable quantiles for testing instantaneous causality. In the literature such procedures were used for investigating VAR model specification, as in Inoue and Kilian (2002) among others. The reader is referred to Davidson and Flachaire (2008) or Gonçalves and Kilian (2004, 2007) and references therein for the wild bootstrap method. For resampling our test statistic we draw B bootstrap sets {ξ_t^{(i)} θ̂_t : t = 1, ..., T}, i = 1, ..., B, where the multipliers ξ_t^{(i)} are i.i.d. standard Gaussian variables independent of the data, and we obtain (3.7), where we denote by ⇒_P the weak convergence in probability. A proof of (3.7) is provided in the Appendix. Note that we have by construction E*(ξ_t^{(i)} θ̂_t) = 0 even when the alternative is true, that is when E(ϑ_t) ≠ 0 (recall that ϑ_t = u_2t ⊗ u_1t). As a consequence the result (3.7) holds whether Σ^{12}(r) = 0 or Σ^{12}(r) ≠ 0.
The W_b test consists in rejecting H_0 if the statistic S_b exceeds the (1-α) quantile of the bootstrap distribution. Under H_1 with ∫_0^1 Σ^{12}(r) dr ≠ 0, all the statistics considered in this paper increase at the rate T. However, when Σ^{12}(r) ≠ 0 with ∫_0^1 Σ^{12}(r) dr ≈ 0, we can expect the W_b test to be more powerful than the tests based on the assumption of a constant unconditional variance.
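The wild bootstrap recipe described above can be sketched as follows; the CUSUM-type form chosen for S_b and the Gaussian multipliers are illustrative assumptions, and the paper's exact statistic may differ:

```python
import numpy as np

def wild_bootstrap_causality_test(u1, u2, B=399, rng=None):
    """Sketch of the W_b idea: S_b = sup_s || T^{-1/2} sum_{t<=Ts} theta_t ||^2
    with theta_t = u2_t (x) u1_t, and Gaussian wild-bootstrap draws xi_t * theta_t
    to approximate the (non-standard) null distribution of S_b."""
    rng = np.random.default_rng() if rng is None else rng
    T = u1.shape[0]
    theta = np.einsum('ti,tj->tij', u2, u1).reshape(T, -1)
    def sup_stat(x):
        csum = np.cumsum(x, axis=0) / np.sqrt(T)     # delta_s over s = t/T
        return float((csum ** 2).sum(axis=1).max())  # sup_s ||delta_s||^2
    s_obs = sup_stat(theta)
    s_boot = np.empty(B)
    for b in range(B):
        xi = rng.standard_normal((T, 1))             # wild multipliers
        s_boot[b] = sup_stat(xi * theta)             # E*(xi_t theta_t) = 0 by construction
    pval = (1 + (s_boot >= s_obs).sum()) / (B + 1)
    return s_obs, pval

# strong instantaneous link: the test should reject
rng = np.random.default_rng(3)
u1 = rng.standard_normal((300, 1))
u2 = 0.8 * u1 + 0.6 * rng.standard_normal((300, 1))
s_dep, p_dep = wild_bootstrap_causality_test(u1, u2, B=199, rng=rng)
```

Because the multipliers recenter the bootstrap world at the null whatever the data, the bootstrap quantiles remain valid under the alternative, which is the content of (3.7).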

Simulation study
For our experiments we simulated simple bivariate VAR(1) processes whose autoregressive parameters are inspired by those estimated from the money supply and inflation U.S. data (see Section 4.2 below). The data generating process is X_t = A_01 X_{t-1} + u_t, where the innovations are Gaussian with a variance structure Σ(r) satisfying assumption A1. Two cases are considered for this structure:

• Case 1: Empirical size setting. There is no instantaneous causality relation between X_1,t and X_2,t, that is Σ^{12}(r) = 0 for all r ∈ (0, 1], where Σ^{11}(r) = a - cos(br) and Σ^{22}(r) = a + sin(br) are the non-constant variances of the innovations. We take a > 1, which represents the level of these variances, and b their angular frequency.
• Case 2: Empirical power setting. There is an instantaneous causality relation between X_1,t and X_2,t, with Σ^{12}(r) = c sin(2πr), so that ∫_0^1 Σ^{12}(r) dr = 0 while Σ^{12}(r) ≠ 0 almost everywhere on (0, 1]; Σ^{11}(.) and Σ^{22}(.) are defined as in Case 1. In particular, the constant c will allow us to investigate the ability of our modified test to detect such an alternative as it gets closer to the null hypothesis. Processes generated by Case 1 are used to study the empirical size; the results are displayed in Table 1. On the other hand, processes generated by Case 2 are considered for the power study. The results are given in Table 2 and Figure 1. Note that in Table 1 and Table 2 we take a = 1.1, b = 11 and c = 0.5, while in Figure 1 we take several values for c and a = 1.1, b = 11.
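A sketch of the Case 2 design follows, with a = 1.1, b = 11 and c = 0.5 as in Tables 1-2; the autoregressive matrix A below is a placeholder assumption, since the parameters estimated from the US data are not reproduced here:

```python
import numpy as np

a, b, c = 1.1, 11.0, 0.5
A = np.array([[0.2, 0.1], [0.1, 0.3]])  # assumed VAR(1) matrix (placeholder)

def sigma(r):
    """Sigma(r) with Sigma11 = a - cos(br), Sigma22 = a + sin(br),
    Sigma12 = c sin(2 pi r): integrates to zero but is nonzero a.e."""
    s11, s22 = a - np.cos(b * r), a + np.sin(b * r)
    s12 = c * np.sin(2 * np.pi * r)
    return np.array([[s11, s12], [s12, s22]])

def simulate(T, rng):
    """X_t = A X_{t-1} + H_t eps_t with H_t the Cholesky factor of Sigma(t/T)."""
    X = np.zeros((T + 1, 2))
    for t in range(1, T + 1):
        L = np.linalg.cholesky(sigma(t / T))   # lower triangular H_t
        X[t] = A @ X[t - 1] + L @ rng.standard_normal(2)
    return X[1:]

rng = np.random.default_rng(4)
X = simulate(2000, rng)
```

A sample covariance of the innovations over the full sample would be close to zero here, while local covariances oscillate between roughly -c and c, reproducing the H_1 ∩ H̃_0 configuration studied above.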
In our example the W_st, W_w and W_b tests seem to control the type I error reasonably well (see Table 1). We can remark that the standard test provides results similar to those of the other tests. Nevertheless this outcome should not be generalized in view of (3.3). In addition, recall from Proposition 2 that the W_st test is less powerful than the W_w and W_h tests. Turning to the alternative given by Case 2, Table 2 clearly shows that the W_st and W_w tests have no power as the sample size increases, in contrast to the W_b test. This confirms the theoretical results obtained when ∫_0^1 Σ^{12}(r) dr ≈ 0. For instance, the W_b test almost always rejects the null hypothesis H_0 for a sample size of 1000, while the W_st and W_w tests are completely unable to detect the alternative in this case.
In the above power experiments the changes of Σ^{12}(r) around zero were fixed by a constant c. In this part we illustrate the ability of the tests to detect departures from the null hypothesis Σ^{12}(r) = 0, while we again have ∫_0^1 Σ^{12}(r) dr = 0 in all situations. Figure 1 represents the power of the three tests when the parameter c takes several values, the sample size being fixed at T = 500. We clearly observe that the relative rejection frequencies of the W_b test increase as the covariance structure Σ^{12}(r) moves away from zero while still verifying ∫_0^1 Σ^{12}(r) dr = 0. On the other hand, the relative rejection frequencies of the tests based on the assumption of constant variance remain close to the asymptotic nominal level even for large values of c.

Application to macroeconomic data sets
In this part we compare the W_st and W_w tests with the W_b test by investigating instantaneous causality relationships in U.S. macroeconomic data sets. The first data set consists of the money supply and the price level. For instance, the quantity theory of money assumes a proportional relationship between the money supply and the price level. The reader is referred to Case, Fair and Oster (2011) or Mankiw and Taylor (2006) concerning the theoretical links which can be made between money supply and inflation. Many studies investigate this relation from an empirical point of view, but their results are ambiguous. For instance, Turnovsky and Wohar (1984) used a simple macro model to investigate the relationship and found that the rate of inflation is independent of the monetary growth rate in the U.S.A. over the period 1923-1960, while Benderly and Zwick (1985) or Jones and Uri (1986) give some evidence of a relationship over the respective periods 1955-1982 and 1953-1984. The first differences of the data are considered in the sequel. From Figure 2 it appears that the obtained series have non-constant variance. We adjusted a VAR(1) model to the first differences of the series. The autoregressive order is chosen by using portmanteau tests adapted to our non-standard framework where the variance structure Σ(r) is time-varying (see Patilea and Raïssi (2011) for details).
The outcomes in Table 3 suggest that the model is well fitted. The estimation of the model by the OLS method is given in Table 4. The residuals of this estimation are then used to implement the tests studied in this paper. Note that we used 399 bootstrap iterations for the W_b test.
Figure 3 shows that Σ^{12}(r) does not seem to be null over the considered period for the two data sets. The estimator of Σ^{12}(r) is consistent under A1 except at the break points. Similarly to the first data set, we consider the first differences of the second data set. Note that

lim_{T→∞} E(v_{[Tr]} v_{[Tr]}') = (G_2(r) ⊗ G_1(r)) M (G_2(r) ⊗ G_1(r))' - vec(Σ^{12}(r)) vec(Σ^{12}(r))',

and lim_{T→∞} E(ϑ_{[Tr]}) = vec(Σ^{12}(r)), for values r ∈ (0, 1] at which the functions g_ij(r) are continuous.
Proof of (3.7). For the sake of simplicity and with no loss of generality (see (5.3)), assume that X_t = u_t, so that the error process is observed and there are no autoregressive parameters to estimate. Conditionally on the u_t's, δ_s^* is a Gaussian process with independent increments and variance E*(δ_s^* δ_s^{*'}) = T^{-1} Σ_{t=1}^{[Ts]} θ̂_t θ̂_t', where E*(.) is the expectation under the bootstrap probability measure. The result follows if T^{-1} Σ_{t=1}^{[Ts]} ϑ_t ϑ_t' → ∫_0^s (G_2(r) ⊗ G_1(r)) M (G_2(r) ⊗ G_1(r))' dr, uniformly in s ∈ [0, 1]. Since T^{-1} Σ_{t=1}^{[Ts]} ϑ_t ϑ_t' is monotonically increasing in s and the limit function is continuous, it suffices to establish the pointwise convergence, following Hansen (2000, proof of Lemma A.10). This holds using arguments similar to those for (5.1) and (5.2); see Lemmas 7.3 and 7.4.