Estimation and Inference on Time-Varying FAVAR Models

ABSTRACT We introduce a time-varying (TV) factor-augmented vector autoregressive (FAVAR) model to capture the TV behavior in the factor loadings and the VAR coefficients. To consistently estimate the TV parameters, we first obtain the unobserved common factors via the local principal component analysis (PCA) and then estimate the TV-FAVAR model via a local smoothing approach. The limiting distribution of the proposed estimators is established. To gauge possible sources of TV features in the FAVAR model, we propose three L2-distance-based test statistics and study their asymptotic properties under the null and local alternatives. Simulation studies demonstrate the excellent finite sample performance of the proposed estimators and tests. In an empirical application to the U.S. macroeconomic dataset, we document overwhelming evidence of structural changes in the FAVAR model and show that the TV-FAVAR model outperforms the conventional time-invariant FAVAR model in predicting certain key macroeconomic series.


Introduction
Factor-augmented vector autoregressive (FAVAR) models have drawn increasing attention in the macroeconomic literature. As pointed out by Sims (1992) and Christiano, Eichenbaum, and Evans (1999), there is a dilemma between incorporating sufficiently large information and controlling the degrees of freedom of a standard VAR model, as the number of parameters increases rapidly with the number of variables. By introducing the unobservable common factors into the VAR structure, the FAVAR model can summarize large-dimensional information and achieve dimension reduction. Empirical studies show that one can improve economic prediction and better interpret the economic relationship by incorporating the latent common factors into regressions (e.g., Stock and Watson 2002).
The FAVAR models were initially proposed by Bernanke, Boivin, and Eliasz (2005) to identify the monetary transmission mechanism. Bai and Ng (2006) provide the asymptotic theory for factor-augmented regressions. Bai, Li, and Lu (2016) study the identification restrictions and propose a likelihood-based two-step approach to estimate the FAVAR model. The literature mentioned above establishes a substantial theoretical foundation for the FAVAR model. However, most of the existing literature on the FAVAR model assumes that both the factor loadings and the VAR coefficients are time-invariant. In fact, since the FAVAR model is widely used in macroeconomic analysis and the datasets usually have a long time span, it is unsuitable to assume that the factor loadings and the VAR coefficients are time-invariant. Driving forces such as economic transition and technological progress could significantly influence the relationship among economic and financial variables.
CONTACT Xia Wang, wxia@ruc.edu.cn, School of Economics, Renmin University of China, Beijing, 100872, China. Supplementary materials for this article are available online. Please go to www.tandfonline.com/UBES.
Structural change has drawn much attention in the literature. Examples in the linear regression models include Bai and Perron (1998) and Qu and Perron (2007). Besides, there also exists an extensive literature on testing for structural changes in factor models; see, for example, Breitung and Eickmeier (2011), Chen, Dolado, and Gonzalo (2014), Corradi and Swanson (2014), Han and Inoue (2015), Su and Wang (2017, 2020a), and Baltagi, Kao, and Wang (2021). Despite the vast literature on structural changes in factor models, little attention has been paid to the instability of FAVAR models. Eickmeier, Lemke, and Marcellino (2015) introduce a TV-FAVAR model by modeling the TV factor loadings and coefficients as random walk processes. Li, Tosasukul, and Zhang (2020) introduce a functional-coefficient predictive model with latent factor regressors. Wei and Zhang (2020) propose a TV diffusion index model. Yan and Cheng (2022) also consider a factor-augmented predictive regression and introduce a threshold structure to capture parameter instability. Cai and Liu (2021) allow the coefficients of the FAVAR models to vary with certain state variables. Most of these papers assume the factor loadings are time-invariant and only impose instability on the VAR coefficients. Moreover, they only consider the consistency of estimation and do not establish the limiting distribution of the estimated coefficients.
In this article, we propose a novel TV-FAVAR model that allows the unknown factor loadings and VAR coefficients to change smoothly over time. We suggest a two-stage procedure to estimate the model. In the first stage, we estimate a TV factor model by the local principal component analysis (PCA) proposed by Su and Wang (2017, 2020a) to obtain consistent estimates for the common factors. In the second stage, we augment a VAR model using the estimated common factors and then estimate the TV VAR coefficients by a local smoothing approach. We further establish the limiting distributions of the estimated VAR coefficients under the standard large $N$ and large $T$ setting. Besides, we propose three test statistics to gauge the possible sources of TV features in the FAVAR model. Let $\{X_{it}\}$ be the dataset used to obtain the estimates of the common factors and factor loadings, and $\{Y_t\}$ be the vectors used in the VAR model. To construct our tests, we first estimate a time-invariant FAVAR to collect the estimates for the time-invariant factor loadings and VAR coefficients and run two additional regressions using the local smoothing approach. In the first regression, we regress $\{X_{it}\}$ on the estimated common factors via local smoothing to obtain the estimates for TV factor loadings. In the second regression, we replace the unobserved common factors with the estimated common factors and estimate the VAR coefficients via local smoothing. The test statistics are then constructed by measuring the $L_2$-distances between these two sets of estimators.
In an empirical application, we use the proposed TV-FAVAR model to check whether the U.S. economy suffers from structural changes and evaluate the forecasting performance of our approach for some key variables. We find strong evidence of structural changes in both the factor structure and the VAR dynamics and show that the TV-FAVAR model delivers better forecasting performance than Bai and Ng's (2006) conventional time-invariant FAVAR model.
The rest of this article is organized as follows. In Section 2, we introduce the TV-FAVAR model. In Section 3, we propose a two-stage estimation procedure to estimate the TV-FAVAR model and study the asymptotic properties of the estimators. In Section 4, we construct three test statistics to test for the TV factor loadings and/or VAR coefficients and study the asymptotic null distributions and asymptotic local power properties. Section 5 studies the finite sample performance of our estimators and tests via simulations, and Section 6 provides empirical applications. Section 7 concludes. All proofs are contained in the supplementary materials.
Notation. For a real matrix $A$, we denote its transpose as $A'$, its Frobenius norm as $\|A\| \equiv [\mathrm{tr}(AA')]^{1/2}$, and its spectral norm as $\|A\|_{sp}$. The operators $\overset{p}{\to}$, $\overset{d}{\to}$, and plim denote convergence in probability, convergence in distribution, and probability limit, respectively. We use $(N, T) \to \infty$ to denote that $N$ and $T$ pass to infinity jointly. Let $C < \infty$ denote a positive constant that may vary from case to case.

The Model
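Let $\{X_{it}, i \in [N]; t = -p+1, \ldots, T\}$ be an $N$-dimensional time series with $T + p$ observations, where $p$ is the lag order of the FAVAR model. For notational simplicity, we assume that the total number of time series observations is $T + p$. We follow the lead of Su and Wang (2017) to assume that $X_{it}$ admits a TV factor structure with $R$ common factors $F_t = (F_{1t}, \ldots, F_{Rt})'$:

$$X_{it} = \lambda_{it}' F_t + e_{it}, \tag{2.1}$$

where $\{e_{it}\}$ can be weakly dependent across both the cross-sectional unit $i$ and time period $t$. Let $\{Y_t, t = -p+1, \ldots, T\}$ be a $K$-dimensional random vector with $T + p$ observations. We assume that $(Y_t', F_t')'$ is generated via the following TV VAR($p$) process:

$$\begin{pmatrix} Y_t \\ F_t \end{pmatrix} = \sum_{j=1}^{p} \phi_{jt} \begin{pmatrix} Y_{t-j} \\ F_{t-j} \end{pmatrix} + \varepsilon_t, \tag{2.2}$$

where $\phi_{jt}$, $j \in [p]$, are $(R+K) \times (R+K)$ matrices of TV VAR coefficients and $\varepsilon_t$ is the error term. In particular, with $\phi^{(1,1)}_{jt}$ and $\phi^{(1,2)}_{jt}$ denoting the upper-left $K \times K$ and upper-right $K \times R$ blocks of $\phi_{jt}$, and $\varepsilon^{Y}_t$ denoting the first $K$ elements of $\varepsilon_t$, (2.2) implies that:

$$Y_t = \sum_{j=1}^{p} \phi^{(1,1)}_{jt} Y_{t-j} + \sum_{j=1}^{p} \phi^{(1,2)}_{jt} F_{t-j} + \varepsilon^{Y}_t. \tag{2.3}$$

It can be regarded as an extension of Stock and Watson's (2002) diffusion index model by allowing for structural changes in the regression coefficient matrices.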
To consistently estimate the proposed TV-FAVAR model, we follow Robinson (1989, 1991) to specify the TV factor loadings and the TV VAR coefficients as deterministic functions of the rescaled time index $t/T$:

$$\lambda_{it} = \lambda_i(t/T) \quad \text{and} \quad \phi_{jt} = \phi_j(t/T), \tag{2.4}$$

where $\lambda_i(\cdot)$ and $\phi_j(\cdot)$ are unknown smooth functions on $(0, 1]$ for each $i$ and $j$. Such a specification is widely adopted by the nonparametric TV models; see Cai (2007), Chen and Hong (2012), and Su and Wang (2017), among many others. Model (2.1) is the factor model with TV factor loadings considered by Su and Wang (2017, 2020b). As is well known, $\lambda_{it}$ and $F_t$ are not separately identifiable. Let $\Lambda_t = (\lambda_{1t}, \ldots, \lambda_{Nt})'$ and $F = (F_{-p+1}, \ldots, F_T)'$. We follow Bai and Ng (2002, 2006) and Bai (2003) to impose the identification conditions that $(T + p)^{-1} F'F = I_R$ and $N^{-1}\Lambda_t'\Lambda_t$ is diagonal with descending diagonal elements. In addition, Su and Wang (2017) have studied consistent determination of $R$ based on local PCA estimates. Therefore, we assume that $R$ is known in this article.
Remark 1. There is a growing literature on TV factor loadings that specifies the TV factor loadings as a random walk process or a VAR process; see Stock and Watson (2002), Banerjee, Marcellino, and Masten (2008), Del Negro and Otrok (2009), Bates et al. (2013), Eickmeier, Lemke, and Marcellino (2015), and Mikkelsen, Hillebrand, and Urga (2019). Similarly, Mumtaz and Surico (2012) consider TV coefficients in the factor process. Most of these papers estimate the unknown parameters using the Bayesian approach. The interpretations and implications behind the stochastic and deterministic specifications for the TV parameters are inherently distinct. Cogley and Sargent (2001) point out that the stochastic fluctuations in the parameters of a reduced-form economic system may result from the evolving beliefs of a policymaker. In contrast, the deterministic time-varying coefficients arise when structural changes exist. In this article, we aim to incorporate structural changes into the FAVAR framework. Hence, we specify the TV factor loadings and VAR coefficients as deterministic functions of $t/T$ and apply the nonparametric kernel method to estimate the TV parameters. It is worth mentioning that the kernel method can also be used to estimate parameters associated with stochastic time variation; see Giraitis, Kapetanios, and Yates (2014) and Giraitis, Kapetanios, and Marcellino (2021) for the kernel estimation of TV coefficients under stochastic specifications without and with endogeneity, respectively. We note that most of the existing studies specify the TV coefficients as either a stochastic or deterministic form without any formal justification. We conjecture that it is possible to follow the lead of Fu et al. (2022) and propose some formal tests to distinguish these two forms in the factor literature.
Remark 2. The factor model in (2.1) is a TV version of the static factor model studied by Bai and Ng (2002) and Bai (2003), where the impact of a factor occurs only contemporaneously in all series. Alternatively, one can follow the lead of Barigozzi et al. (2021) and consider a TV generalized dynamic factor model (GDFM). That paper extends the GDFM of Forni et al. (2000, 2005, 2017) and proves the consistency of their estimators of the TV impulse response functions. Nevertheless, the estimation procedures for these two models are different, and it is quite challenging to derive a consistent estimator for the common shocks (primitive factors) in a TV GDFM. Hence, it is difficult to extend our estimation and testing procedure to the TV GDFM, and we leave it for future research.

Further Representations
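Since $F_t$ is unobservable, we replace $F_t$ with a generic estimator $\hat F_t$, which is consistent up to a TV rotation matrix $H_t = H(t/T)$. Accordingly, (2.2) can be rewritten in terms of the rotated factors $H_t F_t$, with rotated VAR coefficients $\psi_{jt}$ in place of $\phi_{jt}$; see (2.5) and (2.6). Because the rotation matrix is TV, it is inappropriate to assume the $(2,2)$ block of the rotated coefficient matrices to be constant if one adopts a TV factor model.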

Estimation
In this section, we propose a two-stage method to estimate the TV-FAVAR model and establish the estimators' asymptotic distributions.

Two-Stage Estimation Procedure
To consistently estimate the parameters in the TV-FAVAR model described by (2.1) and (2.2), we propose a two-stage estimation procedure. In the first stage, we estimate the TV factor model in (2.1) by the local PCA approach of Su and Wang (2017), while in the second stage, we replace the latent common factor $F_t$ in (2.2) with the estimator $\hat F_t$ obtained in the first stage and estimate the TV VAR coefficients via a local smoothing procedure.
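The second stage can be illustrated with a minimal sketch of kernel-weighted (local constant) least squares for a VAR. It is not the authors' implementation: the function names, the Epanechnikov kernel, the absence of an intercept, and the use of a plain (not boundary-corrected) kernel are all assumptions made for illustration. The input `W` stands for the stacked vector $(Y_t', \hat F_t')'$.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel (an assumed choice of K)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_constant_var(W, p, h):
    """Kernel-weighted least squares for a VAR(p) without intercept.

    W : (T, m) array whose rows play the role of (Y_t', F_hat_t')'.
    Returns an array of shape (T - p, m, m * p); entry t holds the
    coefficient matrix estimated locally around observation t.
    """
    T, m = W.shape
    # Regressor Z_t stacks the lags W_{t-1}, ..., W_{t-p} (assumed layout).
    Z = np.hstack([W[p - j:T - j] for j in range(1, p + 1)])
    Wt = W[p:]
    n = T - p
    s = np.arange(n)
    coefs = np.empty((n, m, m * p))
    for t in range(n):
        k = epanechnikov((s - t) / (n * h))      # kernel weights around t
        ZkZ = (Z * k[:, None]).T @ Z
        ZkW = (Z * k[:, None]).T @ Wt
        coefs[t] = np.linalg.solve(ZkZ, ZkW).T   # W_t ~ coefs[t] @ Z_t
    return coefs
```

With a very large bandwidth the weights become nearly flat and the estimate approaches ordinary least squares, which is a convenient sanity check.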

Stage 1: Estimating the TV Factor Model
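Let $k_{h_1,tr} \equiv h_1^{-1} K\big((t-r)/(Th_1)\big)$, where $K : \mathbb{R} \to \mathbb{R}^{+}$ denotes a kernel function and $h_1 \equiv h_{1NT}$ is a bandwidth. For each fixed $r \in \{-p+1, \ldots, T\}$, we construct a $(T + p) \times N$ matrix $X^{(r)} = (X^{(r)}_1, \ldots, X^{(r)}_N)$ of kernel weighted observations $X^{(r)}_{it} = k_{h_1,tr}^{1/2} X_{it}$ and a $(T + p) \times R$ matrix of kernel weighted common factors $F^{(r)}$ with rows $F^{(r)\prime}_t = k_{h_1,tr}^{1/2} F_t'$. Following Su and Wang (2017), we can consistently estimate the TV factor loadings and the kernel weighted common factors by solving the following constrained minimization problem:

$$\min_{F^{(r)},\, \Lambda_r} \mathrm{tr}\big[(X^{(r)} - F^{(r)}\Lambda_r')(X^{(r)} - F^{(r)}\Lambda_r')'\big]$$

subject to $(T+p)^{-1} F^{(r)\prime} F^{(r)} = I_R$ and $\Lambda_r'\Lambda_r$ being diagonal with descending diagonal elements.
The estimated factor matrix, denoted by $\hat F^{(r)} = (\hat F^{(r)}_{-p+1}, \ldots, \hat F^{(r)}_T)'$, is $\sqrt{T + p}$ times the eigenvectors corresponding to the $R$ largest eigenvalues of $X^{(r)} X^{(r)\prime}$, arranged in descending order, and $\hat\Lambda_r' = (\hat F^{(r)\prime}\hat F^{(r)})^{-1}\hat F^{(r)\prime} X^{(r)} = (T + p)^{-1}\hat F^{(r)\prime} X^{(r)}$ gives the estimator of the corresponding TV factor loading matrix. We note that $\hat F^{(r)}$ is only consistent for a rotational version of $F^{(r)}$. To obtain a consistent estimator for the original common factor, we regress $\{X_{it}, i = 1, \ldots, N\}$ on $\{\hat\lambda_{it}, i = 1, \ldots, N\}$ for each $t$ to get the updated estimator $\hat F_t = \big(\sum_{i=1}^{N}\hat\lambda_{it}\hat\lambda_{it}'\big)^{-1}\sum_{i=1}^{N}\hat\lambda_{it}X_{it}$. As shown by Su and Wang (2017, 2020b), the estimated common factor $\hat F_t$ is a consistent estimator for the latent factor $F_t$ up to a TV rotation matrix $H_t \equiv H(t/T)$.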
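The first stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the sample is indexed $0, \ldots, T-1$ (so `T` plays the role of $T+p$), the kernel is Epanechnikov, and the function names `local_pca` and `two_step_factors` are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel (an assumed choice of K)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_pca(X, r, R, h):
    """Local PCA at time index r for a (T, N) panel X.

    Returns the kernel-weighted factor estimate (T, R) and the
    loading estimate at r (N, R).
    """
    T, N = X.shape
    t = np.arange(T)
    k = epanechnikov((t - r) / (T * h)) / h       # weights k_{h,tr}
    Xr = np.sqrt(k)[:, None] * X                  # rows scaled by k^{1/2}
    eigval, eigvec = np.linalg.eigh(Xr @ Xr.T)    # eigenvalues ascending
    top = eigvec[:, ::-1][:, :R]                  # R leading eigenvectors
    F_r = np.sqrt(T) * top                        # so that F_r'F_r / T = I_R
    Lam_r = Xr.T @ F_r / T                        # loadings at r
    return F_r, Lam_r

def two_step_factors(X, R, h):
    """For each t, regress X_t on the local loadings to update F_hat_t."""
    T, N = X.shape
    F_hat = np.empty((T, R))
    for r in range(T):
        _, Lam_r = local_pca(X, r, R, h)
        F_hat[r] = np.linalg.solve(Lam_r.T @ Lam_r, Lam_r.T @ X[r])
    return F_hat
```

Each evaluation point requires one eigendecomposition of a $T \times T$ matrix, so the loop costs $T$ decompositions in total; for long samples one might instead decompose the $N \times N$ matrix when $N < T$.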

Limiting Distribution
In this section, we establish the asymptotic distribution of the second-stage estimator of the TV VAR coefficients. Note that (2.1) is a purely TV factor model, and the asymptotic distributions of the estimated common factors and factor loadings have been explicitly analyzed by Su and Wang (2017, 2020b). Let $e_t \equiv (e_{1t}, \ldots, e_{Nt})'$ and $e \equiv (e_{-p+1}, \ldots, e_T)'$. We make the following assumptions.
Assumption A.1 slightly strengthens Assumption A.1 in Su and Wang (2017). We require the existence of eight-plus moments for $e_{it}$ and $F_t$ to facilitate the analysis of the specification tests in the next section. The condition on $\|e\|_{sp}$ in Assumption A.1(i) has been widely assumed in the literature; see, for example, Moon and Weidner (2015, 2017), Li, Qian, and Su (2016), and Lu and Su (2016). Assumption A.1(i) allows $e_{it}$ to be both unconditionally heteroscedastic over $(i, t)$ and conditionally heteroscedastic given $F_t$. Assumption A.1(ii) imposes some moment conditions on the common factors, which appear more restrictive than those in the existing literature (e.g., Eichler, Motta, and Von Sachs 2011; Motta et al. 2011; Barigozzi et al. 2021). In particular, we notice that Assumption 4 in Barigozzi et al. (2021) imposes the moment condition $E|u_{jt}|^{r^*} \le C$ for some $r^* > 4$ on the common shock, which is weaker than our moment condition. Actually, the condition that $\max_t E\|F_t\|^{8+\sigma} \le C$ can be relaxed to $\max_t E\|F_t\|^{4+\sigma} \le C$ if one is concerned only with the consistency of the estimation but not the testing under our weak dependence conditions below. If we follow some existing literature such as Motta et al. (2011) to impose serial independence on $\{F_t\}$, our moment condition on $\{F_t\}$ can also be weakened to $\max_t E\|F_t\|^{4} \le C$. Assumption A.1(iii) imposes smoothness conditions on $\lambda_{it}$ explicitly. Assumption A.2 specifies conditions on the kernel function and bandwidths, the first two parts of which are the same as Assumption A.3(i) and (ii) in Su and Wang (2017).
Assumption A.3 relates to the VAR part of the TV-FAVAR model. Assumption A.3(i) requires that the TV VAR coefficients be element-wise third-order continuously differentiable. This condition can be weakened to second-order continuous differentiability by strengthening the requirements on the bandwidth parameters from $Th_2^7 \to 0$ to $Th_2^5 \to 0$ in Assumption A.2(iii). Assumption A.3(ii) provides moment conditions on $\varepsilon_t$ and $Z^0_t$, and Assumption A.3(iii) assumes that the second-moment matrix $\Sigma_{ZZ,t}$ is positive definite for each $t$. Assumption A.3(iv) requires that $\varepsilon_t$ satisfies a martingale-difference-type condition in the FAVAR model, which helps us to establish the asymptotic normality of the estimated VAR coefficients. Assumptions A.3(iii) and (iv) are quite similar to the first part of Assumption E in Bai and Ng (2006).
To derive the limiting distribution of the estimated VAR coefficients, we add some notation: a diagonal matrix of eigenvalues arranged in descending order, with $\Upsilon_t$ being the corresponding (normalized) eigenvector matrix, and the asymptotic bias terms $B^Y_{t,j}$ and $B^F_{t,j}$, which are respectively defined in (S1.1) and (S1.2) in the supplementary materials.
From the definitions of $B^Y_{t,j}$ and $B^F_{t,j}$ in (S1.1) and (S1.2), we note that the asymptotic bias of the estimated VAR coefficients is fairly complicated. It mainly arises from the error in the first-stage estimation of $F_t$. Replacing $F_t$ by $\hat F_t$ in the FAVAR brings in the estimation error $\hat F_t - H_t F_t$, which is $O_P\big((N/\ln T)^{-1/2} + h_1^2\big)$ uniformly in $t$. Such an error term does not contribute to the asymptotic distribution of the estimator under the condition that $\sqrt{Th_2}\,\big((N/\ln T)^{-1/2} + h_1^2\big) = o(1)$, which is ensured by Assumption A.2(iii). Nevertheless, $H_t$ is random and depends on the whole sample $\{X_{it}\}$, and we have to replace it by its probability limit $Q_t^{-1}$ in the asymptotic analysis. When $\lambda_{it}$ is TV, so is $Q_t \equiv Q(t/T)$. This explains why several terms associated with the first- or second-order derivatives of functions of $Q_t$ enter the asymptotic bias terms. In the special case where $Q_t$ is time-invariant, so that it can be written as $Q$, the expressions for $B^Y_{t,j}$ and $B^F_{t,j}$ simplify. The simplified bias is still slightly more complicated than the usual second-order bias of local constant functional coefficient estimation because the second moment matrices $\Sigma_{ZY,jt}$ and $\Sigma_{ZF,jt}$ typically depend on time $t$ when $\phi_{jt}$ is TV. $B^Y_{t,j}$ and $B^F_{t,j}$ contain terms related to the derivatives of unknown TV coefficients, which are hard to estimate consistently. Thus, we advocate using an under-smoothing bandwidth to eliminate the asymptotic bias effect and hence focus on the estimation of the asymptotic variance-covariance (VC) matrix.
By the proof of Theorem 3.1 in the supplementary materials, the sample second-moment matrices are consistent for their population counterparts, so the asymptotic VC matrix can be consistently estimated by its sample analog. The resulting estimate is the local version of the White-Eicker estimate of the asymptotic variance and is robust to conditional heteroscedasticity. If we further assume conditional homoscedasticity, so that $E(\varepsilon_t\varepsilon_t' \mid Z_t) = \Sigma_\varepsilon$ for any $t$, the estimate can be simplified accordingly.

Remark 3 (Lag Order Selection). In the above estimation procedure, we have assumed that the lag order $p$ is known. In practice, $p$ is usually unknown and should be determined a priori. We consider a BIC-type information criterion (IC) for the FAVAR model.
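To illustrate how an information criterion selects the lag order, the sketch below uses the textbook BIC for a time-invariant VAR, $\ln|\hat\Sigma_p| + p\,m^2 \ln(n)/n$, as a stand-in. This generic form is an assumption and is not claimed to coincide with the criterion used for the TV-FAVAR model; the function name `bic_lag_order` is hypothetical.

```python
import numpy as np

def bic_lag_order(W, p_max):
    """Pick the VAR lag order by a textbook BIC (assumed form).

    W : (T, m) array of observations. All candidate orders are fitted
    on the same effective sample t = p_max, ..., T-1 so that the
    criteria are comparable across p.
    """
    T, m = W.shape
    n = T - p_max
    Wt = W[p_max:]
    best_p, best_ic = None, np.inf
    for p in range(1, p_max + 1):
        # Lagged regressors W_{t-1}, ..., W_{t-p} on the common sample.
        Z = np.hstack([W[p_max - j:T - j] for j in range(1, p + 1)])
        B, *_ = np.linalg.lstsq(Z, Wt, rcond=None)
        U = Wt - Z @ B                               # residuals
        _, logdet = np.linalg.slogdet(U.T @ U / n)   # ln|Sigma_hat_p|
        ic = logdet + p * m * m * np.log(n) / n      # fit plus penalty
        if ic < best_ic:
            best_p, best_ic = p, ic
    return best_p
```

In a two-stage setting, `W` would be built from $Y_t$ and the estimated factors, so the selected order inherits the first-stage estimation error.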

Specification Testing
In this section, we propose tests to detect TV features in the FAVAR model. As mentioned above, it is inappropriate to test constancy of $\phi_{jt}$, $j \in [p]$, if the factor loadings $\lambda_{it}$'s are TV.

Hypotheses of Interest
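We first consider the following null hypothesis:

$$H_0^{(1)}: \lambda_{it} = \lambda_{i0} \text{ and } \phi_{jt} = \phi_{j0} \text{ for all } i \in [N],\ j \in [p], \text{ and } t,$$

for some $\lambda_{i0} \in \mathbb{R}^{R}$ and $\phi_{j0} \in \mathbb{R}^{(R+K) \times (R+K)}$. The alternative hypothesis $H_A^{(1)}$ is the negation of $H_0^{(1)}$: at least one of these equalities fails for some $(i, j, t)$ and for all $\lambda_{i0} \in \mathbb{R}^{R}$ and $\phi_{j0} \in \mathbb{R}^{(R+K) \times (R+K)}$. Under $H_0^{(1)}$, both the VAR coefficients and the factor loadings are time-invariant. Then the model degenerates to the time-invariant FAVAR model considered by Bai and Ng (2006). Under $H_A^{(1)}$, either the VAR coefficients or the factor loadings can vary over time, so the conventional PCA and the least squares estimators of the FAVAR model parameters typically fail to be consistent.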
When we reject $H_0^{(1)}$, it is of further interest to explore the source of rejection. We then test

$$H_0^{(2)}: \lambda_{it} = \lambda_{i0} \text{ for all } (i, t) \quad \text{and} \quad H_0^{(3)}: \phi_{jt} = \phi_{j0} \text{ for all } (j, t),$$

for some $\lambda_{i0}$ and $\phi_{j0}$, where $H_0^{(2)}$ and $H_0^{(3)}$ check for the time-invariance of the factor loadings and the VAR coefficients, respectively. The alternative hypotheses $H_A^{(2)}$ and $H_A^{(3)}$ are the negations of $H_0^{(2)}$ and $H_0^{(3)}$, respectively. Superficially, we can allow the VAR coefficients to be TV under $H_0^{(2)}$ and the factor loadings to be TV under $H_0^{(3)}$. However, testing $H_0^{(3)}$ is meaningful only when one fails to reject $H_0^{(2)}$. In this case, we can estimate the factor model by the conventional PCA and then test $H_0^{(3)}$. In the following analysis of the test statistic for $H_0^{(3)}$, we do not need to assume the factor loadings in (2.1) to be exactly time-invariant. Instead, it suffices to assume that they satisfy certain restrictions under the local alternatives.

Test Statistics
Under $H_0^{(1)}$, we can follow Stock and Watson (2002) and Bai and Ng (2006) to estimate a time-invariant FAVAR model. Let $\Lambda_0 \equiv (\lambda_{10}, \ldots, \lambda_{N0})'$. We estimate the factor model $X_{it} = \lambda_{i0}'F_t + e_{it}$ by solving the following minimization problem: $\min_{F, \Lambda_0} \mathrm{tr}\big[(X - F\Lambda_0')(X - F\Lambda_0')'\big]$ subject to $(T + p)^{-1}F'F = I_R$ and $\Lambda_0'\Lambda_0$ being diagonal with descending diagonal elements. Let $\tilde\lambda_{i0}$ and $\tilde F_t$ be the PCA estimators of $\lambda_{i0}$ and $F_t$, respectively. Let $\tilde F \equiv (\tilde F_{-p+1}, \ldots, \tilde F_T)'$ and $\tilde\Lambda_0 \equiv (\tilde\lambda_{10}, \ldots, \tilde\lambda_{N0})'$. As is well known, $\tilde F$ equals $\sqrt{T + p}$ times the eigenvectors corresponding to the $R$ largest eigenvalues of $XX'$, and $\tilde\Lambda_0' = (T+p)^{-1}\tilde F'X$. In the second stage, we regress $\tilde W_t$ on $\tilde Z_t$ to obtain the least squares estimator $\tilde\Phi_0$. After obtaining the restricted estimators $\tilde\lambda_{i0}$, $\tilde F_t$, and $\tilde\Phi_0$, we consider the TV regressions:

$$X_{it} = \lambda_i(t/T)'\tilde F_t + e^{\dagger}_{it} \quad \text{and} \quad \tilde W_t = \Phi^{\dagger}(t/T)\tilde Z_t + U^{\dagger}_t, \tag{4.3}$$

where $e^{\dagger}_{it}$ and $U^{\dagger}_t$ are the respective error terms in the above regressions that account for the estimation errors introduced by replacing $F_t$ with $\tilde F_t$. Clearly, the first regression in (4.3) is a nonparametric time series regression that regresses $X_{it}$ on the estimated common factor $\tilde F_t$ for each $i$. To motivate our test statistics, we consider three potential cases:

1. When $H_0^{(1)}$ holds, any nonparametric consistent estimators for $\lambda_i(\cdot)$ and $\Phi^{\dagger}(\cdot)$ in (4.3) should converge to the same probability limits as the restricted estimators $\tilde\lambda_{i0}$ and $\tilde\Phi_0$, respectively. In contrast, when $H_0^{(1)}$ fails so that the factor loadings and/or the VAR coefficients are TV, the nonparametric consistent estimators for $\lambda_i(\cdot)$ or $\Phi^{\dagger}(\cdot)$ should deviate significantly from $\tilde\lambda_{i0}$ or $\tilde\Phi_0$, respectively. It implies that we can construct a test statistic to test $H_0^{(1)}$ based on the distance between the nonparametric estimators of $\lambda_i(\cdot)$ and $\Phi^{\dagger}(\cdot)$ and the restricted estimators $\tilde\lambda_{i0}$ and $\tilde\Phi_0$.

2. When $H_0^{(2)}$ holds, the nonparametric consistent estimator for $\lambda_i(\cdot)$ in (4.3) should be close to the restricted estimator $\tilde\lambda_{i0}$. However, if $H_0^{(2)}$ is false, the probability limit of the nonparametric consistent estimator of $\lambda_i(\cdot)$ should deviate from that of $\tilde\lambda_{i0}$. It suggests that we can construct a test statistic to test $H_0^{(2)}$ based on the distance between the nonparametric consistent estimator for $\lambda_i(\cdot)$ and the restricted estimator $\tilde\lambda_{i0}$.

3. The VAR part of the FAVAR model requires consistent estimation for the latent common factors. When the factor model admits TV factor loadings, the estimated factors will contain the TV features of the factor loadings. As a result, the autoregressive coefficients in the VAR representation will become TV since one cannot consistently estimate $\phi_{jt}$ but rather $\psi_{jt}$. Thus, it is meaningless to test constancy of $\psi_{jt}$ if the factor loadings are TV. In fact, if the factor model in (2.1) suffers from structural changes, it is better to adopt the TV-FAVAR model no matter whether $\phi_{jt}$ is TV or not. We hence recommend testing $H_0^{(3)}$ only when we fail to reject $H_0^{(2)}$.
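To construct our test statistics, we use the local constant (Nadaraya-Watson) estimators for $\lambda_i(t/T)$ and $\Phi^{\dagger}(t/T)$. To avoid the boundary problem, we follow Hong and Li (2005), Li and Racine (2007), and Su and Wang (2020a) to adopt a boundary kernel, denoted $k^{\dagger}_{h,tr}$.
Note that $k^{\dagger}_{h,tr}$ coincides with $k_{h,tr}$ in the interior region but not in the boundary regions. The local constant estimators of $\lambda_i(t/T)$ and $\Phi^{\dagger}(t/T)$ are, respectively, given by

$$\breve\lambda_{it} = \breve\lambda_i(t/T) = \Big(\sum_{s} k^{\dagger}_{h_1,st}\tilde F_s\tilde F_s'\Big)^{-1}\sum_{s} k^{\dagger}_{h_1,st}\tilde F_s X_{is} \quad \text{and} \quad \breve\Phi_t' = \Big(\sum_{s} k^{\dagger}_{h_2,st}\tilde Z_s\tilde Z_s'\Big)^{-1}\sum_{s} k^{\dagger}_{h_2,st}\tilde Z_s\tilde W_s',$$

where $h_1$ and $h_2$ are the bandwidths.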
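The following sketch illustrates one common way to build such a boundary kernel: renormalize the ordinary kernel weight by the kernel mass that falls inside the sample, so that the weights coincide with the ordinary ones in the interior. The specific form, the Epanechnikov kernel, and the function names are assumptions for illustration; they are not claimed to reproduce the exact kernel used in the tests.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel (an assumed choice of K)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def epan_cdf(a):
    """Integral of the Epanechnikov kernel from -1 to a."""
    a = np.clip(a, -1.0, 1.0)
    return 0.5 + 0.75 * a - 0.25 * a**3

def boundary_kernel_weights(r, T, h):
    """Renormalized kernel weights for t = 1..T at evaluation point r."""
    t = np.arange(1, T + 1)
    k = epanechnikov((t - r) / (T * h)) / h
    # Kernel mass inside (0, 1] in rescaled time; equals 1 in the interior.
    mass = epan_cdf((1.0 - r / T) / h) - epan_cdf(-(r / T) / h)
    return k / mass

def local_constant_loadings(X, F, h):
    """Local constant loading estimates, shape (T, N, R).

    X : (T, N) panel, F : (T, R) estimated factors.
    """
    T, N = X.shape
    lam = np.empty((T, N, F.shape[1]))
    for r in range(1, T + 1):
        k = boundary_kernel_weights(r, T, h)
        FkF = (F * k[:, None]).T @ F
        FkX = (F * k[:, None]).T @ X
        lam[r - 1] = np.linalg.solve(FkF, FkX).T
    return lam
```

Because the local constant estimator is a ratio of weighted sums, a common rescaling of the weights at a given evaluation point leaves the point estimate unchanged; the renormalization matters for quantities that use the weights directly, such as centering and scaling terms.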
We then test $H_0^{(1)}$ by measuring the quadratic distance between $(\breve\lambda_{it}, \breve\Phi_t)$ and $(\tilde\lambda_{i0}, \tilde\Phi_0)$, and test $H_0^{(2)}$ (resp. $H_0^{(3)}$) by measuring the quadratic distance between $\breve\lambda_{it}$ and $\tilde\lambda_{i0}$ (resp. $\breve\Phi_t$ and $\tilde\Phi_0$). Denote the distance measures for $H_0^{(2)}$ and $H_0^{(3)}$ by $M_2$ and $M_3$, respectively. Given that $M_2$ and $M_3$ shrink to zero at different rates under the respective null hypotheses, a simple summation of them does not generate a good statistic for testing $H_0^{(1)}$. As a result, we consider the standardized test statistics $SM_1$, $SM_2$, and $SM_3$, which are obtained from these distance measures after recentering and scaling.

Remark 4. The above test statistics avoid the local PCA estimation. Alternatively, one can also construct the test statistics based on the local PCA estimates. First, we consider the test of $H_0^{(2)}$. If we estimate the model by the two-stage estimation procedure introduced in Section 3, the local PCA estimator $\hat F_t$ is only a consistent estimator for the latent factor $F_t$ up to $H_t$ in the presence of TV factor loadings. As a result, one cannot test $H_0^{(2)}$ by direct comparison of the unrestricted estimate $\hat F_t$ and the restricted estimate $\tilde F_t$ but can test $H_0^{(2)}$ by direct comparison of the restricted and unrestricted estimates of the common component ($\lambda_{it}'F_t$) as in Su and Wang (2017). This makes the derivation of the asymptotic distribution of the resulting test statistic more complicated than that of $SM_2$. Similar remarks hold for testing $H_0^{(1)}$. Second, we consider the test of $H_0^{(3)}$. The VAR part of the FAVAR model requires consistent estimation for the latent common factors. Recall that the local PCA estimator $\hat F_t$ is only consistent for $H_tF_t$ when the factor loadings are TV. This makes it impossible to test the constancy of VAR coefficients even if we use the local PCA estimator. In addition, the derivation of the asymptotic properties of the local PCA estimators is quite involved, not to mention that for the test statistics based on the local PCA estimators. For these reasons, we manage to avoid the use of local PCA estimates by considering the auxiliary regressions in (4.3).
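As a rough illustration of the distance idea, the sketch below computes an average squared distance between a path of time-varying estimates and a single restricted estimate. The $1/(NT)$ normalization and the function name are assumptions; the recentering and scaling that turn such a raw distance into a standardized statistic are not reproduced here.

```python
import numpy as np

def l2_distance(tv_est, restricted_est):
    """Average squared distance between TV and restricted estimates.

    tv_est         : (T, N, R) array, e.g. local constant loadings.
    restricted_est : (N, R) array, e.g. PCA loadings under the null.
    The 1/(N*T) normalization is an illustrative assumption.
    """
    T, N, _ = tv_est.shape
    diff = tv_est - restricted_est[None, :, :]
    return float(np.sum(diff**2) / (N * T))
```

The same function applies to the VAR part by passing a `(T, m, m*p)` path of coefficient matrices and an `(m, m*p)` restricted estimate, in which case the normalization is by the first two dimensions.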

Asymptotic Null Distributions
In this section, we study the asymptotic null distributions of $SM_1$, $SM_2$, and $SM_3$ under their respective null hypotheses. To proceed, we add the following assumption.

Assumption A.4. (iii) For each $i$, the process $\{e_{it}, t = -p+1, -p+2, \ldots\}$ is a martingale difference sequence (m.d.s.) with respect to $\mathcal{F}_{NT,t-1}$, such that $E(e_t \mid \mathcal{F}_{NT,t-1}) = 0$, where $\mathcal{F}_{NT,t-1}$ is the minimal $\sigma$-field generated by $(F_t, F_{t-1}, \ldots, e_{t-1}, e_{t-2}, \ldots)$; (iv) For each $t$, $W^0_t$ and $\varepsilon_t$ are independent of the idiosyncratic errors $e_{is}$ for all $i$ and $s$.

Assumptions A.4(i) and A.4(ii) impose some weak dependence conditions on the processes $\{e_{it}, F_t\}$ and $\{Y_t\}$, respectively. As Su and Wang (2020a) remark, with more complicated notation, one can allow different individual time series to have distinct mixing rates. Assumption A.4(iii) assumes that the process $\{e_{it}\}$ is an m.d.s. with respect to the filtration $\{\mathcal{F}_{NT,t}\}$ and it allows for cross-sectional dependence among the error terms. This assumption is essential for proving the asymptotic distribution of our test statistic under the null and local alternative hypotheses. It is possible to allow for both serial and cross-sectional dependence in $\{e_{it}\}$. However, it will substantially complicate the asymptotic analysis and we are not sure how to estimate the asymptotic variance of our test statistics in this case. Assumption A.4(iv) imposes independence between the idiosyncratic errors $e_{is}$ and the regressors and error terms in the FAVAR model. This assumption is also adopted by Bai and Ng (2006). It is essential for the asymptotic independence between $SM_2$ and $SM_3$ and greatly facilitates the derivation of the asymptotic variance of $SM_1$.
As mentioned above, the requirements on the bandwidth parameters $h_1$ and $h_2$ for the tests are different from those in Assumptions A.2(ii) and A.2(iii). Instead, we make a separate assumption, Assumption A.5, on $h_1$ and $h_2$. Assumption A.5(i) follows Su and Wang (2020a) and is needed to study the asymptotic properties of $SM_1$ and $SM_2$. Assumption A.5(ii) is needed to study $SM_1$ and $SM_3$. It is weaker than that in Assumption A.2(iii) because we do not need to consider the nonparametric estimation of the TV-FAVAR model under the global alternative.
The following theorem (Theorem 4.1) provides the asymptotic null distributions of the test statistics: each of $SM_1$, $SM_2$, and $SM_3$ is asymptotically standard normal under its respective null hypothesis. Under $H_0^{(1)}$, the smoothness conditions in Assumptions A.1(iii) and A.3(i) are automatically satisfied. The test statistics are based on the sample quadratic forms, which measure the squared distance between the local smoothing estimators and the global least squares estimators. All three tests are asymptotically pivotal and have a convenient asymptotic standard normal distribution under the corresponding null hypotheses. Since a large value of any statistic is in favor of the alternative, our tests are all one-sided.
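Because the statistics are asymptotically standard normal and the tests are one-sided, an asymptotic decision rule reduces to an upper-tail normal probability. A minimal sketch, with hypothetical function names, is:

```python
import math

def one_sided_pvalue(sm):
    """Upper-tail N(0, 1) probability for a standardized statistic."""
    return 0.5 * math.erfc(sm / math.sqrt(2.0))

def reject(sm, level=0.05):
    """Reject the null when the upper-tail p-value is below the level."""
    return one_sided_pvalue(sm) < level
```

At the 5% level this amounts to rejecting when the statistic exceeds roughly 1.645, rather than the two-sided cutoff of 1.96.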

Asymptotic Local Power
To study the asymptotic local power properties of $SM_l$, for $l = 1, 2,$ and $3$, we consider classes of local alternatives, indexed by their rates as in $H_A^{(2)}(a_{1NT})$, in which the factor loadings and the VAR coefficients deviate from their time-invariant counterparts by $a_{1NT}\,g_{1i}(t/T)$ and $a_{2T}\,g_2(t/T)$, respectively, where $a_{1NT} \to 0$ as $(N, T) \to \infty$, $a_{2T} \to 0$ as $T \to \infty$, and $g_{1i}(t/T)$ and $g_2(t/T)$ are piecewise smooth functions with a finite number of discontinuity points. Obviously, $a_{1NT}$ and $a_{2T}$ control the speed at which the local alternatives converge to the null hypotheses. We impose the normalization restrictions $\int_0^1 g_{1i}(\tau)\,d\tau = 0$ for all $i \in [N]$ and $\int_0^1 g_2(\tau)\,d\tau = 0$, which facilitates the analysis of the local power properties.
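To make the normalization concrete, the sketch below generates loadings under a local alternative using two illustrative deviation functions, one smooth and one with a single break, each averaging to zero over the sample. The particular functions, the use of a common deviation for all units and factors, and the function names are assumptions for illustration only.

```python
import numpy as np

def g_smooth(tau):
    """Smooth deviation that integrates to zero on (0, 1]."""
    return np.sin(2.0 * np.pi * tau)

def g_break(tau):
    """Single break at 1/2, recentered to integrate to zero."""
    return np.where(tau > 0.5, 0.5, -0.5)

def tv_loadings(lam0, a, g, T):
    """Loadings lam0 + a * g(t/T) for t = 1..T, shape (T, N, R).

    lam0 : (N, R) time-invariant loadings; a : deviation rate.
    The same scalar deviation is applied to every unit and factor.
    """
    tau = np.arange(1, T + 1) / T
    return lam0[None, :, :] + a * g(tau)[:, None, None]
```

Under the zero-mean normalization, the time average of the generated loadings equals `lam0`, so the deviation function captures only the time variation and not a shift in level.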
Assumption A.6. (i) For each $i \in [N]$, $g_{1i}(\cdot)$ is piecewise continuous with a finite number of discontinuity points on $(0, 1]$ and satisfies a bound that holds uniformly over $i \in [N]$; (ii) $g_2(\cdot)$ is piecewise continuous with a finite number of discontinuity points on $(0, 1]$ and satisfies a corresponding bound; (iii) a further condition is imposed on $g_{1ir}$ and $g_{2r}$, where $g_{1ir} = g_{1i}(r/T)$ and $g_{2r} = g_2(r/T)$.
Assumptions A.6(i) and A.6(ii) allow for both sudden breaks and smooth changes in the factor loadings and VAR coefficients under the local alternatives. Assumption A.6(iii) is similar to Assumption A.5(ii) in Su and Wang (2020a).
To state the next theorem, we add some notation. Let $V_{NT}$ and $V$ be $R \times R$ diagonal matrices containing the $R$ largest eigenvalues of $(1/NT)XX'$ and $\Sigma_\Lambda^{1/2}\Sigma_F\Sigma_\Lambda^{1/2}$ in descending order, respectively, and let $\Upsilon$ be the corresponding (normalized) eigenvector matrix. The following theorem (Theorem 4.2) provides the asymptotic local power of $SM_l$ for $l = 1, 2, 3$.

Theorem 4.2 shows that $SM_1$, $SM_2$, and $SM_3$ have nontrivial asymptotic power against their respective local alternatives. We note that Assumption A.6 allows for a finite number of unknown discontinuity points in the factor loadings and VAR coefficients. As a result, our tests have power for smooth structural changes and abrupt structural breaks, with possibly unknown break dates and an unknown number of breaks. Moreover, our tests do not require trimming the boundary regions of the sample. Hence, our tests can detect structural changes near the beginning or the end of the sample. In the proof of Theorem 4.2(iii), we maintain that $H_A^{(2)}(a_{1NT})$ holds for the reason that the VAR part of the FAVAR model requires consistent estimation for the common factor. Another reason for doing so is that the asymptotic properties of the factor estimators in the TV factor model remain unknown under the global alternative $H_A^{(2)}$.
Remark 5. If the factor loadings are TV, it is inappropriate to set the VAR coefficients to be time-invariant due to the presence of the TV rotation matrix. However, by the established relationship between $\phi_{jt}$ and $\psi_{jt}$, we have that $\psi^{(1,1)}_{jt} = \phi^{(1,1)}_{jt}$ no matter whether $\{\lambda_{it}\}$ are TV or not. Hence, we can test the null hypothesis that $\phi^{(1,1)}_{jt}$ is time-invariant even when $H_0^{(2)}$ does not hold. Following the proof of Theorem 4.2, we can derive the asymptotic behavior of the corresponding statistic under the local alternatives, where the quantities entering the limit, including $\mu_{3S}$, are defined in (S3.1) in the supplementary materials.

Bootstrap Versions of the Tests
As is well known, nonparametric kernel-based tests can have severe size distortions in finite samples and are also sensitive to the choice of bandwidth. To overcome these problems, we propose a resampling procedure to improve the finite sample performance of our tests.
Since we allow for weak cross-sectional dependence (CD) among the error terms in the factor model, standard wild bootstraps do not work well in the presence of CD. Here, we propose a bootstrap procedure that is robust to the presence of CD in {e it }. Let $\tilde{\Sigma}_e^0 \equiv T^{-1}\sum_{t=1}^{T} \tilde{e}_t \tilde{e}_t'$, and denote its (i, j)th element as $\tilde{\sigma}^0_{e,ij}$. To generate bootstrap errors {e * t } that asymptotically share the same variance-covariance structure as {e t }, we follow Fan, Liao, and Mincheva (2013) to obtain an estimate of $\Sigma_e$ that is consistent in the spectral norm.
, and C 0 is a positive constant. In this article, we let C 0 = 1 initially. If C 0 = 1 cannot deliver a positive definite matrix $\tilde{\Sigma}_e$, we choose C 0 to be the smallest value such that $\tilde{\Sigma}_e$ is positive definite, using a grid search. In most situations, $\tilde{\Sigma}_e$ is positive definite when C 0 = 1. In addition, let $\tilde{\Sigma}_U \equiv T^{-1}\sum_{t=1}^{T} \tilde{U}_t \tilde{U}_t'$, and denote its (i, j)th element as $\tilde{\sigma}_{U,ij}$. By construction, $\tilde{\Sigma}_e$ and $\tilde{\Sigma}_U$ are symmetric and positive semidefinite. The detailed bootstrap procedure is given as follows:

1. Estimate the restricted model $X_{it} = \lambda_{i0}' F_t + e_{it}$ by the conventional PCA to obtain the restricted estimates $\{\tilde{F}_t\}$ and $\{\tilde{\lambda}_{i0}\}$ in the first stage, and estimate $\hat{W}_t = \Psi_0' \hat{Z}_t + U_t^{\dagger}$ to obtain the least squares estimate $\tilde{\Psi}_0$ in the second stage. Obtain the nonparametric kernel estimates $\{\hat{\lambda}_{it}\}$ and $\{\breve{\Psi}_t\}$, and then compute SM 1 , SM 2 , and SM 3 , respectively.

2. For each t ∈ [T], generate an N × 1 vector $\vartheta_t = (\vartheta_{1t}, \ldots, \vartheta_{Nt})'$ and a (R + K) × 1 vector $\upsilon_t = (\upsilon_{1t}, \ldots, \upsilon_{(K+R)t})'$, where and $\vartheta_t$ is independent of $\upsilon_t$. We construct the bootstrap errors $e_t^* = \tilde{\Sigma}_e^{1/2} \vartheta_t$ and

The key step in the above bootstrap procedure is Step 2, in which we generate the bootstrap sample X * it as in Su and Wang (2017) and apply the fixed-regressor wild bootstrap to generate the bootstrap sample { W * t }. The latter is inspired by the fixed-regressor bootstrap procedure of Hansen (2000), who showed that there is no need to mimic the dynamic features of a time series process in the bootstrap world.
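As an illustration of the covariance step and of the construction of CD-robust bootstrap errors, the following is a minimal sketch rather than the exact procedure used in this article. It assumes a universal soft threshold C 0 · sqrt(log N / T) in place of the entry-adaptive rule of Fan, Liao, and Mincheva (2013), and it assumes that ϑ t is drawn as iid N(0, I N ). The function names and the simulated residuals are hypothetical.

```python
import numpy as np


def thresholded_cov(resid, c0):
    """Soft-threshold the off-diagonal entries of the residual covariance.

    resid : (T, N) array of residuals from the restricted factor model.
    c0    : positive constant scaling the threshold.
    ASSUMPTION: the universal threshold c0 * sqrt(log(N) / T) is illustrative;
    it is not the entry-adaptive rule of Fan, Liao, and Mincheva (2013).
    """
    T, N = resid.shape
    s0 = resid.T @ resid / T                      # raw covariance estimate
    tau = c0 * np.sqrt(np.log(N) / T)
    s = np.sign(s0) * np.maximum(np.abs(s0) - tau, 0.0)
    np.fill_diagonal(s, np.diag(s0))              # variances are not shrunk
    return s


def smallest_pd_constant(resid, grid):
    """Return the first c0 on the grid whose thresholded matrix is positive definite."""
    for c0 in grid:
        s = thresholded_cov(resid, c0)
        if np.linalg.eigvalsh(s).min() > 0.0:
            return c0, s
    raise ValueError("no grid value delivers a positive definite matrix")


def sym_sqrt(s):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, v = np.linalg.eigh(s)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T


def bootstrap_errors(sigma_e, T, rng):
    """Draw T bootstrap error vectors e*_t = sigma_e^{1/2} theta_t (one per row).

    ASSUMPTION: theta_t is drawn as iid N(0, I_N) for illustration.
    """
    N = sigma_e.shape[0]
    theta = rng.standard_normal((T, N))
    return theta @ sym_sqrt(sigma_e)              # the square root is symmetric


rng = np.random.default_rng(0)
resid = rng.standard_normal((200, 10))            # hypothetical residuals
c0, sigma_e = smallest_pd_constant(resid, grid=np.arange(1.0, 5.25, 0.25))
e_star = bootstrap_errors(sigma_e, T=200, rng=rng)
```

The grid starts at 1 to mirror the initial choice C 0 = 1; a larger constant zeroes more off-diagonal entries, so a sufficiently large value always yields a positive definite matrix when all residual variances are positive.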
The following theorem establishes the asymptotic validity of the above bootstrap procedure.

Theorem 4.3 (Asymptotic validity of the bootstrap procedure).
Let Assumptions A.1, A.2(i), and A.3-A.6 hold. Suppose that (i) there exists some γ 0 ∈ [0, 1) such that max 1≤i≤N |σ e,ij | γ 0 ≤ C,

is the key condition to ensure $\|\tilde{\Sigma}_e - \Sigma_e\|_{sp} = o_p(1)$ by following the analysis of Fan, Liao, and Mincheva (2013). The other side conditions are also imposed in Su and Wang (2020a) to facilitate the proof. Theorem 4.3 shows that the proposed bootstrap procedure provides an asymptotically valid approximation to the asymptotic null distributions of SM l for l = 1, 2, 3, whether or not the respective null hypotheses are satisfied. This implies that the bootstrap tests have the correct asymptotic size under their respective null hypotheses. Under the alternative hypotheses, the SM l 's diverge to infinity in probability. As a result, the bootstrap tests are consistent against TV parameters in the FAVAR model.
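The decision rule implied by the bootstrap tests can be sketched as follows. This is a minimal sketch that assumes the usual convention in which the bootstrapped p-value is the fraction of bootstrap statistics at least as large as the observed statistic; the function names and the numerical inputs are hypothetical.

```python
import numpy as np


def bootstrap_pvalue(stat, boot_stats):
    """Fraction of bootstrap statistics that are at least as large as `stat`.

    ASSUMPTION: one-sided rule (large values of the statistic reject the null).
    """
    boot_stats = np.asarray(boot_stats, dtype=float)
    return float(np.mean(boot_stats >= stat))


def reject(stat, boot_stats, alpha=0.05):
    """Reject the null when the bootstrapped p-value falls below alpha."""
    return bootstrap_pvalue(stat, boot_stats) < alpha


# Hypothetical observed statistic and B = 4 bootstrap statistics.
p_star = bootstrap_pvalue(2.0, [1.0, 2.0, 3.0, 4.0])
```

Because the statistics diverge under the alternatives while the bootstrap statistics mimic the null distribution, the p-value tends to zero under the alternatives, which is the sense in which the bootstrap tests are consistent.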

Monte Carlo Study
Here we study the finite sample performance of our estimators and tests via simulations.

Data Generating Process
We consider the following FAVAR(1) model: where we set the dimensions of Y t and F t to be K = 1 and R = 2, respectively. Denote as the logistic function with tuning parameter κ and location parameter γ = (γ 1 , . . ., γ L ) . We consider the following setups for the factor loadings λ it = (λ it,1 , λ it,2 ) and the VAR coefficients φ t .
DGP 5: (smooth changes in factor loadings and an abrupt break in VAR coefficients) λ it is the same as in DGP 4, and φ t is the same as that in DGP 3.

∼ iidU(0, 0.3).

DGP 8: (Multiple abrupt breaks in both factor loadings and VAR coefficients) iidN(1, 1), and φ t is the same as in DGP 7.

These DGPs describe various TV patterns in the factor loadings and VAR coefficients. DGPs 1 to 3 depict FAVAR models with time-invariant factor loadings and various types of VAR coefficients, that is, time-invariant VAR coefficients, VAR coefficients with smooth changes, and VAR coefficients with an abrupt structural break, respectively. DGPs 4 to 7 are FAVAR models with smooth-changing factor loadings and various types of TV VAR coefficients. Among them, DGPs 4 and 6 examine monotonic and non-monotonic smooth-changing VAR coefficients, and DGPs 5 and 7 consider VAR coefficients with a single abrupt structural break and multiple abrupt structural breaks, respectively. DGP 8 captures a FAVAR model with multiple structural breaks in both factor loadings and VAR coefficients.
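To illustrate how a logistic function with tuning parameter κ and location parameter γ can produce monotonic, non-monotonic, and nearly abrupt coefficient paths, the following is a minimal sketch. It assumes the common smooth-transition form G(z; κ, γ) = {1 + exp[−κ ∏ l (z − γ l )]}^{−1}, which may differ from the exact definition used in the DGPs above; the baseline 0.2, the amplitude 0.4, and all parameter values are hypothetical.

```python
import numpy as np


def logistic_transition(z, kappa, gamma):
    """Smooth-transition function G(z; kappa, gamma).

    ASSUMPTION: G = 1 / (1 + exp(-kappa * prod_l (z - gamma_l))).
    """
    z = np.asarray(z, dtype=float)
    prod = np.ones_like(z)
    for g in np.atleast_1d(gamma):
        prod = prod * (z - g)
    # Clip the exponent to avoid overflow when kappa is very large.
    return 1.0 / (1.0 + np.exp(-np.clip(kappa * prod, -700.0, 700.0)))


T = 200
tau = np.arange(1, T + 1) / T                     # rescaled time t/T

# Hypothetical coefficient paths: baseline 0.2 plus 0.4 times the transition.
phi_monotone = 0.2 + 0.4 * logistic_transition(tau, kappa=10.0, gamma=[0.5])
phi_nonmonotone = 0.2 + 0.4 * logistic_transition(tau, kappa=10.0, gamma=[0.3, 0.7])
phi_near_break = 0.2 + 0.4 * logistic_transition(tau, kappa=1e4, gamma=[0.5])
```

With a single location parameter the path rises monotonically around t/T = γ 1 ; with two location parameters it dips between them; and a very large κ makes the transition practically indistinguishable from an abrupt break.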

Estimation Results
In this section, we evaluate the performance of the proposed estimators for the VAR coefficients. We compare our estimators with those proposed by Bai and Ng (2006) jt,l ] 2 , where M is the number of replications, and p is the lag order. In addition, we evaluate the estimators using the sum of squared residuals (SSR) and the RMSE of the variance estimate, defined as follows: SSR , where {ε (l)  1t } t∈ [T],l∈ [M] is the residual corresponding to ε 1t at the lth replication. We define the SSR as the average of the squared residuals, which can be regarded as an estimator of the variance of the error term. Recall that ε 1t ∼ iidN(0, 1). Hence, the closer the SSR is to 1, the better the estimation result. RMSE σ measures the RMSE of the estimated variance of the error term.
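The evaluation criteria described above can be sketched as follows. This is a minimal sketch that assumes the RMSE averages the squared estimation errors of the (1, 1)st element over replications, lags, and time points, that the SSR is the average squared residual per replication averaged across replications, and that RMSE σ is taken around the true error variance of 1. The array layouts and function names are hypothetical.

```python
import numpy as np


def rmse_coef(est, truth):
    """RMSE of coefficient estimates.

    est   : (M, p, T) array, estimates over M replications, p lags, T periods.
    truth : (p, T) array of true coefficient paths.
    ASSUMPTION: squared errors are averaged over all three dimensions.
    """
    return float(np.sqrt(np.mean((est - truth[None, :, :]) ** 2)))


def ssr_and_rmse_sigma(resid, sigma2=1.0):
    """Average squared residual and RMSE of the implied variance estimates.

    resid  : (M, T) array of residuals, one row per replication.
    sigma2 : true error variance (equal to 1 for standard normal errors).
    """
    s2 = np.mean(resid ** 2, axis=1)              # variance estimate per replication
    return float(s2.mean()), float(np.sqrt(np.mean((s2 - sigma2) ** 2)))


# Hypothetical inputs: estimates off by 0.1 everywhere, residuals equal to 1.
truth = np.zeros((1, 50))
est = np.full((20, 1, 50), 0.1)
rmse = rmse_coef(est, truth)
ssr, rmse_sigma = ssr_and_rmse_sigma(np.ones((20, 50)))
```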
We set the lag order p to be the true value. We also assess the performance of our BIC-type information criterion given by (3.1) with ρ T = log(T)/T. The results show that our IC works fairly well for all the DGPs under investigation. To save space, we relegate the detailed results to the supplementary materials.
Table 1 reports the results for the VAR estimators of Bai and Ng (2006) and ours with iid error terms based on 1000 replications. To save space, we only report the results of RMSE and RMSE σ here and relegate the results of SSR to the supplementary materials. As shown in the table, the RMSEs of our estimators generally decline as T increases. We note that the convergence rate of the VAR coefficients relies on T rather than N. Hence, it is reasonable that the RMSEs may not decline as N increases. Bai and Ng's (2006) estimator outperforms our estimator under DGP 1, which describes a time-invariant FAVAR model. However, it is not as good as our estimator under the other DGPs. In particular, the RMSEs of Bai and Ng's (2006) estimators generally do not decrease as T grows, indicating that their estimators are inconsistent because they ignore the TV features in the factor loadings and/or VAR coefficients. We also note that the factor loadings and/or VAR coefficients exhibit abrupt structural breaks under DGPs 3, 5, 7, and 8. Although the theoretical result given by Theorem 3.1 is only applicable to smooth changes in the factor loadings and/or VAR coefficients, the simulation results show that our estimator still outperforms Bai and Ng's (2006) when the underlying DGP admits abrupt structural breaks.

We extend Stock and Watson's (2009) dataset on the U.S. macroeconomic variables. By excluding some discontinuous series, we get N = 101 quarterly time series spanning 1960:I to 2019:IV. Note that the first two quarters are discarded when calculating the first- and second-order differences. We get a total of T = 238 quarterly observations. All the series have been standardized to have zero mean and unit variance. We use them to extract the common factors. For more details on the data description, one can refer to Stock and Watson (2009) and Su and Wang (2017). For the VAR part, we focus on the following seven key economic variables: the real GDP index (RGDP), personal consumption expenditures (PCEC), industrial production index (IP), GDP implicit price deflator (GDPDEF), total unit labor cost for manufacturing (LCM), unemployment rate (UR), and Federal funds effective rate (FedR). These variables cover various aspects of the macroeconomic fundamentals, including economic condition (RGDP, PCEC, IP), price level (GDPDEF), labor market (LCM, UR), and monetary policy (FedR). All these variables are transformed as suggested by Stock and Watson (2009).
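The standardization and factor extraction steps described above can be sketched as follows. This is a minimal sketch of the conventional time-invariant PCA with the normalization F′F/T = I R , not the local PCA used for the TV-FAVAR model; the simulated panel and the function names are hypothetical, with dimensions chosen to match T = 238 and N = 101.

```python
import numpy as np


def standardize(X):
    """Rescale each column (series) to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def pca_factors(X, R):
    """Conventional PCA estimates of R factors and their loadings.

    X : (T, N) standardized panel.
    Returns F (T, R) with F'F / T = I_R and loadings L (N, R).
    """
    T, N = X.shape
    w, v = np.linalg.eigh(X @ X.T / (N * T))      # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:R]                 # indices of the R largest
    F = np.sqrt(T) * v[:, idx]
    L = X.T @ F / T
    return F, L


rng = np.random.default_rng(1)
T, N, R = 238, 101, 3
# Hypothetical panel generated from a three-factor structure plus noise.
panel = rng.standard_normal((T, R)) @ rng.standard_normal((R, N)) \
    + rng.standard_normal((T, N))
X = standardize(panel)
F_hat, L_hat = pca_factors(X, R)
```

Working with the T × T matrix XX′ is convenient here because T and N are of similar size; the estimated factors are identified only up to a rotation.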
We use each of these seven variables combined with the estimated common factors to construct the FAVAR model. We first determine the number of common factors and the lag order of the FAVAR model. The maximum number of common factors is set to be 9, while the maximum lag order is set to be 4. Other settings, including the kernel function and bandwidths, are the same as in the simulation studies. Su and Wang's (2017) local information criterion IC h2 chooses three common factors. We also adopt Bai and Ng's (2002) information criteria PC p1 , PC p2 , IC p1 , and IC p2 . The estimated number of factors by PC p1 is 7, while the other three information criteria choose 6 common factors. This is consistent with the fact that when the factor loadings have structural changes, Bai and Ng's (2002) information criteria tend to overestimate the number of common factors. Since Su and Wang's (2017) local information criterion is valid when the factor loadings are subject to structural changes, we pick the number of common factors as suggested by Su and Wang (2017). According to the IC given by (3.1), the optimal lag order is 1 for all these targeted variables. See the supplementary materials for details.
Using the constructed dataset, we first test for structural change in the FAVAR(1) model. Table 4 reports the p-values based on 1000 bootstrap resamples for the seven key economic variables. We note that the joint test SM 1 rejects the null hypothesis at the 5% significance level for all variables except the GDP deflator, for which it rejects at the 10% significance level. We further explore the sources of rejection. The test SM 2 significantly rejects the constant parameter factor models for all seven variables under investigation, indicating that the source of rejection for SM 1 is the TV factor structure. As mentioned above, if the factor loadings exhibit structural changes, it is inappropriate to assume the VAR coefficients to be constant. Thus, we should adopt the TV-FAVAR model regardless of the test SM 3 , which is meaningless under TV factor loadings. However, the test SM 3S is still informative. It shows that the top left part of the VAR coefficients is TV for PCEC, GDPDEF, and FedR at the 10% significance level.

NOTE: (i) The entries under "BN06" and "TV-FAVAR" report the out-of-sample MSFEs for Bai and Ng's (2006, BN06) and the proposed TV-FAVAR models, respectively. The bold entries highlight the better performance in each case. (ii) The entries under "Ratio" report the ratios of the MSFEs of the local smoothing estimation to those of Bai and Ng's (2006) estimation. (iii) "z-test" and "p-value" denote Diebold and Mariano's (1995) z-statistics and the corresponding p-values.
For each of the seven target series, we compare the one-step-ahead out-of-sample forecasting performance of our TV-FAVAR model with Bai and Ng's (2006) diffusion index model. We use the sample before 2010:I to estimate the model and conduct one-step-ahead out-of-sample forecasting recursively. In addition, we use the test for predictive ability proposed by Diebold and Mariano (1995, DM) to check the statistical significance of differences in MSFE. See Section S5.3 of the supplementary materials for a detailed discussion of the predictive ability tests. We acknowledge that a better approach is to account for the estimation error as in West (1996), but one needs to extend the latter test to the nonparametric framework first. We leave this for future research. Table 5 reports the MSFE for the seven target series of interest. We observe that our TV-FAVAR model outperforms Bai and Ng's (2006) time-invariant FAVAR model for five out of the seven target series, and the DM z-statistics are significant for two of the five series.
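The forecast comparison can be sketched as follows. This is a minimal sketch of the MSFE ratio and a Diebold and Mariano (1995) z-statistic, assuming squared-error loss, one-step-ahead forecasts (so no serial-correlation correction is applied to the loss differential), and a two-sided normal p-value. The simulated forecast errors and the function names are hypothetical and are not taken from Table 5.

```python
import math

import numpy as np


def msfe(errors):
    """Mean squared forecast error."""
    return float(np.mean(np.asarray(errors, dtype=float) ** 2))


def dm_test(e_bench, e_model):
    """Diebold-Mariano z-statistic and two-sided p-value.

    e_bench : forecast errors of the benchmark model.
    e_model : forecast errors of the competing model.
    A positive z means the benchmark has the larger squared-error loss.
    ASSUMPTION: one-step-ahead forecasts, so the loss differential is
    treated as serially uncorrelated.
    """
    d = np.asarray(e_bench, dtype=float) ** 2 - np.asarray(e_model, dtype=float) ** 2
    n = d.size
    z = d.mean() / math.sqrt(d.var(ddof=1) / n)
    p = math.erfc(abs(z) / math.sqrt(2.0))        # equals 2 * (1 - Phi(|z|))
    return float(z), float(p)


rng = np.random.default_rng(2)
e_bn06 = 2.0 * rng.standard_normal(200)           # hypothetical benchmark errors
e_tv = rng.standard_normal(200)                   # hypothetical TV-FAVAR errors
ratio = msfe(e_tv) / msfe(e_bn06)                 # below 1 favors the TV model
z_stat, p_val = dm_test(e_bn06, e_tv)
```

A ratio below one together with a significantly positive z-statistic would indicate that the competing model forecasts better than the benchmark under squared-error loss.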

Conclusion
FAVAR models have been widely used in macroeconomic analysis and have drawn great attention in the literature. The conventional FAVAR model assumes that both the factor loadings and the VAR coefficients are time-invariant over a long time span, which is quite restrictive and unrealistic. In this article, we introduce a TV-FAVAR model where both the factor loadings and the VAR coefficients are allowed to change smoothly over time. We propose a two-stage procedure to consistently estimate the TV factor loadings and VAR coefficients and establish the estimators' limiting distributions under the standard large N and large T framework. In addition, we propose three test statistics to gauge the possible sources of TV behavior in the FAVAR model. These test statistics are constructed by measuring the squared L 2 -distances between two sets of estimators. Monte Carlo studies demonstrate that our estimators and tests perform well. In an application to the U.S. macroeconomic dataset, we find overwhelming evidence of structural changes in the FAVAR model and show that our TV-FAVAR model outperforms the time-invariant FAVAR models in predicting several macroeconomic time series of interest.
and μ min (•) denote the sth largest and the smallest eigenvalues of a real symmetric matrix, respectively. For an m × n functional matrix $A(\tau) = [A_{ij}(\tau)]_{i=1,\ldots,m;\, j=1,\ldots,n}$, we denote $d^c A(\tau) \equiv [d^c A_{ij}(\tau)/(d\tau)^c]$ as the cth order element-wise derivative of A(τ). We use B > 0 to denote that B is positive definite. Let $P_A \equiv A(A'A)^{+}A'$ and $M_A \equiv I_m - P_A$, where $I_m$ denotes an m × m identity matrix, and $B^{+}$ denotes the generalized inverse of a square matrix B. For a positive integer N, we let [N] ≡ {1, 2, . . ., N}. For a positive number a, $\lfloor a \rfloor$ denotes the integer part of a. The operators " P →", " d

(a,b) jt for (a, b) = (1, 2), (2, 1), and

and ρ T T → ∞. When ρ T = log(T)/T, the above IC reduces to the commonly used BIC. Let $\hat{p} = \arg\min_p IC(p)$. It is standard to show that $\hat{p}$ consistently estimates p.

N→
i=1 |σ e,ij | γ 0 ≤ C, (ii) T −1

denotes weak convergence under the bootstrap probability measure conditional on the observed sample {X it , Y t } i∈[N], t∈[T] . The first side condition in Theorem 4.3, viz., max 1≤i≤N N i=1 .

As mentioned above, the estimated VAR coefficients ψ(a,b) jt are not comparable with the true values φ (a,b) jt except for (a, b) = (1, 1). Hence, we evaluate the accuracy of our estimators for the (1, 1)st element of the VAR coefficients φ jt using the root mean squared error (

T}. Let W t ≡ Y t , F t H t and Ŵt = (Y t , F t ) , both of which are (K + R) × 1. Let Z t ≡ (W t−1 , . . ., W t−p ) and Ẑt ≡ ( Ŵ t−1 , . . ., Ŵ t−p ) be (K + R)p × 1 vectors and let t ≡ (ψ 1t , . . ., ψ pt ) be a (K + R)p × (K + R) matrix. For r ∈ [T], we consider the local least squares

there is no problem of allowing for the TV coefficients in the VAR model when testing H

conduct the conventional time-invariant PCA to obtain the bootstrap versions { F * t , λ * i0 } of { Ft , λi0 }; run the restricted model W * t = 0 Zt + U † t to obtain the bootstrap version estimates ˜ * 0 ; run X * it on F * t to obtain the local constant estimate λ * it with the same kernel and bandwidth as used to obtain λit ; and run W * t on Zt to obtain the local constant estimate ˘ * t with the same kernel and bandwidth as used to obtain ˘ t . Calculate the bootstrap test statistics SM * l , the bootstrap versions of SM l for l ∈ [3].

4. Repeat Steps 2 and 3 for B times and index the bootstrap test statistics as { SM * l,b } B b=1 for l ∈ [3]. The bootstrapped p-values are calculated by p * l

Table 1. Performance of estimation in terms of RMSE and RMSE σ . The main entries report the values of RMSE and RMSE σ based on 1000 replications. The bold entries highlight the better performance in each case.

Table 2. Performance of the prediction in terms of MSFE. The main entries report the MSFE based on 500 replications. The bold entries highlight the better performance in each case.

Table 3. Empirical rejection rates of the proposed tests (iid errors).
NOTE: The main entries report the empirical rejection rates based on 500 iterations.

Table 4. p-values for the tests of structural changes. NOTE: The main entries report the p-values based on 1000 bootstrap iterations. The bold entries indicate rejection at the 5% significance level.