A General Regression Changepoint Test for Time Series Data

ABSTRACT This article develops a test for a single changepoint in a general setting that allows for correlated time series regression errors, a seasonal cycle, time-varying regression factors, and covariate information. Within, a changepoint statistic is constructed from likelihood ratio principles and its asymptotic distribution is derived. The asymptotic distribution of the changepoint statistic is shown to be invariant of the seasonal cycle and the covariates should the latter obey some simple limit laws; however, the limit distribution depends on any time-varying factors. A new test based on ARMA residuals is developed and is shown to have favorable properties with finite samples. Driving our work is a changepoint analysis of the Mauna Loa record of monthly carbon dioxide concentrations. This series has a pronounced seasonal cycle, a nonlinear trend, heavily correlated regression errors, and covariate information in the form of climate oscillations. In the end, we find a prominent changepoint in the early 1990s, often attributed to the eruption of Mount Pinatubo, which cannot be explained by covariates. Supplementary materials for this article are available online.


Introduction
Undocumented changepoint tests are important diagnostic tools in many scientific endeavors. In climatology, for example, undocumented changepoint techniques can identify unrecorded times of weather station moves and gauge changes. Ignoring changepoint effects can result in spurious trend conclusions (Thorne et al. 2005;. Here, we develop a test for a single changepoint (the at most one or AMOC case) that allows for a variety of features that the practitioner may encounter: trends, seasonality, covariates, and correlated regression errors.
Our basic linear regression model is data and factors are observed for the time indices t = 1, . . . , n ( denotes matrix transpose). The components of this regression model are described as follows. The components {x t } and {w t } allow for a long-term temporal trend. All trend factors that could have a changepoint are aggregated into {x t }, while {w t } is posited to be changepoint free over the observation times. The component {s t } governs a deterministic seasonal cycle of period T and has dimension p s . Changepoints in the regression coefficients for the seasonal cycle are not dealt with in the main article; however, our supplementary material considers cases where the seasonal cycle is allowed to shift. The period T is taken as known; cases where T must be estimated arise and are considerably more involved (Martin and Kedem 1993;Lii and Rosenblatt 2006;Wang, Chen, and Huang 2006 The changepoint component δ t is a length-p x vector that quantifies changes in the coefficients of x t with structure: Here, c is the unknown changepoint time and is a p xdimensional vector denoting the magnitude of the shift change. We want to test H 0 : = 0 (the null hypothesis of no change) against H 1 : = 0. At least one element (but not necessarily all) of must be nonzero under the alternative.
The regression design is described as follows. First, x t and w t are deterministic points sampled from continuous functions. It is convenient to scale times to lie in [0,1]; this will not alter the changepoint conclusions for polynomial trend but may change estimated regression coefficients. As such, we take x t = ( f 1 (t/n), . . . , f p x (t/n)) and w t = ( f p x +1 (t/n), . . . , f p x +p w (t/n)) , where f j (·) is a continuous function defined on [0, 1] for each j. The vector s t contains deterministic periodic design points for each time t with s t = s t+T . As a mean is present in the model, we impose T t=1 s t = 0 for parameter identifiability. Finally, {v t } contains random design points, assumed stationary in time t, where E[v t ] = 0 and var(v t ) = for t = 1, . . . , n. Nonstationary {v t } could be considered when E[v t ] is a linear combination of the remaining regression factors (see Section 6 for details). Parameter changes in (1) are restricted to coefficients in the trend factors, which is the typical case of interest. While such parsimony improves changepoint detection power, the supplementary material consider cases where the seasonal cycle and covariates change.
The following assumptions are imposed upon the predictor sequences: Assumption 1. Each component of the random covariate sequence {v t } satisfies a functional central limit theorem (the conditions of Lemma A.1 in the appendix).
Assumption 2. The functions f 1 through f p x +p w are continuous and differentiable over the set C ⊂ [0, 1] where C = {z ∈ [0, 1] : k = nz is an admissible changepoint time}. We also impose that f 2 j > 0 over the set C for j = 1, . . . , p x . Considerations for the set of admissible changepoint times are given later.
Assumption 3. Let χ t = (x t , w t , s t , v t ) . The matrix n −1 n t=1 χ t χ t is invertible for each n ≥ 1 with probability 1 in that it has a probability limit with a minimum eigenvalue that is bounded away from zero.
The above assumptions are mild. For technical conditions with the errors, we impose that { t } is a zero mean stationary causal time series satisfying where {Z t } is zero mean and independent and identically distributed (IID) with variance σ 2 and a finite (2 + η)th moment for some η > 0. We posit the geometrically decaying structure |ψ j | ≤ κr − j for all j ≥ 0 for some finite κ and r > 1. All causal autoregressive moving-average time series (ARMA) admit such a representation if their errors can be assumed IID (Brockwell and Davis 1991). A quantity that will arise later is the long-run variance From the causal linear process representation assumed in (3), the covariance γ (h) = cov( t+h , t ) is absolutely summable over all lags h (this is the so-called short memory property) and For estimation purposes, parametric models for { t } are useful. It will be convenient to assume that the errors obey the ARMA(p ar , q ma ) difference equation for {Z t } as described above. Here, φ i and θ j denote ARMA coefficients for a causal and invertible ARMA model. Changepoint asymptotic theory has seen a rich mathematical history. We credit Page (1954Page ( , 1955 with developing the area. Much of the original asymptotic theory is rigorized in MacNeill (1974). Monographs on the topic include Csörgő and Horváth (1997), which presents a comprehensive overview of changepoint limit theorems, and Brodsky and Darkhovsky (1993), which gives a thorough treatment of nonparametric methods for changepoint detection. Further, Bayesian methods, wherein a prior distribution in placed upon the time of the change, have become popular for changepoint analysis (Smith 1975;Carlin, Gelfand, and Smith 1992;Elsner, Niu, and Jagger 2004). Alternatively, frequentist approaches are used to estimate the time of an undocumented changepoint (Bai 1997;Jandhyala and Fotopoulos 1999;Perron and Zhu 2005) and to test the level of statistical significance of the changepoint (which is the focus here). Hypothesis tests for an unknown changepoint are frequently performed by maximizing a changepoint test statistic across a set of admissible changepoint times. These "maximally selected" test statistics are commonly shown to be asymptotically equivalent to the supremum of a Gaussian process or a related quadratic form (Andrews 1993;Bai 1993;Robbins et al. 2011a); further approximations are procured through extreme value distributions (Gombay and Horváth 1990;Horváth 1993), which are known to observe particularly slow rates of convergence. Improved critical values can be obtained through resampling procedures (Kirch 2007;Hušková et al. 2008); such techniques also have utility in settings involving autocorrelation (which may slow convergence further).
While studies considering seasonal effects in changepoint modeling are limited, Aue, Horváth, and Hušková (2012) considered structural changes in a model with a general regression structure and seasonal predictors. To illuminate the differences between our work and theirs, we model autocorrelation using an autoregressive moving average (ARMA) equation, while they allow for autocorrelation via a Bartlett-based variance estimate of a statistic (we employ a similar approach in the development of a competitor to our preferred method). Aue, Horváth, and Hušková (2012) used extreme value asymptotics to provide the limiting distribution of the maximally selected Gaussian likelihood ratio test statistic without truncating the set of admissible changepoint times. To avoid the need for extreme value laws, we truncate edge boundaries and establish a supremum of a Gaussian process limit law. This (in combination with the use of an ARMA model) obviates the need for bootstrapping techniques to obtain critical values, a tool often used in settings with autocorrelated data. However, unlike Aue, Horváth, and Hušková (2012), the asymptotic distribution of our test statistics (under H 0 ) is influenced by the structure of the functional form predictors (i.e., {x t } and {w t }). Lastly, we note that Aue, Horváth, and Hušková (2012) impose that all regression parameters change under the alternative hypotheses (although they suggested that their procedures can be modified to relax this restriction). In contrast, this article focuses explicitly on the case that only some regression parameters are impacted by the change. Our interest lies with developing a changepoint test for a setting that simultaneously allows for time-varying factors, seasonality, covariates, and strong autocorrelation; none of the extant procedures incorporate all of these features. Driving this development is our analysis of the monthly Mauna Loa carbon dioxide series, which is plotted in Figure 1. The Mauna Loa series has a clear nonlinear trend and a pronounced seasonal cycle. Later, the series is shown to be strongly correlated in time. Even moderate levels of positive autocorrelation can drastically degrade changepoint tests ). Lund and Reeves (2002) found a changepoint circa 1990 in this series that is often attributed to the eruption of Mount Pinatubo in 1991. We reexamine this conclusion here with an additional decade of data and covariate information. In particular, we want to discern whether or not El Nino effects (our covariate) could explain the detected changepoint. Pinpointing volcanic eruptions as events that induce atmospheric cooling (here, reduction of carbon dioxide levels) is an important issue in modern climatology (Robock and Mao 1995;Ammann et al. 2003;Gao, Robock, and Ammann 2008).
We develop methods that are applicable in the setting described above. Our preferred technique (which differs from most extant tools) is based on ARMA residuals and is shown to perform favorably in practical circumstances, including those where regression errors have strong autocorrelation, when asymptotic approximations are used; as such, we do not need to pursue resampling techniques. The methods here apply to the at most one changepoint (AMOC) case. While any AMOC changepoint procedure can be used with subsegmentation techniques to make multiple changepoint inferences (Menne andWilliams Jr. 2005, 2009;Robbins et al. 2011b, are climate examples), caution should be exercised in such pursuits. Elaborating, if a prominent seasonal mean cycle exists, a multiple changepoint analysis could "track the seasonal mean cycle, " erroneously concluding that many changepoints exist. This is a situation to avoid.
The rest of this article proceeds as follows. The next section establishes notation and reports our basic results. A short section thereafter applies the results to five examples of envisioned utility. A simulation section then studies the asymptotic results in finite sample size settings. The Mauna Loa data are analyzed in detail in Section 5. Proofs are delegated to an Appendix; supplementary material presents theory for linear models in addition to results for generalizations of the model that allow for seasonal cycle and covariate shifts.

Methodology
We begin with statistics designed for IID errors; these statistics will prove useful in time series settings.

Tests Assuming IID Errors
Momentarily, assume that the changepoint time is known and occurs at time k and that the errors { t } are IID. Let k denote the ordinary least-square (OLS) estimator of and var( k ) its variance/covariance matrix. The Wald statistic for a test of = 0 is Wald-type statistics commonly underpin changepoint methods (Andrews 1993;Bai 1997). The statistic F k has little utility due to unknown quantities in its variance matrix. To derive approximations for this variance matrix, note that it has the form where C k is a p x × p x dimensional known matrix that may depend on k and τ 2 ≡ var( t ) is computed assuming IID errors. Good approximations of var( k ) include the null hypothesis estimate var( k ) =τ 2 C −1 k , whereτ 2 is the OLS estimate of τ 2 under H 0 , and an alternative hypothesis estimator, var( k ) = τ 2 k C −1 k , whereτ 2 k is the OLS estimate of τ 2 when a changepoint occurs at time k.
Versions of (6) that use these approximations are and is the number of regression parameters in the null hypothesis model. F k is equivalent to the classic F and Gaussian likelihood ratio tests.
When the changepoint time is unknown, we consider the largest F k and F k : where 0 < < h < 1. The set of admissible changepoints is truncated to ensure that the statistics in (9) are asymptotically bounded. The values of and h are typically chosen to be near zero and unity, respectively, so as to make most changepoint times admissible. We prefer = 1 − h = 0.05, which seems to provide adequate coverage of changepoint times while yielding sufficient temporal distance from the endpoints in most settings (although other choices of and h are considered in Section 5). We suggest that smaller values of and 1 − h be used only in very large samples. Both estimators of τ are known to be consistent in that under H 0 , which implies that F and F have the same asymptotic distribution. Our discourse will focus on F in favor of F for computational convenience.
Although F k has an F distribution for finite Gaussian samples, the small sample distributions of F and F are, for practical purposes, intractable. One could simulate null hypothesis percentiles; however, such a simulation would need to be done for each application as the null hypothesis distribution can depend on time series parameters (in the time series setting) in addition to the sample size. Alternatively, asymptotic theory offers a practical and effective tool for quantifying the sampling distribution of complex changepoint statistics (see Csörgő and Horváth 1997;Gallagher, Lund, and Robbins 2012, among others); this is the approach taken herein.
To extract the asymptotic distribution of F, we begin by expressing it (in addition to k and its variance matrix) as functions of the regression errors/residuals. First, the null hypothesis Here, {ˆ t } are called OLS residuals. Define One can show that var(N k ) = τ 2 C k and that k = −C −1 k N k . These formulas yield Sinceτ P −→ τ as n → ∞, quantification of the large sample behavior of N k and C k across values of k will yield the limit distribution of F. In addition to its comparative simplicity, (12) is more convenient than (8) for recursive computation across changepoint times. Proceeding further requires additional notation. Recall that the time t design vector is χ t = (x t , w t , s t , v t ) and denote the null hypothesis design matrix via M 0 = (χ 1 , . . . , χ n ) , which has dimension n × p tot . Let for t = 1, . . . , n, which is a matrix of dimension n × p x whose last n − t rows contain zeros. Letting M k = X n − X k , the full design matrix under the alternative can be represented via M a = (M 0 , M k ). Within the supplementary material to this article, it is shown that where The vector of null hypothesis OLS residuals can be expressed as MacNeill (1989, 1991), Tang and MacNeill (1993), Antoch, Hušková, and Prášková (1997), and Bischoff (1998) (among others) described large sample properties of residuals-based processes akin to {N k }; discussion of the limiting distribution of {N k } in our setting first requires the introduction of some quantities that arise therein. While these distributions will depend on (x t , w t ) = ( f 1 (t/n), . . . , f p x +p w (t/n)), they will be invariant of s t and v t . The limiting distributions will involve a Brownian motion whereḟ denotes the derivative of f. For intuition on where these quantities arise, Lemma A.1 in the appendix states that as n → ∞. Here and throughout, ⇒ denotes weak convergence in the space of right-continuous functions with left-hand limits, Additionally, let G(z) contain the first p x columns of G(z) and let g(z) be a p x × p x matrix containing the upper-left square of G(z). Our limiting results will involve the deterministic quantity and the random multivariate Gaussian process The large sample behavior of {C k } and {N k } under the null hypothesis is quantified in the following lemma, which is proven in the appendix.
Lemma 2.1. Assume that the data {Y t } obey the model in (1), that the errors { t } are IID with variance τ 2 , and that H 0 and Assumptions 1-3 hold. Then as n → ∞ The result below now follows directly from the above lemma.
for z ∈ [0, 1]. If the conditions of Lemma 2.1 hold andτ 2 is a consistent estimator of τ 2 , then where D −→ is used to denote convergence in distribution.
Formula (10) implies that F has the same limit distribution as F. A noteworthy consequence is that while {x t } and {w t } influence the limit distribution, {s t } and {v t } do not.

Tests for Time Series Errors
The results of the previous subsection will now be used to handle autocorrelated errors { t }. In general, asymptotic procedures for correlated data that ignore correlation aspects can perform poorly.
A consistent estimator of τ 2 under H 0 is needed. No assumptions beyond (3) are needed for such an estimate. Specifically, the Bartlett-based estimatê (21) could be used. Here, q n is a bandwidth that diverges to infinity as n → ∞, but the divergence is slow enough to ensure that |τ 2 B − τ | = o p (1). A typical recommendation takes q n = n 1/3 . Unfortunately,τ 2 B can be inefficient. A parametric alternative assumes that { t } obeys the ARMA(p ar , q ma ) formulation of (5). One may then calculate {Ẑ t }, a sequence of ARMA residuals, via the following computationally convenient scheme: where {ˆ t } are OLS residuals andφ i , i = 1, . . . , p ar andθ j , j = 1, . . . , q ma are consistent ARMA parameters calculated under H 0 . In practice, the recursion in (22) is started usingˆ t = 0 =Ẑ t for t ≤ 0. When the observed data obey (1) but with ARMA errors, standard ARMA fitting procedures when applied to the OLS residuals will yield consistent estimates of the ARMA coefficients (Fuller 2009) which is the estimated spectral density of { t } at frequency zero. When { t } is stationary but not IID, the representation in (3) implies that { t } obeys a functional central limit theorem. Thereby, Lemma A.1 implies that the convergence in (16) still holds. Hence, the conclusions in Lemma 2.1 applies, and we have the following result.
Corollary 2.1. Suppose that the assumptions in Lemma 2.1 hold, with the amendment that { t } obeys a causal and invertible ARMA recursion. Then ifτ 2 is any consistent estimator of τ 2 , as n → ∞, Partial sum sequences can converge slowly (to Gaussian processes) when the errors are strongly correlated. Hence, the F statistic can perform poorly for even moderate sample sizes. It is desirable to construct a changepoint statistic based on ARMA residuals in the vein of (9). First, we establish a relationship between the partial sums of OLS and ARMA residuals.
The proof is in the appendix and is motivated by Lemma 1 of Robbins et al. (2011a). Now let x tẐt , and note that Lemma 2.2 implies an asymptotic equivalence between N k /τ and R k /σ . Replacing N k /τ with R k /σ in (12), we obtain a residuals-based statistic where C k is calculated via (14). A consequence of Lemma 2.2 is that Although the finite sample performance of methods presented here is explored for a variety of alternatives in Section 4, limit theory under H 1 is omitted for brevity. Andrews (1993), Hansen (2000), and Aue, Horváth, and Hušková (2012) studied related changepoint tests and illustrated that when H 1 is true, power increases with increasing n and/or increasing magnitudes of departure from H 0 ; we expect similar results to hold for the methods described herein.

Examples
This section presents five examples where the methods appear useful. While polynomial trends are written as functions of t, conclusions are for models where time is scaled by t/n. Further, the error series { t } may or may not be autocorrelated; the presence of such correlation does not influence the large sample behavior of our test statistics.

Example 1. Perhaps the simplest regression model with seasonal features is
The objective is to assess whether or not α 0 shifts. A seasonal cycle confounds the inference. This setting models a monthly temperature time series with a possible undocumented station location move or gauge change. Wang, Wen, and Wu (2007) and Lund et al. (2007) studied this case. In such applications, { t } often has moderate autocorrelation.
Calculations (1). The limiting distribution in (24) and (26) in this case has form where {B(z)} z=1 z=0 is the classical Brownian bridge process defined pointwise by B(z) = W (z) − zW (1). Percentiles of this limit distribution are well known (see Csörgő and Horváth 1997;Gallagher, Lund, and Robbins 2013, e.g.). Notice that the seasonal cycle does not enter into the asymptotics.
Example 2. A more complex scenario allows for a linear trend and a covariate: Changepoints in correlated series with linear trends is a frequently studied issue in a multitude of disciplines, including economics (e.g., Perron 1989;Perron and Zhu 2005), where changepoints are often considered as alternatives to models with a unit root. In this example, inference issues are aided by a covariate {v t }. This situation arises in the analyses of an annual temperature series that has a trend induced by climate change; the covariate {v t } could be a record of temperatures from a nearby location, and the changepoint may be induced by a station location move. In these settings, correlation in { t } is needed.
Suppose first that only α 0 is allowed to change at the changepoint time. In this case, The limit distribution in Corollary 2.1 and Theorem 2.2 can be shown to have form where {B(z)} z=1 z=0 is again a Brownian bridge. Now suppose that both α 0 and α 1 are allowed to change at the changepoint time. In this case, .
Further computations can link { (z)} z=1 z=0 to the Brownian bridge via Notice that the limit distribution does not depend on {v t }.
Example 3. Now consider the Mauna Loa series. A simple model appropriate for an annualized version of this data without a covariate is For this series, autocorrelations in { t } are large and positive. This model is considered in Ben-David and Papell (1997) and Beaulieu, Chen, and Sarmiento (2012), among others. If only α 0 is allowed to change, the limiting processes can be derived via (28) and (29) in Example 4.
Issues become more complex for the monthly Mauna Loa series with a covariate and seasonal cycle. Here, our model is and all three parameters α 0 , α 1 , and α 2 may change at the changepoint time. In this case, the structure of the limiting distribution in (26) is unwieldy to write down; however, one can always simulate null hypothesis percentiles from the Theorem 2.2 representation. The limiting distribution does not depend on the covariate or the seasonal cycle.
Example 4. This example examines the general case where only the intercept parameter changes. Specifically, the regression is where f w (t/n) = ( f 2 (t/n), . . . , f p w +1 (t/n)) and only α 0 is allowed to change. This setting has been addressed by several authors through examination of regression residuals (Brown, Durbin, and Evans 1975;Gombay and Horváth 1994). MacNeill (1978) derived the limit distribution of the partial sums sequence of OLS residuals, which is can extract a Brownian bridge representation. Specifically, and whereḟ w is the derivative of f w . The representation in (29) is derived using integration by parts and therefore assumes that f w is continuously differentiable over [0, 1]. Adding a covariate or seasonal cycle into (27) does not change the limiting distribution.
Example 5. Finally, consider a situation where the outcome and/or a stochastic covariate has a "nuisance" changepoint (intervention) at the known time m ∈ {2, . . . , n}. Our underlying regression model is where { t } is zero mean stationary random noise and I A is the indicator of the event A. The univariate stochastic covariate {v t } is also posited to have an intervention shift at time m: where {u t } is a stationary zero mean error sequence. The βI {t≤m} term is included in (30) because of the intervention. We want to test whether the parameter μ changes at an unknown time c, where c = m. Here, x t = 1 and w t = I {t≤m} . The intervention time m is a function of n-call it m(n)that satisfies m(n)/n → ω ∈ (0, 1) as n → ∞. Furthermore, f 1 (u) ≡ 1 and f 2 (z) = I {z≤ω} . The above theory is applicable if time ω is avoided; specifically, Assumption 2 holds since f 2 is continuous on [0,1] except at ω. Without loss of generality, the covariate {v t } can be treated as being stationary since the model in (30) is equivalent to one of form where μ * = μ + ζ μ v , and β * = β + ζ v . Section 6 discusses this aspect further.

A Simulation Study
This section studies the finite sample efficacy of our methods via simulation. We isolate to the four models listed in Table 1. The first model, which represents Example 1 above, contains an intercept (which is allowed to change under H 1 ) and a sinusoidal seasonal cycle. The second model, representing the vein of Example 2, has a linear trend (both parameters are allowed to change under H 1 ) and an ergodic covariate {v t }. The third model has a quadratic trend only, and only the coefficient corresponding to the intercept term can change under H 1 . The fourth model is structurally the most complex and will be used in our Mauna Loa application. It has a quadratic trend (each coefficient of which is allowed to change under H 1 ), a sinusoidal seasonal cycle, and an ergodic covariate {v t }. The covariate is a synthetically generated version of the ENSO covariate in Section 5 (a synthetic version is used so that any sample size can be studied). We do not expect regression coefficient values to heavily impact simulated Type I and Type II errors. Hence, for realism in the fourth model, we take (α 0 , α 1 , α 2 ) as the values estimated under the H 0 column of Table 7; moreover, we take β = (1.125, 2.559) and ζ = −0.0139. Since each predictor in the first three models also arises in the fourth model, regression coefficients in the first three models are set to those values used in the fourth model.
To use the aforementioned changepoint tests, the null hypothesis percentiles of each limiting distribution must be computed for each model. As mentioned above, in Example 3.1, z=0 is a scaled and squared Brownian Bridge. This process was discussed by Csörgő and Horváth (1997) and Robbins et al. (2011a), and percentiles of the process are given in Table  2 of Gallagher, Lund, and Robbins (2013). Table 2 contains null hypothesis percentiles for Models 2, 3, and 4. The limit behavior of Model 2 is also analyzed in Kim and Siegmund (1989).
For each regression structure, we consider four ARMA error models of varying complexity: AR(1), MA(1), AR(3), and ARMA(2,2)-the ARMA coefficients are listed in Table 3. We expect ARMA parameters to influence Type I and Type II error rates. These models are constructed using a single parameter ρ that places a root in the autoregressive polynomial at strategic places, thereby enabling us to vary error autocorrelation strength. For example, in Error Model 1, ρ = φ 1 is used. The ARMA error variance σ 2 = 0.09511 was used in all simulations; this is the value estimated in our Mauna Loa application.
The tests studied in this section are as follows. First, F τ is a version of F calculated from a known τ 2 . Second, F B and F A denote versions of F calculated by estimating τ 2 from (21) and (23), respectively. Third, the ARMA residuals-based statistic L of (26) is computed. Unless otherwise noted, q n = n 1/3 was used to calculateτ 2 B . The Type I error α = 0.05 and = 1 − h = 0.05 were used in all tests, as were the critical values in Table 2. Table . Models used in our simulation study.
Regression Model   -(cos(2πt/12), sin(2πt/12)) - Table . Percentiles (γ ) of the limiting distribution for various values of h = 1 − for Regression Models , , and  (where x t = f x (t/n) and w t = f w (t/n)). Quantities are based on ,, independent generations of the respective limiting process. Order Structure To begin, Type I error probabilities were empirically aggregated from 50,000 independent simulations. Here, the series length n = 1000 and values of ρ listed in Table 4 were used. The results show that the ARMA residuals L test has a fairly stable Type I error, whereas the other tests perform poorly in certain settings. The tests tend to be conservative, which is in line with standard observations for similar changepoint problems (Csörgő and Horváth 1997; Robbins et al. 2011a). Figure 2 further investigates the effects of finite sample sizes. The left-hand graph shows Type I error probabilities as a function of ρ = φ 1 under Regression Model 4 and Error Model 1 when n = 1000. The results show the stability of the ARMA residuals-based test and the erratic behavior of the other tests as |φ 1 | approaches a nonstationary value of unity. Tests that do not use the ARMA residuals cannot seemingly be fixed by estimating τ : even the F τ test is ill behaved. The right-hand graph plots estimated Type I errors as a function of n, which provides feel for the convergence properties of the tests.
Turning to power, we begin by imposing the mean shift = 1/10, where 1 is a vector of ones with the same dimension as x t . Table 5 provides simulated powers for various n, ρ, c, and error models. The results show that the tests are comparable in Table . Simulated Type I error probabilities with n = 1000 for various regression and error models. Entries are based on , independent simulated series for each entry.   power; if one test appears substantially more powerful, it is likely attributed to an inflated Type I error probability for the respective setting.
To further evaluate power, we simulate series from Regression Model 4 with Error Model 1 when = d , where the value of is taken from Table 7 in Section 5. Figure 3 shows estimated powers as a function of d for various ρ. The results show that L is more powerful than F B for values of ρ for which F B has an acceptable Type I error.

Mauna LOA Data Analysis
This section applies our methods to the carbon dioxide measurements taken at the Mauna Loa Observatory on the Big Island in Hawaii. There are n = 662 monthly observations, spanning March 1958 to April 2013. The null hypothesis model for the amount of carbon dioxide measured at time t is γ 1, j cos 2π jt 12 + γ 2, j sin 2π jt 12 where v t is the El Niño/Southern Oscillation (ENSO) index at time t. The ENSO value at the same season 1 year ago is usedthis covariate has demonstrated predictive power in other climate studies. All terms in the quadratic trend are allowed to change under H 1 , that is, x t = (1, (t/n), (t/n) 2 ) and w t is null. Regression Model 4 from Section 4 is a version of the above (with  fewer seasonal components). Pilot computations indicate that the seasonal cycle can be described reasonably well with four harmonics.
The data are plotted in Figure 4 along with the fitted null hypothesis model. The data seem poorly described during some time intervals, implying that a changepoint model may be useful. The bottom panel in Figure 4 shows the OLS residuals. The residuals appear to contain structural breaks and strong positive autocorrelation. The Akaike information criterion (AIC) model selection criterion selects an AR(14) model for the OLS residuals. Since (i) AIC tends to overparameterize and (ii) a simple AR(2) model had almost the same AIC score, we model the OLS residuals as AR(2). Estimated coefficients areφ 1 = 0.702 and φ 2 = 0.216.
Next, the F B , F A , and L statistics were calculated. Table 6 shows the results. Various values of p ar (which is needed in the F A and L tests) and q n (which is used in the F B test) are provided for feel. The F B and F A p-values seem to depend highly on q n and  The L test provides convincing evidence of a significant changepoint. The simulations in the previous section give credence that L makes trustworthy decisions with strongly autocorrelated errors. The estimated changepoint time is July 1991 as was obtained in each and every test. This breakpoint time seems reasonable in view of Figure 4. The changepoint time is very close to the June 15, 1991 eruption of Mount Pinatubo in the Philippines, which was the second largest volcanic eruption during the 20th century. Again, climatologists suspect that volcanic eruptions significantly influence carbon dioxide and temperatures (Ammann et al. 2003) via sulfur dioxide.
Finally, we address how the Pinatubo eruption influenced carbon dioxide concentrations. Figure 5 shows the fitted quadratic model and OLS residuals that allow for the 1991 changepoint. The improved fit is evident. Here, a simple AR(1) model withφ 1 = 0.6915 provided a good fit to the OLS residuals when the changepoint is included. Table 7 lists quadratic trend estimates under both H 0 and H 1 . The seasonal cycle and ENSO covariate regression coefficients seem largely uninformative and are hence not provided (recall that these components are not allowed to change under H 1 ). Overall, the changepoint has acted to slow the rate of increase in carbon dioxide concentrations.

Discussion
Prior to concluding the article, we offer some discussion of generalizations and extensions of our methods. Primarily, we focus on the inclusion of covariate sequences that are not stationary; we also consider ways to further generalize the model in (1). If E[v t ] lies in the linear span of the other factors describing {Y t }, one can assume that E[v t ] = 0. To see this, suppose that {u t } is a latent random sequence that satisfies the assumptions imposed upon {v t } in Section 1, whereas the observed covariate sequence is generated via , w t , s t ) . For any vectors a and b of appropriate length, whereã = a + Hb. Consequentially, the vector space spanned by the columns of M 0 is equivalent to the vector space spanned by a version of M 0 with columns corresponding to {u t } replaced by {v t }. It follows that a model that uses v t in place of u t will yield the same predicted values of Y t and the same OLS residuals as the original model. Therefore, the two models yield the same Wald-type changepoint statistics. Similar arguments show that time-varying predictors do not always need to be written as functions of t/n. For example, if f j (x) = x j , then f j (t/n) = (t/n) j = t j /n j . Replacing the column of M 0 for f j (·) with t j preserves the column space of M 0 .
As a final point on nonstationary covariates, it is feasible for stochastic covariates to have discontinuities in their mean (e.g., Example 5). One could include an indicator factor as a functional form predictor for {Y t } as in Example 5 if this shift time is known, although it may not be desirable to do so. If a stochastic predictor has discontinuities in its mean sequence that are not in the linear span of the other predictor variables in the null hypothesis model for {Y t }, our suggestion is to first homogenize {v t }. The theory outlined in this article holds for homogenized (stationary and ergodic) {v t }.
Inclusion of a discontinuous {v t } in the H 0 model for {Y t } will only confound inferences regarding the fundamental objective: detection of shifts in the trend function underlying {Y t }. Nonetheless, the supplementary material considers the limiting behavior of our Wald-type changepoint statistic when a discontinuous stochastic covariate is included in the regression for {Y t }. While some of our results break down, examples are given where a limit distribution is encountered that fuses the limits in Examples 1 and 5. Here, one may need to develop a resampling procedure to obtain appropriate critical values. In this case, the limiting distribution of our test statistic depends on the statistical properties of {v t }.
Further generalizations to our changepoint model are possible. The supplementary material also examines our changepoint statistic in settings where the regression coefficients governing the seasonal cycle are allowed to shift. There, we show that some of the mathematical simplifications in this article do not always carry over to these settings.
In conclusion, we have introduced a test for a changepoint in a general regression time series model, which unifies model characteristics that are frequently encountered in practice: timevarying trend, seasonality, stochastic covariates, and correlated errors. Through modeling the error process via an ARMA recursion and then extracting ARMA residuals, a novel test statistic is constructed that offers vastly improved performance on finite samples. the functional form predictors x t and w t ). Furthermore, let χ * t = (s t , v t ) and M * 0 = ( χ * 1 , . . . , χ * n ) (so that M 0 = ( M 0 , M * 0 )). It follows from (A.1) and Lemma A.2 that n −1 M 0 M * 0 = O p (n −1/2 ).
Using standard formulas for partitioned (block) matrix inversion, (A.1), and Lemma A.2, we have Likewise, where max ≤ k n ≤h ||A k || ∞ = O p (n −1/2 ). From the above two formulas, it follows that where max ≤ k n ≤h ||B k || ∞ = O p (n −1/2 ). Let P 0 denote the orthogonal projection matrix of M 0 . From (A.2) and (A.3), n −1 C k = n −1 X k X k − n −1 X k P 0 X k = X k X k /n − X k P 0 X k /n + D k where max ≤ k n ≤h ||D k || ∞ = O p (n −1 ). From results such as X k X k /n = 1 n k t=1 x t x t → g(z), (A.4) where k = nz (which is used in what follows) with n → ∞, the convergence in (19) is evident. Under the null hypothesis, it follows from basic linear models theory that the (H 0 ) residuals vector has representation (I − P 0 )Y = (I − P 0 ) , so that n −1/2 N k = n −1/2 X k − n −1/2 X k P 0 .
(A.5) Lemma A.1 and an application of the Cramér-Wold device now give which is the convergence in (16).

Supplementary Materials
The supplementary materials to this article include illustration of the linear model theory used to derive several quantities needed in calculation of our test statistics. A pair of generalizations of the model in (1) is also included: we consider changepoint alternatives that allow the coefficients governing the seasonal regressors to change, and we examine settings where the stochastic predictors are discontinuous.