Changepoint Detection in Heteroscedastic Random Coefficient Autoregressive Models

Abstract We propose a family of CUSUM-based statistics to detect the presence of changepoints in the deterministic part of the autoregressive parameter in a Random Coefficient Autoregressive (RCA) sequence. Our tests can be applied irrespective of whether the sequence is stationary or not, and no prior knowledge of stationarity or lack thereof is required. Similarly, our tests can be applied even when the error term and the stochastic part of the autoregressive coefficient are non iid, covering the cases of conditional volatility and shifts in the variance, again without requiring any prior knowledge as to the presence or type thereof. In order to ensure the ability to detect breaks at sample endpoints, we propose weighted CUSUM statistics, deriving the asymptotics for virtually all possible weighting schemes, including the standardized CUSUM process (for which we derive a Darling-Erdős theorem) and even heavier weights (so-called Rényi statistics). Simulations show that our procedures work very well in finite samples. We complement our theory with an application to several financial time series.


Introduction
In this article we study the stability of the autoregressive parameter of a (possibly nonstationary and/or heteroscedastic) RCA(1) sequence

y_i = (β_0 + ε_{i,1}) y_{i-1} + ε_{i,2}, 1 ≤ i ≤ N. (1.1)

We test the null hypothesis of no change,

H_0: β_0 does not change over 1 ≤ i ≤ N, (1.2)

versus the alternative of at most one change (AMOC),

H_A: there is a time 1 ≤ k* < N such that the autoregressive parameter equals β_0 for i ≤ k* and β_A ≠ β_0 for i > k*. (1.3)

The RCA model was first studied by Anděl (1976) and Nicholls and Quinn (2012); arguably, (1.1) is very flexible, allowing the autoregressive "root" β_0 + ε_{i,1} to vary over time, and thus for the possibility of having stationary and nonstationary regimes. Further, (1.1) allows for conditional heteroscedasticity in y_i; Tsay (1987) shows that the widely popular ARCH model (Engle 1982) can be cast into (1.1), which can therefore be viewed as a second-order equivalent. Finally, as we also discuss below, a major advantage of (1.1) compared to standard autoregressive models is that it is possible to construct estimators of β_0 that are always asymptotically normal, irrespective of whether y_i is stationary or nonstationary.
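To fix ideas, the data-generating process (1.1) can be sketched in a few lines; the Gaussian innovations and the standard deviations `s1` and `s2` below are illustrative choices, not the paper's specification:

```python
import numpy as np

def simulate_rca1(n, beta0, s1=0.1, s2=1.0, burn=1000, seed=0):
    """Simulate an RCA(1) path y_i = (beta0 + eps_{i,1}) * y_{i-1} + eps_{i,2}.

    s1, s2 are illustrative standard deviations of iid Gaussian innovations;
    a burn-in of `burn` observations is discarded, as in the paper's simulations.
    """
    rng = np.random.default_rng(seed)
    e1 = rng.normal(0.0, s1, n + burn)  # stochastic part of the AR coefficient
    e2 = rng.normal(0.0, s2, n + burn)  # additive error term
    y = np.zeros(n + burn + 1)
    for i in range(1, n + burn + 1):
        y[i] = (beta0 + e1[i - 1]) * y[i - 1] + e2[i - 1]
    return y[burn + 1:]
```

With beta0 = 0.5 the path is stationary (E ln|β_0 + ε_{0,1}| < 0), while beta0 around 1.05 yields explosive behavior; both regimes are covered by the tests developed below.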
Given such generality and flexibility, the RCA model has been used in many applied sciences; see Nicholls and Quinn (2012) and Regis, Serra, and van den Heuvel (2022) for examples. The inferential theory for (1.1) has been studied extensively. Schick (1996), Koul and Schick (1996) and Hwang and Basawa (2005), inter alia, study WLS estimation of β_0; Berkes, Horváth, and Ling (2009) and Aue and Horváth (2011) study Quasi Maximum Likelihood estimation, and Hill, Li, and Peng (2016) develop an Empirical Likelihood estimator. In contrast, whilst changepoint detection in time series is very well-studied in general (see the reviews by Perron 2006, and Casini and Perron 2019), it is still under-explored in the RCA framework. To the best of our knowledge, the only exceptions are Lee (1998), Lee et al. (2003), and Aue (2004); in these contributions, a test based on the unweighted CUSUM process is proposed, but only for the stationary case.1

Contribution of this article
In this article, we develop a family of tests for breaks in β_0 in (1.1). We base our tests on maximally selected, weighted CUSUM statistics built on the WLS estimator (see Kim and Perron 2009, for an insightful analysis of the optimality of max-type statistics). Our paper makes the following three contributions.
First, and foremost, all our results hold irrespective of whether y_i is stationary or not. This is a consequence of using the WLS estimator, which, when applied to the RCA model, has been shown to be asymptotically normal even when y_i is nonstationary by Hwang and Basawa (2005) (see also Aue and Horváth 2011, and Hill and Peng 2014). In this respect, our work is, in principle, related to the literature on "uniform" inference (i.e., inference where asymptotic normality is valid irrespective of stationarity or not): although a comprehensive review of this literature goes beyond the scope of our article, our results can be read in conjunction with the theory developed by Chan, Li, and Peng (2012) and Hwang, Basawa, and Kim (2006) (see also Li, Phillips, and Gao 2016, for kernel regression). From a practical point of view, this entails that our tests can be applied with no modifications required, and no prior knowledge of the stationarity of y_i or lack thereof. Having a procedure to detect breaks also under nonstationarity is bound to be useful in a wide variety of applications; we refer, for example, to a recent contribution by Andersen and Varneskov (2022), and the references therein, for a discussion. In particular, with our tests it is possible to detect changes from stationary to nonstationary/explosive behavior (as, e.g., in Horváth et al. 2020), and it is also possible to test for changes from explosive to nonexplosive (or "less explosive") behavior. Being able to accommodate both cases is a distinctive advantage of combining the RCA model with the WLS estimator, thus exploiting the fact that asymptotic normality holds irrespective of whether y_i is stationary or not. Hence, a possible application of our tests is to detect the beginning/end of a bubble; interestingly, the dynamics of the bubble component of asset prices, when present, can be modelled as an explosive RCA process (see in particular equation (17) in Diba and Grossman 1988). Tests for bubble episodes based on a standard, linear AR model have been proposed in several works (see, inter alia, Phillips, Wu, and Yu 2011; Homm and Breitung 2012; Phillips, Shi, and Yu 2015a; and Astill et al. 2021). In particular, Phillips, Shi, and Yu (2015a) and Phillips, Shi, and Yu (2015b) propose a methodology to consistently date collapsing bubbles, based on a first-crossing principle. However, a formal test would require developing the asymptotics under explosiveness in an AR model; this is a highly nontrivial task, with test statistics bound to depend on several nuisance parameters (see, e.g., Harvey, Leybourne, and Whitehouse 2022). This may explain why, whilst tests for changes toward explosive behavior have been developed in the literature, tests to detect changes away from explosive behavior are rarer. Our tests, on the other hand, can be applied in both situations.
Second, we allow for the simultaneous presence of conditional and unconditional heteroscedasticity in ε_{i,1} and ε_{i,2}. This is usually not considered in the RCA framework, where, to the best of our knowledge, all contributions assume iid innovations. In the wider context of time series, the case of non iid innovations has received significant attention in the literature: the contributions by Kuersteiner (2002) and Zhu (2019), among many others, study estimation of autoregressive models in the presence of conditional heteroscedasticity (see also Linton and Xiao 2019, for nonparametric estimation); the case of unconditional heteroscedasticity, where the variance of the innovations is subject to shifts over time, has also been investigated (see Xu and Phillips 2008, and the references therein). Allowing for non iid observations is also relevant when testing for changes in the persistence of a time series: tests under conditional heteroscedasticity are studied, for example, in Chan, Yau, and Zhang (2014) and in Boldea, Cornea-Madeira, and Hall (2019), who develop a bootstrap test which is valid under both conditional and unconditional heteroscedasticity. In the case of unconditional heteroscedasticity, Cavaliere and Taylor (2006), and Xu (2015) study break detection in the presence of volatility shifts (see also Astill et al. 2021). This issue is also investigated, through the bootstrap, in Kejriwal, Yu, and Perron (2020), whose empirical illustration makes a strong case in favor of changepoint detection methodologies which account for heteroscedasticity; and in Perron, Yamamoto, and Zhou (2020) and Perron and Yamamoto (2022), who analyze the power properties of several existing changepoint testing methodologies designed to account for heteroscedasticity. In this contribution, we accommodate a very general form of serial dependence in the innovations (including ARMA and ARCH/GARCH processes), showing that its only impact is on the variance of the weak limit of our test statistics, and proposing a "universal" estimator thereof; we also allow for the presence of shifts in the variances, irrespective of whether y_i is stationary or not. Our approach can be described using the words of Perron and Yamamoto (2022): "[...] first estimate the residuals assuming the null hypothesis of no change in the regression coefficients; second, use the residuals to approximate the heteroscedastic asymptotic distribution [...]". With this approach, we are able to control the size asymptotically, and again our test statistics can be used from the outset, with no prior knowledge as to whether ε_{i,1}, or ε_{i,2}, or both, is (conditionally and/or unconditionally) heteroscedastic.
Third, we study virtually all possible weighting schemes for the CUSUM process, considering weight functions of the form [t(1 − t)]^κ with 0 ≤ κ < ∞, for t ∈ [0, 1]. Weighted, untrimmed CUSUM statistics are designed to be better at detecting changepoints occurring at the beginning/end of the sample (Csörgő and Horváth 1997). In particular, the case κ = 1/2 corresponds to the standardized CUSUM process also proposed by Andrews (1993), whereas the more heavily weighted case κ > 1/2 corresponds to a family of test statistics known as "Rényi statistics" (see Horváth, Miller, and Rice 2020). From a practical viewpoint, this entails that our test statistics can detect breaks even when these are close to the sample endpoints. As we further show, using the weighted version of the CUSUM has the added bonus that, in the presence of multiple changepoints, the number of changes can be estimated consistently using binary segmentation, whereas using unweighted statistics can overstate it (see Vostrikova 1981; Horváth and Rice 2021).
The remainder of the article is organized as follows. We present our test statistics in Section 2, and study their asymptotics in the iid case, as a benchmark, in Section 3.1. The non iid case is studied in Sections 3.2.1 and 3.2.2, and we extend our theory to allow for deterministics in Section 3.3. We investigate the behavior of our test statistics under the alternative, and study estimation of the number of changepoints, in Section 4. In Section 5, we report a simulation exercise; an empirical illustration is in Section 6. Section 7 concludes. Further extensions (including the RCA(p) case, and a discussion of how to make our tests robust to shifts in the mean), lemmas and proofs are relegated to the supplementary materials.
NOTATION. We use the following notation: "D→" for weak convergence; "P→" for convergence in probability; "a.s." for "almost surely"; "D=" for equality in distribution; ⌊·⌋ denotes the integer part. Positive, finite constants are denoted as c_0, c_1, ..., and their values may change from line to line. Other notation is introduced further in the paper.

The Test Statistics
Our approach is based on comparing the estimates of β_0 before and after each point in time k, by dividing the data into two subsets at k and estimating the autoregressive parameter in both subsamples. The WLS estimators in the two subsamples are given by 2

β̂_{k,1} = ( Σ_{i=2}^{k} y_i y_{i-1} / (1 + y_{i-1}^2) ) / ( Σ_{i=2}^{k} y_{i-1}^2 / (1 + y_{i-1}^2) ),
β̂_{k,2} = ( Σ_{i=k+1}^{N} y_i y_{i-1} / (1 + y_{i-1}^2) ) / ( Σ_{i=k+1}^{N} y_{i-1}^2 / (1 + y_{i-1}^2) ).

Our test statistics will be functionals of the process

Q_N(t) = ( ⌊Nt⌋ (N − ⌊Nt⌋) / N^{3/2} ) ( β̂_{⌊Nt⌋,1} − β̂_{⌊Nt⌋,2} ), 0 < t < 1. (2.1)

A "natural" choice to detect the presence of a possible change is to use the sup-norm of (2.1), viz. sup_{0<t<1} |Q_N(t)|, but, as mentioned above, this choice may have low power in detecting changes which occur early or late in the sample. In order to enhance the power at sample endpoints, one can use weight functions:

sup_{0<t<1} |Q_N(t)| / w(t). (2.2)

Assumption 2.1. It holds that: (i) inf_{δ≤t≤1−δ} w(t) > 0 for all 0 < δ < 1/2; (ii) w(t) is nondecreasing in a neighborhood of 0; (iii) w(t) is nonincreasing in a neighborhood of 1.

The functions w(t) satisfying Assumption 2.1 belong to a very wide class; a necessary and sufficient condition for the existence of the limit of (2.2) is that the integral

I(w, c) = ∫_0^1 ( t(1 − t) )^{-1} exp( −c w^2(t) / ( t(1 − t) ) ) dt

is finite for some c > 0. In the estimators above, we use 1/(1 + y_{i-1}^2) as weights. The rationale for this choice is based on considering the "error term" ε_{i,1} y_{i-1} + ε_{i,2}, whose (conditional) variance is var_{0,1} y_{i-1}^2 + var_{0,2}, whence the weights containing the y_{i-1}^2 term. Koul and Schick (1996) show that the WLS estimator is first-order equivalent to the MLE. The use of the weight function 1/(1 + y_{i-1}^2) goes back to Schick (1996), where it is also shown that efficiency is attained when using var_{0,1} and var_{0,2}, or consistent estimators thereof. However, in our context we also allow for nonstationarity, and in this case var_{0,2} cannot be estimated consistently (see, e.g., Lemma A10 in Horváth and Trapani 2019).
In order to further enhance the power of our testing procedures, functions which place more weight at the sample endpoints can also be used, that is,

w(t) = [t(1 − t)]^κ, (2.4)

with κ ≥ 1/2. In particular, when κ = 1/2, the corresponding limit theorems will be of the Darling-Erdős type (Darling and Erdős 1956).
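The construction above can be sketched as follows; the normalization of Q_N and the trimming are a plausible reading of (2.1)-(2.2), not the paper's verbatim formulas:

```python
import numpy as np

def wls_beta(y_prev, y_curr):
    # WLS estimator with weights 1/(1 + y_{i-1}^2), as discussed in Section 2
    w = 1.0 / (1.0 + y_prev**2)
    return np.sum(w * y_curr * y_prev) / np.sum(w * y_prev**2)

def weighted_cusum(y, kappa=0.0, trim=2):
    """Maximally selected weighted CUSUM: sup_t |Q_N(t)| / [t(1-t)]^kappa.

    `trim` drops a few boundary points so both subsample estimators exist;
    kappa = 0 gives the unweighted statistic, kappa > 0 the weighted one.
    """
    n = len(y) - 1  # number of (y_{i-1}, y_i) pairs
    stats = []
    for k in range(trim, n - trim):
        b1 = wls_beta(y[:k], y[1:k + 1])       # estimate on the first k pairs
        b2 = wls_beta(y[k:n], y[k + 1:n + 1])  # estimate on the remaining pairs
        t = k / n
        q = (k * (n - k) / n**1.5) * (b1 - b2)
        stats.append(abs(q) / (t * (1 - t))**kappa)
    return max(stats)
```

Setting kappa = 1/2 yields the standardized (Darling-Erdős) variant, and kappa > 1/2, together with heavier trimming, the Rényi variant.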

Changepoint Testing in the RCA Framework
In this section, we begin by presenting our main results for the iid case; albeit only of theoretical interest, the results in Section 3.1 illustrate how our test statistics are not affected by stationarity or the lack thereof. The case of heteroscedastic innovations is studied in Section 3.2. We begin by considering the case of conditional heteroscedasticity first, in Section 3.2.1; we then extend these results to also accommodate unconditional heteroscedasticity in Section 3.2.2. Presenting results by relaxing one assumption at a time should make the presentation more transparent and easier to follow. However, as far as implementation is concerned, the most comprehensive case is the one studied in Section 3.2.2, where both conditional and unconditional heteroscedasticity are considered simultaneously.

Testing for Changepoints with iid Innovations
We begin by considering the baseline case of iid errors {ε_{i,1}, ε_{i,2}, −∞ < i < ∞}, and we show our first main theoretical result: the limiting distributions of the weighted CUSUM statistics are the same irrespective of whether y_i is stationary or not.
As we will show, the variance of the weak limit of Q_N(t) depends on whether y_i is stationary or not. Let ȳ_i denote the unique, nonanticipative stationary solution of

ȳ_i = (β_0 + ε_{i,1}) ȳ_{i-1} + ε_{i,2}, −∞ < i < ∞. (3.1)

Then the variance of the weak limit of Q_N(t) is given by the expression η^2 in (3.2), where a^2 = E[ ȳ_0^2 / (1 + ȳ_0^2) ]. In order to study the case κ > 1/2, as in Horváth, Miller, and Rice (2020), we need to trim a few values of Q_N(t), say r_1(N) and r_2(N), from the beginning and the end of the sample, respectively, and we define the trimmed functionals in (3.3). We start with the stationary case −∞ ≤ E ln|β_0 + ε_{0,1}| < 0. In this case, the solution of (1.1) under the null hypothesis is close to ȳ_i defined in (3.1). The following (technical) assumption is required: the first part ensures that the denominator of η^2 defined in (3.2) is nonzero with probability 1; the second part is needed when E ln|β_0 + ε_{0,1}| ≥ 0.

Theorem 3.1. We assume that H_0 of (1.2), and Assumptions 2.1, 3.1, and 3.3 are satisfied. Then parts (i)-(iii) hold, where r_N, γ_1 and γ_2 are defined in (3.3), and a_1(κ) and a_2(κ) are independent copies of a(κ) = sup_{1≤t<∞} t^{−κ} |W(t)|.
Theorem 3.1 states that the limiting distributions of the weighted CUSUM statistics are the same irrespective of whether y_i is stationary, explosive, or at the boundary: the impact of nonstationarity is only on η^2. Note that, in the nonstationary case, ε_{i,2} does not play any role in the asymptotics of Q_N(t).

Changepoint Detection in Non iid Environments
In the previous section we assumed, as is typical in the RCA literature, that the innovations {ε_{i,1}, ε_{i,2}, 1 ≤ i ≤ N} are iid, which may be an undesirable restriction. In this section, we consider two extensions: we begin with the case of {ε_{i,1}, ε_{i,2}, 1 ≤ i ≤ N} being dependent and conditionally heteroscedastic, and we then extend our theory to the case where {ε_{i,1}, ε_{i,2}, 1 ≤ i ≤ N} can also be unconditionally heteroscedastic.

Changepoint Detection in the Presence of Serial Dependence
We consider the following definition of dependence, based on decomposable Bernoulli shifts. Since the seminal work by Wu (2005), decomposable Bernoulli shifts have proven a very convenient way to model and study dependent time series, due to their generality and to the fact that it is easier to verify whether a sequence forms a decomposable Bernoulli shift than, for example, to verify mixing conditions. Hörmann (2009) shows that virtually all of the most common DGPs used in econometrics and statistics satisfy this condition, including ARMA models, ARCH/GARCH sequences, and many nonlinear time series models.
Let ȳ_i denote, as above, the stationary solution of (1.1) under the null hypothesis when −∞ ≤ E ln|β_0 + ε_{0,1}| < 0, and define η^2 as in (3.4).

Theorem 3.2. We assume that H_0 of (1.2), and Assumptions 2.1, 3.3, and 3.4 are satisfied, and that E ln|β_0 + ε_{0,1}| ≠ 0. (i) The weighted CUSUM converges weakly, where {B(t), 0 ≤ t ≤ 1} is a standard Brownian bridge and η is defined in (3.4). (ii) For all x, the corresponding Darling-Erdős limit holds. (iii) If Assumption 3.2 also holds, then, for all κ > 1/2, the corresponding Rényi limit holds.

Theorem 3.2 states that the limiting distribution of the weighted CUSUM statistics is the same as in Theorem 3.1: the presence of serial dependence is entirely absorbed by the variance of the weak limit of Q_N(t), η^2, defined in (3.4). Technically, we now need to rule out the boundary case E ln|β_0 + ε_{0,1}| = 0. This is because we would need an exact (and large enough) rate of divergence for y_i as i → ∞, and such a result is not available in the case E ln|β_0 + ε_{0,1}| = 0 (see also Theorem 4 in Horváth and Trapani 2019, and the discussion thereafter).
Since the impact of stationarity/nonstationarity, and of serial dependence, falls only on η^2, we propose the weighted-sum-of-covariances estimator η̂_N^2 defined in (3.5)-(3.6), which is consistent in all cases, with a bandwidth of order O(N^δ), for some δ > 0, in (3.6). Then, the results of Theorem 3.2 remain true if η is replaced with η̂_N.
Corollary 3.1 states that the feasible versions of our test statistics, based on η̂_N^2, have the same limiting distribution as the infeasible ones, based on η^2 (note that the case of iid innovations is also encompassed by Corollary 3.1). Practically, this means that the test statistics developed above can be implemented with no prior knowledge as to whether y_i is stationary or not, and under a very general form of serial dependence.
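A Bartlett-kernel version of such a weighted-sum-of-covariances estimator can be sketched as follows; the kernel choice and the exact weights in (3.6) are assumptions for illustration, while the bandwidth H = N^{1/4} follows the choice reported in Section 5:

```python
import numpy as np

def lrv_bartlett(z, H=None):
    """Long-run variance estimator for a residual sequence z_i:
    a weighted sum of sample autocovariances with Bartlett weights
    1 - h/(H+1), which guarantees a nonnegative estimate."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()
    n = len(z)
    if H is None:
        H = int(n**0.25)  # bandwidth used in the paper's simulations
    s = np.dot(z, z) / n  # lag-0 autocovariance
    for h in range(1, H + 1):
        gamma_h = np.dot(z[h:], z[:-h]) / n
        s += 2.0 * (1.0 - h / (H + 1)) * gamma_h
    return s
```

For iid residuals the estimator is close to the ordinary variance; under serial dependence the covariance terms pick up the correction that η^2 requires.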

Changepoint Detection in the Presence of Shifts in Variance
We extend the results above to allow for unconditional heteroscedasticity, in both ε_{i,1} and ε_{i,2}. Heteroscedasticity arising from shifts in the variances of the innovations is particularly interesting, and challenging, in the RCA case: if the distribution of ε_{i,1} is allowed to change, the observations might change from stationarity to nonstationarity even if β_0 does not change.
We assume that the innovations ε_{i,1} and ε_{i,2} are piecewise homoscedastic, and that changes in their distribution occur at times m_1 < m_2 < ... < m_M. Henceforth, we will use the notation m_0 = 0, m_{M+1} = N, τ_0 = 0 and τ_{M+1} = 1. For each subsequence {y_i, m_{ℓ-1} < i ≤ m_ℓ}, 1 ≤ ℓ ≤ M + 1, the condition for stationarity can be satisfied; in this case, the elements of this subsequence can be approximated with stationary variables {ȳ_{ℓ,j}, −∞ < j < ∞} defined by a recursion analogous to (3.1), where the ε̄_{ℓ,j,1}'s are independent and identically distributed copies of ε_{m_ℓ,1}. The ε̄_{ℓ,j,2}'s are defined in the same way.
To allow for changes in the variances of the errors, we replace Assumptions 3.1-3.3 with Assumption 3.6. By Assumption 3.6, the WLS estimator may have different variances in the various regimes. In order to study the limit theory, consider the quantities defined, for 1 ≤ ℓ ≤ M + 1, in (3.8) and (3.9). With the quantities defined above, we can now define the zero mean Gaussian process Γ(t). This process will replace the Wiener processes in the weak limits of the weighted CUSUM functionals.
We begin by investigating how Theorem 3.1 changes in the presence of shifts in variance.
Theorem 3.3. We assume that H_0 of (1.2), Assumptions 2.1, 3.5, and 3.6 hold, and that E ln|β_0 + ε_{0,1}| ≠ 0. (i) The weighted CUSUM converges weakly to a Gaussian limit with covariance kernel η_0(t, s). (ii) For all x, the corresponding Darling-Erdős limit holds. (iii) If Assumption 3.2 is also satisfied, then, for all κ > 1/2, the corresponding Rényi limit holds.

Theorem 3.3 is only of theoretical interest, but it shows that, in the case of shifts in variance, the impact on the variance of the weak limit of Q_N(t) is more complicated than in the previous cases. This is especially evident in part (i) of the theorem: in that case, the limiting distribution of the weighted Q_N(t) is given by a Gaussian process with covariance kernel η_0(t, s). Such a process may be viewed as the "bridge" of the Gaussian (but non-Wiener) process Γ(t) defined above.
Parts (ii)-(iii) of the theorem are the same as in the case of homoscedasticity. Upon inspecting the proof, in these cases the asymptotic distribution is driven only by the observations which are within o(N) of the sample endpoints, and (3.10) ensures that the asymptotic variance η_0(t, t) is proportional to t(1 − t) on such intervals. Finally, as in the previous cases, note that heteroscedasticity in ε_{i,2} does not play a role in the nonstationary case.
By Theorem 3.3, the implementation of tests based on Q_N(t) requires an estimate of η_0(t, t). However, this is fraught with difficulties, since it requires knowledge of the dates of the different regimes, m_ℓ. Thus, we consider a modification of Q_N(t) to reflect the possible changes in the variances of the errors, based on the quantities c_{N,1}(t) and c_{N,2}(t). We then define the modified test statistic Q̃_N(t) in (3.12). Under the null of no change, the same arguments as in the proof of Corollary 3.1 guarantee that c_{N,1}(t) and c_{N,2}(t) converge to the functions c_1(t) and c_2(t) = c_1(1) − c_1(t), for 0 ≤ t ≤ 1, where τ_ℓ is defined in Assumption 3.5 and a_ℓ, 1 ≤ ℓ ≤ M + 1, is defined in (3.9). In order to present our main results, we define the zero mean Gaussian process Λ(t) in (3.15), with η_j^2 defined in (3.8). The process Λ(t) replaces the Wiener processes in the weak limits of the weighted CUSUM functionals studied in the homoscedastic case (and it replaces Γ(t) defined above). Hence, we define the "bridge" version of Λ(t) in (3.16).

Theorem 3.4. We assume that H_0 of (1.2), Assumptions 2.1, 3.5, and 3.6 hold, and that E ln|β_0 + ε_{0,1}| ≠ 0. (i) The weighted CUSUM converges weakly, where {Λ(t), 0 ≤ t ≤ 1} is the Gaussian process defined in (3.15). (ii) For all x, the corresponding Darling-Erdős limit holds. (iii) If Assumption 3.2 also holds, then, for all κ > 1/2, the corresponding Rényi limit holds.

As in the previous theorem, and for the same reason, parts (ii)-(iii) of the theorem are the same as in the case of homoscedasticity. However, from a practical point of view, in order to use parts (ii) and (iii) of the theorem an estimate of g(t, t) is required. To this end, we use c_{N,1}(t) defined in (3.11) instead of c_1(t); we estimate b(t, s) as in (3.17), where z_i is defined in (3.7). Then we can define the estimator ĝ(t, t) in (3.18). The implementation of part (i) of Theorem 3.4 is more complicated, since the nuisance parameters are not relegated to a multiplicative function. We reject the null in (1.2) if the statistic in (3.19) exceeds a simulated critical value. Computing the covariance functions, one can verify that the limit admits a representation in terms of a Wiener process {W(x), 0 ≤ x < ∞}. In order to approximate the critical values, one can simulate independent Wiener processes W_i(x), 1 ≤ i ≤ L, and
compute the empirical distribution function of the corresponding functionals.

Corollary 3.2. We assume that H_0 of (1.2) holds. Under the same assumptions as Theorem 3.4(i), the simulated critical values are asymptotically valid as min(N, L) → ∞. The results of Theorem 3.4(ii)-(iii) remain true if g(t, t) is replaced with ĝ(t, t).
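The simulation approach of Corollary 3.2 can be illustrated on the simplest possible functional, the sup of a Brownian bridge; the actual limit process in Theorem 3.4(i) is more involved, so this is only a sketch of the mechanics (simulate L paths on a grid, take an empirical quantile):

```python
import numpy as np

def brownian_bridge_sup_quantile(n_grid=1000, L=2000, level=0.95, seed=0):
    """Approximate the `level` quantile of sup_{0<t<1} |B(t)| by Monte Carlo:
    simulate L discretized Wiener paths, turn each into a bridge, record
    the sup, and read off the empirical quantile."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_grid
    t = np.arange(1, n_grid + 1) * dt
    sups = np.empty(L)
    for i in range(L):
        w = np.cumsum(rng.normal(0.0, np.sqrt(dt), n_grid))  # Wiener path
        bridge = w - t * w[-1]                               # tie down at t = 1
        sups[i] = np.abs(bridge).max()
    return float(np.quantile(sups, level))
```

For the paper's heteroscedastic limit, the bridge would be replaced by the simulated process built from the estimated variance profile, but the quantile step is identical.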

RCA Models with Deterministics
Equation (1.1) does not contain an intercept and therefore, at least in the case of stationarity, it assumes that the observations have zero mean. Although the literature typically does not consider the presence of deterministics in the context of an RCA model (the paper by Hill and Peng 2014 is an exception, albeit only for the stationary case), allowing for a nonzero constant in (1.1) may be desirable in several empirical applications; a leading example could be testing for changes in the persistence of inflation (see, e.g., Benati and Kapetanios 2003, and the references therein). Hence, we discuss how to modify our tests in the presence of a nonzero intercept in (1.1), viz.
y_i = μ + (β_0 + ε_{i,1}) y_{i-1} + ε_{i,2}, 1 ≤ i ≤ N. (3.20)

Our theory can be applied even in this case, by suitably modifying the WLS estimator of β_0 (and by modifying the estimator of the long-run variance accordingly). To save space, we report the expressions of the partial-sample WLS estimators of μ and of β_0 in Section A.2 in the supplementary materials; see Equations (A.2)-(A.3) and (A.4)-(A.5), respectively. In the context of (3.20), we use the functional Q_{μ,N}(t) defined in (3.21). We show that the weak limit of Q_{μ,N}(t) has variance η_μ^2, given in (3.22), where a_{μ,2} and a_{μ,1,1}, a_{μ,1,2} are defined in Equations (E.64) and (E.65)-(E.66) in the supplementary materials, respectively. Whilst η_μ^2 is quite different from η^2 in the case of stationarity, it is exactly the same under nonstationarity. In essence, this is a consequence of the fact that the WLS estimator of β_0, in the explosive regime, is asymptotically the same as in the case of no constant.
We modify Assumption 3.3(i) to ensure that the denominators of β̂_{k,1} and β̂_{k,2} are nonzero with probability 1 (see also Assumption 2 in Horváth and Trapani 2019).
Theorem 3.5. We assume that Assumptions 2.1, 3.1, 3.3(ii), and 3.7 are satisfied, and that E ln|β_0 + ε_{0,1}| ≠ 0. Then, under the null hypothesis of no change, the same results as in Theorem 3.1 and Corollary 3.1 hold, with Q_N(t) replaced by Q_{μ,N}(t) and using η̂_{μ,N}^2, defined in Equation (A.6) in the supplementary materials, as an estimator of η_μ^2.
Theorem 3.5 states that, upon modifying the WLS estimator to take into account the presence of a constant, results are the same as in Theorem 3.1, and, therefore, the same critical values can be used for the test statistics. The theorem (as all the results in Section A.2 in the supplementary materials) is stated for the case of iid innovations, but extensions to the case of heteroscedasticity can be studied exactly in the same way as in Sections 3.2.1 and 3.2.2.
The results are qualitatively different depending on whether y_i is stationary or not. In the case of stationarity, the presence of a nonzero intercept is absorbed by the variance of the weak limit of Q_{μ,N}(t), which differs from that of Q_N(t); conversely, in the case of nonstationarity, results are exactly the same as in the previous sections. In both cases, it is necessary to have an estimator of η_μ^2 which is consistent irrespective of whether y_i is stationary or not. As shown in Equation (A.6) in the supplementary materials, this estimator is the same as in (3.5)-(3.6); the only difference is that it is based on the residuals z_{μ,i} defined in (A.7), instead of z_i.
We note that in the nonstationary case it is not possible to estimate μ consistently (see Equation (E.67)); this is due to the fact that, since y_i → ∞ a.s., μ cannot be identified (see also, for a related result, Lemma 10 in Horváth and Trapani 2019). Hence, in the nonstationary case it is impossible to conduct inference on μ. However, Theorem 3.5 shows that this is inconsequential when testing for a change in β_0.
Finally, an important related issue is how to test for a change in β_0 taking into account the possible presence of a break in the mean μ. Theorem 3.5 indicates that this is not an issue when y_i is nonstationary; conversely, when y_i is stationary, the presence of an unaccounted break in μ may result in a spurious detection of a break in β_0 (see the seminal paper by Perron 1989). In Section A.2.1 in the supplementary materials, we briefly address this issue.

Consistency versus Alternatives
In this section, we study the consistency of our test statistics against the AMOC alternative defined in (1.3), in Section 4.1. We then discuss the case of multiple breaks, providing an estimator for the number of changepoints, in Section 4.2.

Consistency versus the AMOC Alternative
Recall the AMOC alternative of (1.3). Define the break magnitude Δ_N = |β_0 − β_A|, and t* through ⌊Nt*⌋ = k*.
Theorem 4.1. We assume that H_A of (1.3) holds. Under the same assumptions as Theorem 3.4, if, as N → ∞, conditions (4.1)-(4.3) are satisfied, then the test statistics diverge in probability. The theorem ensures that, as long as (4.1)-(4.3) hold, our tests reject the null with probability (asymptotically) 1. Conditions (4.1)-(4.3) essentially state that breaks will be detected as long as they are "not too small" and "not too close" to the endpoints of the sample.
Consider (4.1). This condition can be understood by analyzing two cases. First, when k*/N → c ∈ (0, 1), it is required that N^{1/2} Δ_N → ∞: this entails that β_A may depend on the sample size N, so that even small changes in the regression parameter are allowed. When Δ_N is bounded away from zero, (4.1) holds under a mild condition on the break location k*. Turning to (4.2), when k*/N → c > 0, the test is powerful as long as (N/ln ln N)^{1/2} Δ_N → ∞: again, small changes are allowed for, but these are now "less small" by an O(ln ln N) factor. Conversely, when Δ_N is bounded away from zero, (4.2) holds as long as k* (ln ln N)^{-1/2} → ∞: breaks that are as close as O(√(ln ln N)) periods to the sample endpoints can be detected. This effect is reinforced in the case of Rényi statistics, where, on account of (4.3), the only requirement is that k* > r_N.

Inference under Multiple Breaks
Although (1.3) entertains the possibility of only one change, it is possible to extend our analysis to the alternative of R changes, at times k_1 < ... < k_R, indexed by 1 ≤ ℓ ≤ R + 1, with the convention that k_0 = 0 and k_{R+1} = N. We require the breakpoints to be sufficiently separated. In Section A.4 in the supplementary materials, we discuss an analogue of Theorem 4.1 under (4.4).
Here, we study the estimation of the number of changes R. To this end, several methodologies could be employed, such as model selection based techniques (see e.g., Yao 1988) or sequential testing (see e.g., Bai and Perron 1998).Here, we focus on the segmentation method proposed by Vostrikova (1981), based on a sequential application of the maximally selected (weighted) CUSUM process.
The binary segmentation algorithm can be described as follows. If a break is detected, the sample is split at the location of the maximum of the CUSUM process; the test is subsequently applied to both subsamples, and so on, until no breaks are detected within the resulting subsamples. We denote the estimated number of changepoints by R̂. Horváth and Rice (2021) show that, when R̂ is obtained using the unweighted CUSUM process, it holds that lim_{N→∞} P(R̂ ≥ R) = 1, that is, R̂ never understates, but can overstate, R; they also provide an (extreme) example where R̂ is proportional to the sample size N.
As our next theorem shows, when using weighted CUSUM statistics, R is estimated consistently. In particular, we consider segmentation based on the functional Q̃_N(t) defined in (3.12), with weights w(t) = [t(1 − t)]^κ.

Theorem 4.2. We assume that N^{1/2} Δ_{N,ℓ} → ∞ for all 1 ≤ ℓ ≤ R + 1. Then, under the assumptions of Theorem 3.4, it holds that lim_{N→∞} P(R̂ = R) = 1 for all κ ∈ (0, 1/2]. A similar result, in the context of testing for a change in the mean and for κ = 1/2, is in Venkatraman (1992). Theorem 4.2 extends this to the case κ ∈ (0, 1/2). In essence, weighted CUSUM statistics consistently estimate R irrespective of the choice of κ, as long as κ > 0.
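The recursion described above can be sketched generically; `test_stat` and `crit` below are placeholders standing in for one of the paper's weighted CUSUM statistics and its critical value, so this is a skeleton of the segmentation logic rather than the full procedure:

```python
def binary_segmentation(y, test_stat, crit, min_len=20, lo=0, hi=None, found=None):
    """Binary segmentation skeleton (Section 4.2): apply a max-type test to a
    segment; if it rejects, record the argmax as a changepoint and recurse on
    the two subsegments, stopping when no rejection occurs or segments are short.

    test_stat(segment) must return (max statistic, argmax index within segment).
    """
    if hi is None:
        hi = len(y)
    if found is None:
        found = []
    if hi - lo < min_len:
        return sorted(found)
    stat, k = test_stat(y[lo:hi])
    if stat > crit:
        found.append(lo + k)                                        # split point
        binary_segmentation(y, test_stat, crit, min_len, lo, lo + k, found)
        binary_segmentation(y, test_stat, crit, min_len, lo + k + 1, hi, found)
    return sorted(found)
```

The estimated number of changepoints is then simply the length of the returned list; Theorem 4.2 says that, with weighted statistics, this count is consistent.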

Simulations
We provide some Monte Carlo evidence on the performance of the test statistics proposed above; further results are reported in Section B in the supplementary materials.
Data are generated using (1.1). We run experiments with β_0 = 0.5 and 1.05, to consider both the cases of stationary and nonstationary y_i; we note that we have also tried different values of β_0, but results are essentially the same. We simulate the shocks ε_{i,1} and ε_{i,2} as independent of one another and following a GARCH(1,1) specification

ε_{i,j} = σ_{i,j} χ_{i,j}, (5.1)

for j = 1, 2, with χ_{i,j} iid N(0, 1); as in Horváth et al. (2020), in our main experiments we use α_1 = 0.25 and β_1 = 0.5. We report results for ω_1 = 0.01 and ω_2 = 0.5; the value of ω_1 is based on "typical" values as found, for example, in the empirical applications in Horváth and Trapani (2019). We note, however, that in unreported simulations using different values of ω_1 and ω_2, the main results do not change, except for the (expected) fact that tests have better properties (in terms of size and power) for smaller values of ω_2. Similarly, the test performs better (with empirical rejection frequencies closer to their nominal value) when ω_1 is larger, and tends to be undersized for smaller values of ω_1. The effect of ω_1 and ω_2 vanishes as N increases. When allowing for unconditional heteroscedasticity, we use ω_1 = 0.01 and ω_2 = 0.5 for 1 ≤ i ≤ N/2, and then 1.5ω_1 and 1.5ω_2 for N/2 + 1 ≤ i ≤ N. When using long-run variance estimators as in (3.6) and (3.17), we select the bandwidth as H = N^{1/4}; in general, results are not very sensitive to this choice, although we note that smaller values of H lead to oversized tests. We note that our tests tend to have the correct size across all experiments. However, as expected, higher power is attained when using, at each 2 ≤ k ≤ N, the residuals computed under the null of no change. Hence, we use these in our experiments, and recommend doing so in empirical applications.
Under the alternative, we consider both a mid-sample break (5.3) and an end-of-sample break (5.4). Finally, we generate N + 1000 values of y_i from (1.1), with y_0 = 0, and discard the first 1000 values. All our routines are based on 2000 replications, and we use critical values corresponding to a nominal level of 5%; hence, empirical rejection frequencies under the null have a 95% confidence interval of [0.04, 0.06].
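The data-generating process just described can be sketched as follows. Function and variable names are ours, and the GARCH(1,1) recursion is initialized at its unconditional variance, an implementation choice not specified in the text.

```python
import numpy as np

def simulate_rca(N, beta0, omega1=0.01, omega2=0.5, alpha1=0.25, beta1=0.5,
                 burn=1000, seed=0):
    """Simulate an RCA(1) path y_i = (beta0 + eps_{i,1}) y_{i-1} + eps_{i,2},
    with independent GARCH(1,1) shocks eps_{i,j} = sigma_{i,j} chi_{i,j}."""
    rng = np.random.default_rng(seed)
    T = N + burn
    eps = np.empty((T, 2))
    omega = np.array([omega1, omega2])
    sig2 = omega / (1.0 - alpha1 - beta1)  # start at the unconditional variance
    for i in range(T):
        chi = rng.standard_normal(2)
        eps[i] = np.sqrt(sig2) * chi
        sig2 = omega + alpha1 * eps[i] ** 2 + beta1 * sig2
    y = np.empty(T + 1)
    y[0] = 0.0
    for i in range(1, T + 1):
        y[i] = (beta0 + eps[i - 1, 0]) * y[i - 1] + eps[i - 1, 1]
    return y[burn + 1:]  # discard the first 1000 (burn-in) values
```

For the heteroscedastic designs, the same routine can be run twice with scaled ω's and the two halves concatenated.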
All results reported in this section are based on using Q_N(t) defined in (3.12) and the long-run variance estimator (3.18); as noted in Section 3.2.2, this is the most comprehensive case. We have used asymptotic critical values for Rényi statistics, and the method described in Section 3.2.2 (see Equation (3.19)) for the cases where κ < 0.5. When using the Darling-Erdős statistic (κ = 0.5), asymptotic critical values yield hugely undersized tests; we have therefore used the critical values in Table I in Gombay and Horváth (1996).
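To fix ideas about the role of κ, a generic weighted CUSUM sup-functional can be computed as below. This is an illustration on an arbitrary score sequence, using the sample standard deviation in place of the long-run variance estimator (3.18); the paper's statistic is based on the process Q_N(t), not on raw scores.

```python
import numpy as np

def weighted_cusum_stat(x, kappa=0.25):
    """sup over t = k/N, k = 1,...,N-1, of
    |S(t) - t*S(1)| / (sigma * sqrt(N) * (t*(1-t))**kappa)."""
    N = len(x)
    S = np.cumsum(x)
    k = np.arange(1, N)
    t = k / N
    cusum = np.abs(S[:-1] - t * S[-1]) / np.sqrt(N)
    sigma = np.std(x, ddof=1)  # iid-case scale; a long-run variance in general
    return float(np.max(cusum / (sigma * (t * (1 - t)) ** kappa)))
```

Here κ = 0 gives the unweighted CUSUM, κ = 0.5 the standardized (Darling-Erdős) version, and κ > 0.5 Rényi-type weights emphasizing the sample endpoints.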
Empirical rejection frequencies under the null are reported in Table 1 for two scenarios: the case of only GARCH(1,1) errors, and the case of both GARCH(1,1) errors and unconditional heteroscedasticity. Broadly speaking, our methodology works well in all cases considered. Tests have a tendency to be mildly oversized when y_i is stationary and, conversely, to be undersized in the presence of nonstationarity; although this is not the case with iid observations, it becomes more pronounced with serial dependence, unconditional heteroscedasticity, and when using demeaning. As N increases, however, this vanishes, and the empirical rejection frequencies lie within their 95% confidence interval in virtually all cases considered.
The empirical power of the tests in the presence of a mid-sample break is reported in Figure 1, where, to save space, we only consider a sample size of N = 400, and the case of a GARCH(1,1) structure and shifts in variance in both ε_{i,1} and ε_{i,2}. In the supplementary materials (see Figures B.1-B.3), we report a further set of results where we consider, separately, the iid case, and the cases of GARCH(1,1) errors and shifts in variance only in ε_{i,1} and only in ε_{i,2}. The figures illustrate the robustness of our approach, showing essentially the same pattern: tests work well in all cases considered, with the power increasing monotonically in the size of the break. Test statistics with lower κ exhibit more power versus alternatives with "small" breaks: in this case, the power decreases in κ, with virtually no exceptions. Comparing Figure 1 with the results in the Supplement, it emerges that the power of the test is virtually unaffected by the presence or absence of (conditional and unconditional) heteroscedasticity in ε_{i,2}, whereas heteroscedasticity in ε_{i,1} does have an impact. This is particularly apparent in the nonstationary case β_0 = 1.05, and it can be explained by noting that, in such a case, the asymptotics of the WLS estimator is driven only by ε_{i,1}. We also consider the case of end-of-sample breaks (5.4). Results are in Figure 2, where the settings are exactly the same as above; again, in the supplementary materials, we report a further set of results for the cases of GARCH(1,1) errors, and shifts in variance only in ε_{i,1} and only in ε_{i,2}. The results show a similar pattern as above. However, the impact of κ here is, as expected, completely reversed: the power versus breaks that occur at the end of the sample increases monotonically, ceteris paribus, with κ. This makes a difference particularly in the case of medium-sized changes; for example, when the break size is 0.35, increases in power from κ = 0 to κ = 1 are in the region of 20%-25%.
Finally, in Figures B.7-B.10 in the supplementary materials, we report a small-scale exercise where we evaluate the empirical rejection frequencies when β_0 is close to unity. We consider the simultaneous presence of conditional and unconditional heteroscedasticity in both ε_{i,1} and ε_{i,2}; results for other cases are available upon request and, in general, no major differences emerge. These "boundary" cases should help shed more light on the performance of our procedure when detecting changes from stationarity to nonstationarity (when β_0 < 1 and changes are positive), and vice versa (when β_0 > 1 and changes are negative). The main message of Figures B.7-B.10 is that our tests are very effective in these boundary cases. The power is especially high when β_0 > 1, that is, when the RCA process changes from explosive to stationary behavior, which may find a possible application when testing for the collapse of a bubble in financial econometrics.

Empirical Applications
We illustrate our approach through an application to four time series: house prices in two major U.S. cities (Boston and Los Angeles); an IT index (the Nasdaq); and a cryptocurrency index based on the largest cryptocurrencies traded in USD (the BGCI). All these series lend themselves to a test for changepoints and, in particular, to a search for episodes of explosive growth (indicating the possible inception of a bubble), possibly followed by periods of tamer dynamics (indicating that the bubble has ended). House prices have been studied extensively in the context of detecting the beginning/end of a bubble, and we refer to Horváth, Liu, and Lu (2021), and the references therein, for a recent example. Case and Shiller (2003) and Shiller (2008) provide a detailed analysis of the U.S. real estate market peaks, explaining such peaks as an example of an "amplification mechanism" whose source is the expectation that future housing prices will increase, thus suggesting that home buyers view real estate as an investment opportunity. The Nasdaq is well known to have experienced an irrational speculative bubble, that is, a bubble primarily driven by "irrational exuberance" based on psychological factors, herd instincts, and so on: the so-called "dot-com bubble." Finally, cryptocurrencies have been extremely volatile during the past few years, which suggests that they are speculative in nature; ex-ante or ex-post detection of breaks in cryptocurrencies has been the subject of several recent studies, and we refer, inter alia, to Hafner (2020) and Astill et al. (2021), and the references therein.
We have used the following datasets, all at daily frequency: housing prices in Boston and Los Angeles between January 4, 1995 and October 23, 2012 (with N = 4645); the Nasdaq between November 2, 1994 and June 9, 2006, to cover the dot-com bubble period (with N = 2922); and the BGCI between August 2, 2017 and April 28, 2022 (with N = 1237). Further details, graphs, and descriptive statistics are in Section C in the supplementary materials. We use the logs of each series, with no further pre-processing. We use weighted functionals of the demeaned process defined in (3.21), thus allowing for the possible presence of deterministics; hence, we standardize it by g(t, t) defined in (3.18), constructed using the demeaned residuals defined in (A.7). Thus, our tests are, by construction, robust to the simultaneous presence of conditional and unconditional heteroscedasticity.
As can be noted from Table 2, all series in all regimes are found to be nonstationary, and the null of no randomness of the test by Horváth and Trapani (2019) is always rejected. The former finding indicates that breaks are genuinely due to changes in the persistence coefficient of the data, that consistent estimation of μ is not possible, and that the weak limit of the demeaned process defined in (3.21) is the same as that of Q_N(t); indeed, we have also tried to use the latter, obtaining the same results.
We consider house prices in Boston and Los Angeles, following a similar analysis (albeit carried out with monthly data and a different time span) in Horváth, Liu, and Lu (2021). For completeness, we consider eight further cities in Section C in the supplementary materials.
Considering the housing market, our findings, both in terms of break dates and of changes in the persistence parameter, are highly suggestive. In particular, house prices in Los Angeles experienced three breaks. The first one is found in early 2002, with an increase in the autoregressive parameter, which confirms the so-called "amplification mechanism" advocated by Case and Shiller (2003), and which indicates the presence and extent of the U.S. housing bubble in the early 2000s. The second changepoint, with a decrease in the autoregressive coefficient, occurred in Spring 2006, which corresponds to the period of the U.S. housing market correction, when prices began to stall. Finally, after a period of "hard landing," prices are found to have stabilized around the third break date, in early 2009. On the other hand, the housing market in Boston exhibits different dynamics: we find evidence of only one changepoint, in early July 2005, after which no further breaks are detected. Turning to the Nasdaq, our tests identify three changepoints (see also Figure C.6 in the supplementary materials). The first one, in October 1998, may be interpreted as the beginning of the dot-com bubble (which is usually dated to this period and, according to the test, started after the Russian stock market crash); the second one, in March 2000, is the "common wisdom" estimate of the time when the bubble burst, after the peak of March 10; and, albeit with less evidence, a third one in October 2002, which is the date of the stock market downturn of 2002. We also note that we did not find any changes prior to 1998; this may be explained by the fact that our dataset starts when the bubble was already brewing (see, e.g., Horváth, Li, and Liu 2022, who find a break in January 1995), or by the fact that similar findings in the literature have been obtained under the assumption of homoscedasticity. The estimated values of β_0 may shed some light on the presence of explosive regimes.
In particular, during the bubble period, β_0 was found to be above 1. Conversely, the period after the bubble crash is characterized by a value of β_0 below 1; although the data are found to be nonstationary even in that period, this indicates how severe the market correction was.
Finally, the cryptocurrency index, despite being the shortest series, appears to be the one with the largest number of breaks, and also the one with the largest values of β_0 in some regimes. This is particularly true during the first period, covering the second half of 2017. Astill et al. (2021) carry out a similar exercise, in a similar time window, in the context of sequential monitoring of Bitcoin prices, and find possible evidence of explosive behavior during summer 2017, which we do not find even when using the Rényi-type statistics. However, the authors also find evidence of heteroscedasticity, stating that it "is [...] of considerable importance to allow for the presence of time-varying volatility in the data when investigating whether or not the general upward movement in the Bitcoin price series is due to explosive episodes," concluding that the detection of an explosive episode may be spurious. Our tests are robust to both conditional and unconditional volatility, and therefore lend themselves to being applied to this dataset; the fact that we do not find an episode of explosive dynamics until January 2018 can be read in conjunction with the comments above. Interestingly, we do not find any changes in persistence after May 2021, not even when using heavily weighted Rényi-type statistics (with κ = 1) at a 10% nominal level. Although a graph of the data may suggest the presence of a break around the September 2021 peak (see Figure C.7 in the supplementary materials), it is also possible that the behavior of the index in the terminal period of the sample is, again, driven purely by changes in the volatility of the series.

Table 2 contains results obtained using the weighted maximally selected CUSUM process with κ = 0.25, using the same specifications as discussed in Section 5; in particular, we have estimated long-run variances, using demeaned residuals, with bandwidth H = N^{1/4}. We have also tried κ ∈ {0, 0.5, 0.75, 1}, again using the same specifications as discussed in Section 5, obtaining exactly the same results, which are available upon request. All changepoints have been detected using a nominal level of 5%, with the exception of Regime 3 for the Nasdaq, which is found at the 10% level but not at the 5% level. The estimated number of regimes, R̂, has been obtained using the binary segmentation algorithm discussed in Section 4.2, again with κ = 0.25; the same results have been obtained with κ = 0.5, and also with the other values of κ. For each regime, we report two estimates. β̂_0 is the WLS estimator of β_0 within each regime, and the number in square brackets is the value taken by the randomized confidence function for the test of the null of strict stationarity proposed in Trapani (2021); we refer to Section C in the supplementary materials, and specifically to Equations (C.3) and (C.4), for a description of this quantity and of its usage, noting that, across all results, the null of strict stationarity is always rejected at the 5% nominal level. σ̂_1 is the WLS estimate of the variance of ε_{i,1} in each regime, studied in Horváth and Trapani (2019), and the number in square brackets is the value taken by the randomized confidence function for the test of the null that σ_1 = 0 proposed in Horváth and Trapani (2019). A description of the procedures, and threshold values for the randomized confidence function, are in Section C in the Supplement (threshold values are in Table C.1). Across all results, the null of no randomness is always rejected at the 5% nominal level. Finally, we have applied the test for equality of β_0 across two adjacent regimes discussed in Section C of the supplementary materials; in all cases, the null is rejected at the 5% nominal level, further reinforcing the conclusion that changes in β_0 are genuine.

Discussion and Conclusions
In this article, we study changepoint detection in the deterministic part of the autoregressive coefficient of a Random Coefficient Autoregressive model, using the CUSUM process based on comparing the left and right WLS estimators. In order to be able to detect changepoints close to the sample endpoints, we study weighted statistics, where more weight is placed at the sample endpoints. We consider a very wide class of weighting functions, studying: (i) weighting schemes based on functions w(t) which drift to zero, at the sample endpoints, more slowly than (t(1 − t))^{1/2}; (ii) standardized statistics, with weighting (t(1 − t))^{1/2}; and (iii) Rényi statistics, where heavier weights are used. We also show that weighted statistics have the advantage of ensuring that, in the presence of multiple changepoints, binary segmentation consistently estimates the number of such changepoints (contrary to unweighted statistics). From a practical point of view, our tests can be applied in the presence of both conditional and unconditional heteroscedasticity, requiring no knowledge as to the actual presence, or the form, thereof. Moreover, from a technical point of view, all our results are based on a (strong) approximation of the weighted maximum of partial sums. We have developed these approximations both in the stationary and in the nonstationary case: in the latter case, they are entirely novel, and yield the same results as in the stationary case. This, too, has important practical implications: our tests can be applied with no prior knowledge as to the stationarity, or lack thereof, of the data. Our simulations show, in general, good finite sample properties; we note that dealing with heteroscedasticity using the approach suggested in Perron and Yamamoto (2022) would naturally result in power gains, and such an extension is a high priority in the authors' research agenda.
In conclusion, the robustness of our approach to stationarity/nonstationarity reinforces the case made by Aue and Horváth (2011) for RCA models, where the authors advocate that "[...] it would be worthwhile to give RCA models a closer look and [...] they are worthy of inclusion in the applied statistician's toolbox," using the RCA as an alternative to the linear AR model, which does not possess the same property and may require differencing, with the well-known problems attached to this transformation (Leybourne, McCabe, and Tremayne 1996). Further, as noted in Giraitis, Kapetanios, and Yates (2014), a model with random coefficients may be viewed as a natural alternative to a linear model with a break in the autoregressive root; this issue is investigated systematically in Elliott and Müller (2006). We have based our analysis on adopting a general set-up like the RCA with possible heteroscedasticity from the outset, subsequently checking whether, in addition to small fluctuations in the persistence coefficient, there are also larger, more abrupt changes, whose consequence may even be a change from stationarity to nonstationarity. Whilst this set-up is very general, a natural question is whether the RCA model is an adequate approximation after accounting for regime shifts, or whether a standard AR model with breaks is better. In principle, it would be possible to investigate this, at least under stationarity. A possible approach would be based on testing for breaks in β_0 and segmenting the observations; then testing for breaks in Var(ε_{0,1}) within each segment, and further segmenting the observations into regimes where both β_0 and Var(ε_{0,1}) are constant; and finally applying the tests in Horváth and Trapani (2019) to each segment, checking whether the randomness of the autoregressive root is present or not. This important extension is under investigation by the authors.
Figure C.1, and the estimates of β_0 in the period before 2005, indicate that, prior to the market adjustment, the housing market in Los Angeles was undergoing faster price changes than Boston. Our results also confirm the analysis in Horváth, Liu, and Lu (2021), based on monthly data from the S&P CoreLogic Case-Shiller Home Price Indices.

Table 1.
Empirical rejection frequencies. The table contains the empirical rejection frequencies under the null of no changepoint for different sample sizes and different values of κ. In the first panel, we consider a GARCH(1,1) structure in both ε_{i,1} and ε_{i,2} with no shifts in variance; in the second panel, we additionally consider shifts in the variances of both innovations. In both panels, the methodology in Section 3.2.2 has been applied.

Table 2.
Summary of empirical findings.