Endogeneity in semiparametric threshold regression models with two threshold variables

ABSTRACT This article considers a semiparametric threshold regression model with two threshold variables. The proposed model allows endogenous threshold variables and endogenous slope regressors. Under the diminishing threshold effects framework, we derive consistency and asymptotic results of our proposed estimator for weakly dependent data. We study the finite sample performance of our proposed estimator via small Monte Carlo simulations and apply our model to classify economic growth regimes based on both national public debt and national external debt.


Introduction
One of the most interesting forms of non linear regression models with wide applications in economics is the threshold regression model, where the sample split value (or the so-called threshold parameter) is unknown. That is, the model internally sorts the data on the basis of some threshold determinant, into groups of observations each of which obeys the same model. The use of threshold regression provides a parsimonious way of introducing a non linear structure into a regression model, and it also allows for increased flexibility in regression functional form, while at the same time it is not as susceptible to the curse of dimensionality problems.
Threshold regression models have been used extensively in applied work in the last 20 years. The limiting theory for estimation and inference in the first generation threshold regression models, which assumes an exogenous threshold variable, has been well developed (see, e.g., Chan, 1993;Hansen, 2000;Caner and Hansen, 2004;Seo and Linton, 2007;Gonzalo and Wolf, 2005;Yu, 2012Yu, , 2015. Among others, Chen et al. (2012) extend the analysis from one to two separate exogeneous thresholds within a parametric autoregressive model. However, the exogeneity assumption on the threshold variable(s) dramatically limits the usefulness of the first generation threshold regression model.
Many practical issues demand a method that can identify the underlying mechanisms under endogenous threshold effects. Henceforth, there is a growing body of literature in threshold regressions that allow for endogenous threshold variables. Generally, the current literature adopts three different approaches to deal with the endogeneity of the threshold variable. The first approach employs a control function method. Borrowing from the idea of Heckman's sample selection method, Kourtellos et al. (2016) propose estimation and inference for a threshold regression model that accounts for an endogenous threshold variable and for endogenous regressors, where the control function is included in the threshold regression model under certain parametric assumptions. Christopoulos et al. (2021) apply copulas to control for the endogeneity bias of the threshold variable. Yu et al. (2020) extend Kourtellos et al. (2016)'s method by assuming the endogeneity is controlled over a function with a known form up to a finite order, while Kourtellos et al. (2022) further extend Yu et al. (2020) approach by allowing for the control function to be of an unknown form and with an infinite order, still though in a framework of a single threshold variable. The second approach is motivated by the generalized method of moments (GMM). Seo and Shin (2016) propose a first-difference (FD) GMM estimator for a dynamic threshold panel data model, which allows both regressors and threshold effect to be endogenous. In particular, the sample averaging nature of GMM allows to bypass the issue of the model's discontinuity. Consequently, the asymptotic distribution of the FD-GMM estimator is normal. Notably, there is an inherent difference between the control function approach and GMM as the former relies on conditional moments, whereas GMM uses unconditional moments. Yet, the GMM method has a slower convergence rate than the other two methods and is quite sensitive to the choice of instruments. As a result, it generally displays a poor small sample performance. The third approach employs the local shifter, the indicator function, to identify the true threshold, but it is only applicable for an estimation method that uses local information. This approach is studied in the non stationary and non parametric regression with endogeneity (e.g., Phillips, 2009, 2014;Phillips and Su, 2011) and recently used in the threshold regression by Yu and Phillips (2018) where they propose a non parametric threshold model with enodogeneity and an integrated difference kernel estimator. Under the fixed threshold effect framework of Chan (1993), they show that the threshold parameter can be identified and estimated without any instruments at the rate n. They also show that, while instrumental variables are not necessary for identifying and estimating the threshold effect parameters, regime-specific regression coefficients can only be identified and estimated at the usual root-n rate when instrumental variables are available. Nevertheless, the i.i.d. sample significantly limits the extent of appropriate applications.
In this article, we extend the analysis of Kourtellos et al. (2022) to allow for two endogenous thresholds, extending Chen et al. (2012) in two important ways: allowing for endogeneity of the thresholds and not relying on specific distributional assumptions by employing a non parametric control function approach that allows for the endogeneity in all threshold variables. We provide an estimation strategy of the least squares method using a series approximation based on polynomials. We derive the asymptotic distribution of our proposed estimators for both the threshold and slope parameters. We evaluate the finite sample performance of our proposed estimators through Monte Carlo simulations. Finally, our model is applied to identify the different regimes of economic growth classified by public debt and external debt.
The remainder of the article is organized as follows. In Section 2, we present a benchmark version of the semiparametric threshold regression model with exogenous regressors and two endogenous threshold variables and derive the asymptotic results of the proposed estimator. In Section 3, we extend the benchmark model to allow for endogeneity of the regressors as well, and Section 4 provides the simulation results of our proposed estimator. Section 5 presents the empirical application and finally Section 6 concludes the article. Technical proofs are relegated to supporting information Appendices.

Endogenous threshold variables
Consider the following parametric structural threshold regression with two threshold variables: for t = 1, 2, . . . , n, where y t is the dependent variable, x t is a k × 1 vector of exogenous regressors, q t = [q 1t , q 2t ] T are the endogenous threshold variables, γ 0 = [γ 0 1 , γ 0 2 ] T are the true threshold levels, and for j = 1, 2, 3, 4, β j is the regime specific coefficients and u jt is a regime-dependent error with zero mean and a strictly positive variance σ 2 j . The reduced-form model for q t is given by We assume that E v q 1t |F n,t = E v q 2t |F n,t = 0 holds almost surely for all t = 1, . . . , n, which means that v q 1t and v q 2t are uncorrelated to x t , z t , q 1,t−1 , q 2,t−1 , u t−1 and their historical values.
The endogeneity of q t arises from the contemporaneous correlation between the regression error, u jt , in model (2.1) and (v q 1t , v q 2t ). Specifically, we assume that E u jt |F n,t , q 1t , q 2t = E u jt |v q 1t , v q 2t = h j v q 1t , v q 2t holds almost surely for all t = 1, . . . , n and j = 1, . . . , 4. It follows that Thus, the integrated form of our semiparametric threshold regression model is given as , v q 2t for j = 1, . . . , 4, and . . . ,4. Then,model (2) becomes a partially linear additive threshold model with exogenous threshold effects. If one of the threshold variables is endogenous, for example, only q 1t is endogenous, h j (v q 1t , v q 2t ) will reduce to h j (v q 1t ) -a function with a scalar argument. When q 1t and q 2t are both exogenous, h j (·, ·) ≡ 0 so that the control functions, h j (·, ·), are omitted for all j = 1, . . . , 4 and model (2) reduces to the standard threshold regression models with two exogenous threshold variables as in Chen et al. (2012). Let Then, we can rewrite model (2) as (2.5) Following Hansen (2000) and Chen et al. (2012), our limiting results are derived under the diminishing threshold effects framework. We assume that the exogenous threshold effects, δ (i) = δ (i) n , and the endogenous threshold bias correction terms, η (i) = η (i) n , both slowly approach to zero as n increases. We make the following assumptions to support model (2.5).
o (w) = 0 over at least one non empty interval for all j = 2, 3, 4.
This set of assumptions is commonly imposed in threshold regression and semiparametric endogenous regression framework; see, e.g., Assumption 1 in Hansen (2000) and Kourtellos et al. (2022). Assumptions 1 (a)-(b) exclude trend stationary and integrated process. Assumptions 1 (a) and (c) state that both x t and z t are exogenous variables, while q t is contemporaneously endogenous, which implies that q t cannot be part of x t in this section. Assumption 1 (c) also ensures ε t , F * n,t n t=1 with F * n,t = σ (x t , z t , q 1,t−1 , q 2,t−1 , u t−1 , . . .) is a martingale difference sequence with E(ε t |F * n,t ) = 0 almost surely for all t. Assumption 1 (d) is used for the purpose of identification due to the unknown nature of the function h j (·, ·), for j = 1., 4. Assumption 1 (e) assumes a diminishing threshold framework, where the exogenous threshold effects and the endogenous threshold bias correction terms are allowed to vanish at different speed.
Step 1: Estimating model (2) by the OLS method, we calculate the fitted values of q 1t and q 2t and collect the residuals v q 1t and v q 2t .

Limiting results
Following Chen (2007), we assume that W = W 1 × W 2 is the Cartesian product of the compact intervals W 1 and W 2 . Let p = m + ψ, where ψ ∈ (0, 1] and m is a non negative integer, and define a Hölder smoothness class p (W); see e.g., Condition 2 of Stone (1994) and Section 2.3.1 of Chen (2007). For any real-valued m-times continuously differentiable function h(·) on W in p (W) with an exponent ψ, for any w ∈ W andw ∈ W, we have, where D ξ denotes the differential operator: , and |ξ | = ξ 1 + ξ 2 and ξ = (ξ 1 , ξ 2 ) is a 2-tuple non negative integers and M is a positively bounded constant. Following assumptions are used to derive the consistency and limiting distribution of our proposed estimators.
for every L n and uniformly over γ 1 ∈ [γ , γ ] and γ 2 ∈ [γ , γ ], there exist constants M and M such that , and χ * t,γ equals χ t,γ with v q 1t and v q 2t being replaced with v q 1t and v q 2t respectively, where λ min (·) and λ max (·) are the respective minimum and maximum eigenvalue; (c) the threshold variables q 1t and q 2t are strictly stationary with a common continuous joint distribution F(γ 1 , γ 2 ) and a corresponding joint density f (γ 1 , γ 2 ). For j = 1, 2, f j (γ 1 , γ 2 ) = ∂F(γ 1 ,γ 2 ) ∂γ j and 0 < f (γ 1 , γ 2 ) ≤f < ∞ and 0 < f j (γ 1 , γ 2 ) ≤ f j < ∞; for j = 1, 2, 3, 4 and i = 2, 3, 4, h j (·, ·) and η (i) 0 (·, ·) are functions belonging to the p 0 (W) for some p 0 > 2; (d) there exist α L n ,0,i and α L n ,j such that sup w∈W |η (i) 0 (w) − α T L n ,0,i L n (w)| < ML −p n and sup w∈W |h j (w) − α T L n ,j L n (w)| < ML −p n for all j = 1, 2, 3, 4 and i = 2, 3, 4; (e) {φ l (.), l = 1, . . . , L n } is a sequence of orthonormal basis functions in p 0 (W) and uniformly bounded over W; (f)L −p Assumptions 2 (a)-(b) ensure the existence of θ(γ ), which is standard in the literature; see, e.g., Delgado et al. (2019) and Kourtellos et al. (2022). Assumption 2 (c) requires the stationarity of the threshold variables. This assumption assumes the threshold variables are continuous with positive density everywhere, which ensures the sample is dense near the true threshold levels as the sample size increases. Assumptions 2 (d)-(e) are required for the application of nonparametric sieve estimation; see, e.g., Chen (2007). Specifically, Assumption 2 (d) restricts the sieve approximation error for p 0smooth functions, which holds for polynomial, splines, and wavelet series as explained in Chen (2007), where p = p 0 /2 as η (i) 0 (·, ·) and h j (·, ·) are functions with two arguments. If the unknown functions have unbounded support, a trimming function can be introduced to regulate the tail behavior; see, Chen and Christensen (2015). Assumption 2 (e) describes the properties of the basis functions, and implies It is well-known that B-splines and Hermite basis functions satisfy the condition, while power series give L n 0 = O(L n ) and L n 1 = O(L 3 n ). Hence, power series approximation gives slower uniform convergence rate; see, e.g., Newey (1997). Assumption 2 (f) is needed to derive the consistency and asymptotic distribution. Specifically, L −p n + n −1/2 L n 1 + √ L n /n = o(1) and L n 2 1 L n /n = o(1) are needed to derive the consistency results of γ and θ in Theorem 1. n min (α, ) (1) is needed to remove the limiting impact of the estimation of slope parameter θ on deriving the asymptotic distribution for γ in Theorem 2 and further discussion is given below Theorem 1.
√ nL −p n = o(1) is used to vanish the limiting impact of the estimation of γ on the limiting distribution for θ in Theorem 4. Let L n = c 0 n l for some constants c 0 > 0 and l ∈ (0, 1) and p = p 0 /d, where p 0 measures the order of the smoothness and d = 2 represents the dimension of arguments in the unknown endogenous bias correction functions. Then, Assumption 2 (f) implies l ∈ d 2p 0 , min (1/4, 1 − 2min(α, ρ)) and we require p 0 > 1 min(1/4,1−2min(α,ρ)) . Assumption 2 (g) imposes the identification condition for γ 0 1 and γ 0 2 . Note that if the threshold variables are exogenous and there is no regime-dependent heteroskedasticity, Assumption 2 (g) becomes Assumption A7 in Chen et al. (2012).
Theorem 1 shows the consistency of γ and θ . The convergence rate is the same as in Kourtellos et al. (2022). We can decompose the convergence rate into three components: (i) the conventional convergence rate of series estimator, which is in the order of O p (L −p n + √ L n /n); (ii) the first order bias of order O p (n −1/2 l n 1 ) from the estimation of π q 1 and π q 2 ; and (iii) the diminishing speed on both endogenous threshold bias correction terms and the exogenous threshold effect. If we assume the model is exogenous in threshold variable and ignore the nonparametric functions, the convergence rate reduces to O p (n −1/2 + n −α ), which returns to the standard results as in Chen et al. (2012).

Theorem 2. Under Assumptions 1-2, we have
where r = (r 1 , r 2 ), and W i (r i ) is a two-sided Brownian motion on the real line defined as Theorem 2 shows that the asymptotic distribution of the threshold parameter estimator, which is similar to Theorem 1 in Chen et al. (2012). However, there are three main differences. First, since we extend the assumption of "small threshold effect" to the endogenous part, limiting results associate with the diminishing rate for both α (exogenous threshold effect) and (the endogenous threshold bias correction). Second, our limiting results involve the true functions of endogenous threshold bias correction terms, which is regime-specific. The last difference lies in the the error structure. We allow the regime-dependent heteroskedastic errors while Chen et al. (2012) restrict to homoskedastic errors.
To build an intuitive bridge between Theorem 2 and Theorem 1 in Chen et al. (2012), we now assume α < strictly (the endogenous threshold bias correction terms effect shrink to zero faster than the exogenous threshold effect). Thus, the convergence rate of the threshold estimator becomes n 1−2α . In addition, μ 1 and μ 2 reduce to . Note that μ * 1 and μ * 2 are essentially the same as the numerator of λ T defined in Theorem 1 of Chen et al. (2012).
To make inference on the confidence interval of γ 0 = (γ 0 1 , γ 0 2 ), following Hansen (2000) and Kourtellos et al. (2022), we propose to use a likelihood ratio test of the null hypothesis H 0 : γ = γ 0 . The test statistic is . The distribution function of ξ is Theorem 3 states the limiting distribution of the likelihood ratio test for γ . In particular, if α < strictly and error is homoskedastic with finite variance σ 2 , under the null hypothesis, we have , which is free of nuisance parameter.
The following theorem provides the limiting distribution of the parametric slope estimate, β.
where J n is defined by (B.31) in the supporting information Appendix, defined by (B.34) and (B.36) in the supporting information Appendix, and I 4k is a 4k × 4k identity matrix.
Theorem 4 shows the asymptotic results of the estimator of β, which is normally distributed with root-n convergence rate, a standard parametric convergence rate.

Endogenous threshold variable and regressors
This section considers a model with both endogenous slope regressors and endogenous threshold variables. The reduced form model for x t is given by Hence, we can rewrite model (2) as t (γ 0 1 , γ 0 2 ). We make the following assumptions for model (3.2).
Assumption 3 (b)-(c) ensure the contemporaneous exogeneity of z xt . In particular, Assumption 3 (c) holds for both v q t = (v q 1t , v q 2t ) is orthogonal and unorthogonal to v x t . 1 In other words, we allow x t and q t to have common variables. Assumption 3 (d) serves for the identification purpose. Below, we explain the estimation of our proposed estimator.

Limiting results
Following assumptions are used to derive our asymptotic results in this section.

Theorem 6. Under Assumptions 3 and 4, we have
where r = (r 1 , r 2 ), . If one applies the second control function approach (CF-II) of Yu et al. (2020), it results endogenous threshold bias correction terms in the form of h j (v xt , v q 1t , v q 2t ), which can cause more severe curse-of-dimensionality problem than the current method. Note: This table reports the simulation results of the estimates of the true threshold parameters γ 0 1 = γ 0 2 = 1.2. The degree of the endogeneity of the threshold variables, b, is 0.5. Specifically, the top panel, TR, reports the results without controlling for endogeneity. The bottom panel, Semi-TR, reports the results of our proposed semiparametric threshold regression method. We report results with the order of two-dimensional Hermite basis function L n = 3, 4, 5. The first column shows the sample size. The second to the fourth columns report the results of the first threshold estimate, γ 1 . The last three columns show the results of the second threshold estimate, γ 2 . ϕ 1 , ϕ 2 , μ 1 , μ 2 , λ 11 , λ 12 , λ 21 , and λ 22 are defined by (2.8) with

and W i (r i ) is a two-sided Brownian motion on the real line defined as
and ij (r i ), for i = 1, 2, j = 1, 2 are four independent standard Brownian motions on [0, ∞).

Theorem 7. Under Assumptions 3 and 4, we have
where J n is defined by (B.40), and B n , qq , qε , εε are defined by (B.42) and (B.43) in the supporting information Appendix. Note: This table reports the simulation results of the estimates of the true threshold parameters γ 0 1 = γ 0 2 = 1.2. The degree of the endogeneity of the threshold variables, b, is 1. Specifically, the top panel, TR, reports the results without controlling for endogeneity. The bottom panel, Semi-TR, reports the results of our proposed semiparametric threshold regression method. We report results with the order of two-dimensional Hermite basis function L n = 3, 4, 5. The first column shows the sample size. The second to the fourth columns report the results of the first threshold estimate, γ 1 . The last three columns show the results of the second threshold estimate, γ 2 .
Comparing Theorems 5-7 with Theorems 1, 2, and 4, we observe that the endogenity of the regressors only influences the variation of both γ and β, whereas the convergence rates remain the same.

Monte Carlo simulations
To evaluate the finite sample performance of our proposed estimator, we provide Monte Carlo simulation results in this section. We generate our data from the following models: where v q 1t , v q 2t , x t , and t are independently normally distributed with mean zero and variance one, z q 1t , z q 2t are independently normally distributed with mean three and variance one, and b measures the degree of endogeneity of the threshold variables q 1t and q 2t . Note: This table reports the simulation results of the estimates of the true threshold parameters γ 0 1 = γ 0 2 = 1.2. The degree of the endogeneity of the threshold variables, b, is 2. Specifically, the top panel, TR, reports the results without controlling for endogeneity. The bottom panel, Semi-TR, reports the results of our proposed semiparametric threshold regression method. We report results with the order of two-dimensional Hermite basis function L n = 3, 4, 5. The first column shows the sample size. The second to the fourth columns report the results of the first threshold estimate, γ 1 . The last three columns show the results of the second threshold estimate, γ 2 .

Bias and MSE
We begin by assessing the performance of our proposed estimator for the threshold parameters. We consider sample size n = 300, 500, 700, and 1,000 using 1,000 replications. Tables 1-3 report the simulation results of average bias, mean squared error (MSE), and sample standard deviation (Std. Dev.) by varying the degree of endogeneity of the threshold variables b = 0.5, 1, 2, respectively. For each table, we show the simulation results for different orders of bivariate Hermite basis functions. To provide a more in-depth overview of outcomes, we also compare our estimator (Semi-TR) with the standard OLS estimator (TR), which completely ignores the underlying endogeneity.
First, for the MSE, those of Semi-TR generally decrease as the sample size increases for all degrees of endogeneity. However, MSEs of TR decrease at a much slower rate compared to Semi-TR, especially when b is large. As the degree of endogeneity increases, the gap between TR and Semi-TR is also getting more significant. When turning to average bias and sample standard deviation, we observe a very similar pattern that Semi-TR significantly reduces both the bias and the sample standard deviation for both threshold parameter estimates as sample size increases. Next, within each table, we observe that the choice of L n barely affects the simulation results, which is consistent with our theoretical results of Theorem 2, showing the convergence rate of γ only depends on α and .  The bottom panel, Semi-TR, reports the results of our proposed semiparametric threshold regression method. We report results with the order of two-dimensional Hermite basis function L n = 3, 4, 5. The first column shows the sample size. The second to the fourth, the fifth to the seventh, the eighth to the tenth, and the last columns report the estimation results for β 0 1 , β 0 2 , β 0 3 , and β 0 4 , respectively.
Tables 4-6 present the simulation results of the slope parameters for both TR and Semi-TR with b = 0.5, 1, 2, respectively. Similar to the results of the threshold parameters estimators, we observe β 1β 4 are estimated with much larger MSEs for TR than our proposed Semi-TR. In particular, under a relatively high degree of endogeneity, some of the MSEs of TR appear to be prominent even with large sample size, suggesting an asymptotic bias. For example, in Table 6, the MSE of β 1 for TR is 0.1934 when n = 1, 000. In contrast, we observe all MSEs of Semi-TR decrease and approach to zero as the sample size increases for all degrees of endogeneity. Finally, as stated in Theorem 5, the convergence rate of the slope estimator is root-n and is irrelevant to the choice of L n . Indeed, our simulation results are relatively robust to the choice of L n for all b.

Coverage probability and length
We use the proposed likelihood ratio test to explore the coverage probability and length of the confidence intervals (CI) for the threshold parameters. Tables 7 and 8 present the simulation results of the CI coverage and length, respectively. The coverage level is set as 95%. Interestingly, at first glance, the CIs coverage for both TR and Semi-TR is promising. Specifically, as n increases, we observe the coverage rates of both TR and Semi-TR steadily improve irrespective of the value of b. However, the coverage length  We report results with the order of two-dimensional Hermite basis function L n = 3, 4, 5. The first column shows the sample size. The second to the fourth, the fifth to the seventh, the eighth to the tenth, and the last columns report the estimation results for β 0 1 , β 0 2 , β 0 3 , and β 0 4 , respectively.
tells a different story. In Table 8, Semi-TR has a much shorter interval length than TR for all sample sizes and degrees of endogeneity. We also observe that the CI lengths for both threshold parameters decrease as the sample size increases for the Semi-TR. By contrast, some of the results for TR show that the interval length increases as the sample size increases, suggesting a poor coverage for the actual thresholds.

Empirical application
As one of the most active research areas in recent years, debt overhang hypothesis has been established and frequently tested in the literature of empirical economic growths. On the side of public debt, Reinhart and Rogoff (2010) find that public debt can hurt the economic growth if the debt to GDP ratio is higher than 90%. Checherita-Westphal and Rother (2012) study the effect of public debt on growth by using a dataset including 12 European countries from 1970 to 2012 and find the public debt ratio has a negative impact on long-term growth if the debt ratio is above 90%-100%. Employing Hansen (2000) threshold regression model, Cecchetti andFabrizio (2011), Caner et al. (2010) and Afonso and Jalles (2013) find the public debt has a detrimental effect on growth if the public debt to GDP ratio is above 85%, 77%, and 59%, respectively. By allowing for the endogeneity in both regressors and the threshold variable, Kourtellos et al. (2013) find there is no threshold effect from the public debt.
For the external debt, Poirson et al. (2002) examine the nonlinear relationship between external debt and economic growth for 93 developing countries. They find that countries with different indebtedness  Note: This table reports the simulation results of the estimator for the true slope parameters β 0 1 = 0, β 0 2 = 0.2, β 0 3 = 0.1, and β 0 4 = 0.4. The degree of the endogeneity of the threshold variables, b, is 2. The top panel, TR, reports the results without controlling for the endogeneity. The bottom panel, Semi-TR, reports the results of our proposed semiparametric threshold regression method. We report results with the order of two-dimensional Hermite basis function L n = 3, 4, 5. The first column shows the sample size. The second to the fourth, the fifth to the seventh, the eighth to the tenth, and the last columns report the estimation results for β 0 1 , β 0 2 , β 0 3 , and β 0 4 , respectively.
would have distinct effect of external debt on the annual economic growth per capita. However, Schclarek (2004) fails to find any support for an inverted-U shape relationship between external debt and growth. By contrast, Yen et al. (2019) indicate that the threshold effect is significant and the threshold level of government's external debt to GDP ratio is 33.17%.
Due to the lack of appropriate econometric tools at the time, all of the previous studies in the literature use only one threshold variable, either public debt or external debt, to identify the growth regimes. Such a classification could be dubious since any non linearity present could possibly originate from either source of debt. Moreover, the link of both public debt and external debt to economic growth can be endogenously determined, which requires an econometric method capable of allowing for threshold variable endogeneity. As such, in this application, we apply our semi-parametric threshold model with two threshold variables to investigate this issue in developing countries. Specifically, we examine (i) the Solow growth parametric threshold regression model with two threshold variables  and (ii) the Solow growth semi-parametric threshold regression model with two threshold variables where y t is the growth rate, q 1t is natural logarithm of public debt to GDP ratio, q 2t is the natural logarithm of external debt to GDP ratio, and x t is a vector of regressors that includes a constant, q 1t , q 2t as well as the Solow controlling set including four Solow variables, namely, the natural logarithms of initial income per capita, schooling, investment, population growth (adjusted for depreciation). We use a 10-year averaged pooled panel data covering 54 developing countries in 1980-1989, 1990-1999, 2000-2009 and 2010-2016. The growth rate of real per capita GDP is taken from PWT 9.0. The public debt and external debt to GDP ratio are downloaded from the IMF Historical Public Debt Database and the data bank of the World Bank. All variables are instrumented by their lagged counterparts. We assess model (5.1) and model (5.2) by using the least square estimator of Chen et al. (2012) and our proposed two-step estimation strategy, respectively. To approximate the unknown functions h j (·, ·) for j = 1, 2, 3, 4 in model (5.2), we use a series approximation with two-dimensional Hermite basis functions. The optimal number of basis functions is chosen by the generalized crossvalidation criterion (GCV).  This table presents Monte Carlo results about the nominal 95% confidence interval length of the threshold parameters for the degree of endogeneity of the threshold variables, b = 0.5, 1, 2. The top panel, TR, reports the confidence interval length with the standard OLS estimation method. The bottom panel, Semi-TR, shows the confidence interval length of our proposed semiparametric threshold regression method, using the two-dimensional Hermite basis function with order L n = 3, 4, 5. Table 9 reports the estimation results. For the parametric threshold regression model, the estimated threshold levels of public debt and external debt are around 57.8% and 5.8%, respectively. Interestingly, the results of the parametric model show the external debt is negatively significant in both regime 2 and regime 4. This implies, with all else being equal, higher external debt leads to lower growth if the country is in the high external debt regime irrespective of the volumes of the public debt. In addition, for low external debt country, additional public debt has a positive impact on economic growth.
The estimated threshold levels of both debts from the semi-parametric threshold regression model yield similar result as the parametric counterpart with the values equal to 57.8% and 5.6%, respectively. However, after allowing for the potential endogeneity in both regressors and threshold variables, we find the negative effect of external debt on growth in the high external debt country regime insignificant. Hence, despite sample size limitations, our results suggest we should be cautious about claiming that the main reason for the heterogeneity in the debt-growth relationship in developing countries is the level of the external debt, as supported by the simple analysis with exogenous thresholds. Any policy implications based on such results ignoring the possible presence of endogenous thresholds can be misleading. Note: This table reports the estimation results of model (5.1) and model (5.2). The first column shows the slope regressors. The top panel shows the estimation results of (5.1) using the estimation method proposed by Chen et al. (2012). The bottom panel presents the estimation results of (5.2) using our proposed estimation strategy. We use up to 3 rd order of two-dimensional Hermite basis function, chosen by GCV criterion. All variables are instrumented by their lagged values. Also, "***", "**", and "*" denote significance at the 1%, 5%, and 10% level, respectively.

Conclusion
In this article, similar to Kourtellos et al. (2022), using a nonparametric control function approach, we propose a semiparametric threshold regression models with two endogenous threshold variables. We show the estimation strategy and derive the limiting results of our proposed estimator for weakly dependent data. We assess the finite sample performance of our proposed estimators through a Monte Carlo simulation. Finally, we apply our model to classify the regimes of economic growth by the public debt to GDP ratio and the external debt to GDP ratio.