Spatial Correlation Robust Inference in Linear Regression and Panel Models

Abstract We consider inference about a scalar coefficient in a linear regression with spatially correlated errors. Recent suggestions for more robust inference require stationarity of both regressors and dependent variables for their large sample validity. This rules out many empirically relevant applications, such as difference-in-difference designs. We develop a robustified version of the recently suggested SCPC method that addresses this challenge. We find that the method has good size properties in a wide range of Monte Carlo designs that are calibrated to real world applications, both in a pure cross sectional setting, but also for spatially correlated panel data. We provide numerically efficient methods for computing the associated spatial-correlation robust test statistics, critical values, and confidence intervals.


Introduction
This article studies inference about linear regression coefficients estimated from spatially correlated data.In a simple version of the model where all variables are scalars, E[e l |x l ] = 0 and (y l , x l , e l ) are associated with the observed spatial location s l ∈ R d .Spatial correlation invalidates inference about β using the usual heteroscedasticity-robust t-statistic.Several spatialcorrelation methods based on robustified versions of the tstatistic have been proposed, with the most well-known method developed in Conley (1999).These approaches estimate σ 2 = var[n −1/2 n l=1 u l ] where u l = x l e l and n denotes the number of observations.In the usual time series model, the locations s l are equidistant on a line, so d = 1, and σ 2 (or its limit as n → ∞) is called the long-run variance of u.In that context, estimators for σ 2 are called HAC or HAR (Heteroscedastic and Autocorrelation Consistent/Robust).Spatial-correlation robust inference generalizes HAC and HAR time series methods to spatial settings.
As in the time series case, it is difficult to devise spatial correlation robust inference that continues to work well in small samples under empirically plausible degrees of spatial dependence; see Ibragimov and Müller (2010), Sun and Kim (2012), Bester et al. (2016), and Kelly (2019) for corresponding simulation evidence.This is especially true for approaches, such as Conley (1999), Kelejian and Prucha (2007), and Kim and Sun (2011), that rely on the standard normal critical value for the tstatistic by invoking a consistency argument for the estimator of σ 2 .A more promising approach is to explicitly account for the sampling variability in the estimator of σ 2 , and to correspondingly adjust the critical value, as in Bester, Conley, and Hansen (2011) and Sun and Kim (2012)-these are analogs of the fixedb approach of Kiefer and Vogelsang (2005) and the projection approach of Müller (2004) to the spatial setting.However, the appropriate adjustment to the critical value is more complicated in the spatial setting: As demonstrated by Müller and Watson (2022a), the distribution of the locations in space also matters, even in large samples under weak dependence.In particular, if the distribution of the locations is not uniform, then even under weak dependence, the asymptotic null distribution of spatial projection type t-statistics are not Student-t, and fixed-b spatial t-statistics do not have the same asymptotic distribution as under iid sampling, invalidating the usual suggestions for determining the critical value.Müller and Watson (2022a) suggest the spatial correlation principal components (SCPC) method that addresses this challenge and provides explicit robustness under some empirically relevant forms of strong dependence.
In contrast to methods that rely on consistency of the estimator of σ 2 , fixed-b type approaches (in time and space), including SCPC, rely on the stationarity of u l for their large sample validity.What is more, the methods require that the asymptotic properties of weighted averages of x l êl , with êl the OLS residuals, behave like the weighted averages of the demeaned values of u l = x l e l , which essentially requires (x l , e l ) to be stationary and weakly dependent.Yet, in many empirical applications, the data may exhibit non-stationarities and/or strong dependence.A simple but important spatial example is when x l is an indicator for a binary treatment, where treatment is more likely in one region (the "north") than another region (the "south").This article studies the finite-sample properties of spatial correlation inference procedures when x l (and/or e l ) follow empirically relevant processes that may be nonstationary or strongly dependent.Our focus is on the SCPC method proposed in Müller and Watson (2022a) because of its desirable theoretical properties.We use the performance of a version of the Conley (1999) method as a benchmark.(A notable exception to the requirement of stationarity of u l is the cluster-based method suggested by Ibragimov and Müller (2010), which remains asymptotically valid under many forms of variance heterogeneity.However, small sample simulations show that SCPC performs better under stationarity, and SCPC also avoids the issue of how to form the clusters (see Cao et al. 2020).)We set the stage for the analysis in this article by presenting finite-sample results for four data generation processes (DGPs).Each of the models uses a stationary Gaussian process for e l , but differ in the DGP for x l .Specifically, e l is generated by a Gaussian process with covariance function cov(e l , e ) = exp(−c||s l − s ||), where s l and s denote the spatial locations of e l and e , and c > 0 is a parameter that governs the strength of the spatial correlation.This process is the spatial analogue of the time series AR(1) model, with larger values of c implying less spatial correlation.This process will be used in many places in the article, and from now on we use the shorthand e l ∼ G exp (c) to denote a mean zero Gaussian process with this exponential covariance function.We calibrate the value of c to induce a specific average pairwise correlation ρ = [n(n − 1)] −1 l, =l corr(e l , e ), at the sample spatial locations, that is, c = c ρ solves [n(n − 1)] −1 l, =l exp(−c ρ ||s l − s ||) = ρ.Thus, if e l ∼ G exp (c 0.03 ), then the average correlation of e l and e is 0.03 over the sample locations.
In each of the four models, spatial locations are randomly selected from the unit interval, so d = 1 and s l ∼ iid U(0, 1).Thus, this can be viewed as a time series setting with irregular (random) sampling.We consider more interesting and empirically relevant spatial designs in Section 3, but this simple design serves as a useful introduction.The sample size is n = 250.We construct t-statistics using three estimators for σ .The first uses the standard Eicker-Huber-White heteroscedastic robust estimator (HR) that ignores spatial correlation.The second is a Bartlett kernel estimator (Kernel)-this is Conley's (1999) spatial generalization of the well-known Newey-West (1987) HAC standard error.The third is the SCPC estimator proposed in Müller and Watson (2022a) which is analogous to projection based estimators used in time series regressions (e.g., Müller 2004;Phillips 2005;Sun 2013) but tailored to the particular spatial distribution of the data being studied.Standard normal critical values are used for HR and Kernel; an "oracle" bandwidth is used in the Kernel method so the test's rejection frequency is as close as possible to its nominal level.The critical value for SCPC depends on ρ and the spatial locations in the sample; details are provided in Müller and Watson (2022a) and reviewed in Section 2.
Table 1 shows null rejection frequencies for tests with 5% nominal level for each of the four models and three methods.In the first model, e l ∼ G exp (c 0.03 ) and x l = 1.Here, and for the other models as well, HR exhibits a large size distortion, which is expected given the spatial correlation in the data.The Kernel method has a rejection frequency of 11%, which is a substantial improvement over HR, but is still far from its 5% nominal value, even though the best possible bandwidth choice was made.The  for inference on β in regression model y l = x l β + e l with n = 250 and s l ∼ iid U(0, 1).
SCPC method is designed to have small sample validity in this setting, so the null rejection probability is exactly 5%.In Model 2, x l and e l are independent and follow G exp (c 0.03 /2) processes.Here u l = x l e l is non-Gaussian but with the same covariance function as the G exp (c 0.03 ) process.Yet the rejection frequencies of both Kernel and SCPC increase by 3%, which is primarily driven by the difference between x l êl and the demeaned version of x l e l .
In Models 3 and 4, e l ∼ G exp (c 0.03 ) and x l is nonstationary.In Model 3, x l is a zero-mean step function with x l = −0.15for the 85% of the locations closest to s = 0 and x l = 0.85 for the remaining locations closest to s = 1.This induces a further size distortion in both Kernel and SCPC.In the final DGP, x l follows a demeaned random walk, which results in rejection frequencies similar to Model 2.
Table 2, which focuses on SCPC, presents a more nuanced view of the rejection frequencies in Models 2-4 by summarizing rejection frequencies conditional on the values of x l and s l in each experiment.That is, in these experiments, (x l , s l ) are sampled as described above, and rejection frequencies are computed over repeated draws of e l .This process is repeated for many random draws of (x l , s l ), and the table reports selected quantiles of the resulting distribution of rejection frequencies.The 95th quantile indicates null rejection frequencies larger than 11% in Models 2-4: a researcher unluckily enough to observe these values of (x l , s l ) and using a nominal 5% SCPC test, would in fact be using a test with conditional null rejection probability larger than 11%.
Taken together these experiments suggest that SCPC offers substantial improvements on methods that ignore spatial correlation (and improvements over Kernel), but may exhibit quantitatively important size distortions for some DGPs.This raises two questions.First, how well does the method perform for the range of DGPs typically encountered in empirical work?And second, are there modifications that enhance its performance in situations where it performs poorly?We take up both questions in this article.
To answer the first, we begin by augmenting the simple regression model to include additional control variables and allow clustering of observations by location.This setup covers panel data models with fixed effects, difference-in-difference designs, and clustered versions of spatial-correlation-robust standard errors.We model empirically relevant spatial designs by considering the spatial density of economic activity over the continental United States and the location of countries over the globe.We generate the sptatially correlated variables using parametric models, but also by sampling variables from the World Development Indicators (WDI) dataset from the World Bank.This dataset contains hundreds of variables for many countries over several years, and thus provides a wide range of spatial patterns in both cross-section and panel-data regressions.
To answer the second question, we propose a modification of the SCPC method so that it also controls size, conditional on the regressors, for a particular conditionally heteroscedastic Gaussian model for e l .The details are described below.While not offering perfect size control in all settings, this modification provides a quantitatively important improvement over SCPC in several of the models we consider.
The outline of the article is as follows.In Section 2 we outline the spatial regression model and inference methods.Notably, the section presents a method for robustifying SCPC inference to control size after conditioning on the values of the regressors and locations.Section 3 presents experiments using the spatial distribution of economic activity in each of the 48 states making up the continental U.S. and hundreds of variables chosen from the World Bank's WDI dataset.The results indicate that the conditional-SCPC method developed in Section 2.3 offers improved size control over SCPC, which in turn improves upon Kernel for a wide range of empirically relevant process for both cross-section and panel-data regressions with clustered standard errors.Several of these results use G exp (c) processes for the error term, and Section 4 studies the robustness of the conclusions to other spatial processes, including conditional heteroscedasticity in the regression error.Section 5 takes up three issues.The first two are computational: evaluating the critical value for the spatial-correlation robust t-statistics and computing the statistics when n is very large.The third involves using SCPC in IV regressions.The final section offers some concluding remarks.Software for conducting conditional-SCPC inference for regression coefficients is available in STATA and Matlab.Links to the software are available at https://www.princeton.edu/~mwatson/.

Setup
This section has three purposes.First, it presents the spatial regression model used in our analysis.Second, it provides details for t-statistic inference using the SCPC and Kernel methods.Finally, it proposes a robustification of the SCPC critical value so that the method controls size, conditional on the regressors and locations, for a particular heteroscedastic model that we describe.

Spatial Regression Model
The spatial regression model is (2) where l = 1, . . ., n indexes clusters (or groups, entities, etc.) and i = 1, . . ., m l identifies the m l individual observations in cluster l.The total number of observations is N = n l=1 m l .The model with m l = 1 for all l is the cross-section spatial regression model, and m l > 1 allows for clustering and panel data.Cluster l is associated with location s l ∈ R d .The variable x i,l is a scalar, and β is the parameter of interest.The k×1 vector z i,l allows for additional regressors that may include fixed-effect indicators, and e i,l is the regression error.
The following notation will prove useful.The m l × 1 vector x l = (x 1,l , x 2,l , . . ., x m l ,l ) collects the x-variables for cluster l, and the N × 1 vector X = (x 1 , x 2 , . . ., x n ) collects the xvariables in the sample.The vectors and matrices z l , e l , Z, e as well as the n × 1 vector of locations s are defined analogously.We denote sample second-moments as S xx = n −1 n l=1 x l x l and S xy = n −1 n l=1 x l y l , where dividing by n instead of N simplifies the formulas below.We assume (without loss of generality) that the regressors x have been residualized with respect to z so that X Z = 0. (If X o is the vector of untransformed versions of x i,l in (2) then X = (I − Z(Z Z) −1 Z )X o .)This simplifies the expressions for the OLS estimator for β and other statistics.Using this notation, the OLS estimator for β is , where u l = x l e l .The OLS residuals are êi,l , and ûl = x l êl .These are collected in the vectors u, ê and û.
Inference about β is based on the t-statistic where σ 2 is an estimator of σ 2 = var n −1/2 n l=1 u l and β 0 is the null value of β.

Two Spatial Correlation Robust Inference Methods
The results summarized in Tables 1 and 2 rely on two inference methods.The Kernel method estimates σ 2 by where K is the Bartlett weighting function, K(x) = 1 − |x| for |x| ≤ 1 and K(x) = 0 otherwise, and b n > 0 is a bandwidth parameter.Empirical researchers typically use standard normal critical values for the resulting t-statistic, relying on max l, Conley (1999).We implement the "oracle" version of this method that combines the standard normal critical value with the value of b n that minimizes the small sample size distortion in each experiment.This choice of b n is infeasible in practice, but yields a useful lower bound on size distortions associated with the method.(As specified in (4), σ 2 K ≥ 0 is not guaranteed when d > 1.In the case of d = 2, Conley (1999) suggests using the product of two univariate Bartlett kernels based on distance in each of the two dimensions.This produces a positive semidefinite estimator of σ 2 , but it is not invariant to rotation of the spatial axes.In our calculations we use the spectral decomposition of the Bartlett weights in (4) with negative eigenvalues set to zero.) The SCPC method of Müller and Watson (2022a) is based on a principal component estimator of σ 2 based on a pre-specified "worst-case" exponential covariance function conditional on the observed locations.Suppose that u l ∼ G exp (c), and let (c) be the corresponding n × n covariance matrix of u evaluated at the sample locations s, so that the l, th element of (c) is Suppose a researcher desires a test that controls size for values of ρ that may be as large as ρ = ρmax , for a given value of ρmax .The results summarized in Tables 1 and 2 are based on ρmax = 0.03, for instance.Let c min satisfy ρ(c min ) = ρmax , where the notation emphasizes that ρ(c) is a decreasing function of c.Note that (c min ) is the worst case covariance matrix for u in the sense that it induces the largest value of σ 2 among all (c) with ρ ≤ ρmax .
The location model studied in Müller and Watson (2022a) is a special case of the spatial regression (2) with m l = 1, x l = 1 and z l = 0 for all l.The corresponding residuals are u l − u.Let 1 denote a n × 1 vector of 1's and M 1 = I n − 1(1 1) −1 1 .The matrix M 1 (c min )M 1 denotes the associated covariance matrix of u − ū1.For inference about the mean, the SCPC estimator of σ 2 uses the weighted averages of u l −u with weights constructed from the eigenvectors of M 1 (c min )M 1 .These are the principal components of u − ū1 under G exp (c min ) evaluated at the observed locations s.In the regression model considered in this article, u l − u cannot be computed from the data without imposing the true values of (β,γ ) so ûl is used in place of u l − u; a straightforward calculation shows that this substitution is justified in large samples when (e l ,u l ) are stationary and weakly dependent.
Let r j denote the eigenvector of M 1 (c min )M 1 corresponding to the jth largest eigenvalue, normalized so that r j r j /n = 1.(See Figure 6 in Section 5.2 for an example of these eigenvectors.)The SCPC estimator of σ 2 , based the first q principal components, is and the resulting t-statistic will be denoted as τ SCPC .The critical value for τ SCPC is denoted by cv SCPC and is chosen so that the t-test controls size for all values of c ≥ c min (equivalently, ρ ≤ ρmax ) in the asymptotically equivalent location model with u ∼ N (0, (c)).Details for computing cv SCPC are provided in Section 5.1.SCPC inference requires two parameters, q, the number of principal components used to construct σSCPC , and ρ max (or c min ), the largest average spatial correlation for which size is controlled.For a given value of ρ max , Müller and Watson (2022a) suggest choosing q to minimize the expected length of the 95% confidence interval under the iid model with = I n .The optimal value of q involves a tradeoff.With the critical value fixed, the expected length of the confidence interval falls as q increases.But, a larger q means that a larger critical value is needed to control coverage.This tradeoff worsens for larger values of ρ max , so the optimal value of q is a decreasing function of ρ max .There are other possible rules for choosing q, but, as a practical matter it is useful to use the same value of q for tests with different significance levels, and for the location model, results in Müller and Watson (2022a) suggest that minimizing the expected 95% confidence interval length under the iid benchmark yields a value of q that works well for a range of values of c.
The choice of ρ max requires more nuanced judgment.One useful guide is to consider the ratio of the standard deviation of β under spatial correlation to its value under iid sampling.This ratio is given by γ n = √ 1 + (n − 1)ρ.The results reported in Table 1 used ρ max = 0.03 with n = 250 and thus allowed for a value of γ n as large as 2.9.To put this value in perspective, for the sample mean from an AR(1) time series model, γ n = 2.9 corresponds to an AR coefficient equal to 0.79 and a local-to-unity coefficient equal to 53.Ultimately, the choice of ρ max is problem specific and requires the researcher to consider how much spatial correlation may be present in their application.
As discussed in Müller and Watson (2022a) the SCPC method has several desirable properties for inference in the location model.We list four.First, size is controlled in the Gaussian model for u l ∼ G exp (c) for any value of c ≥ c min .Second, in large-n samples, size control is not limited to Gaussian G exp (c) settings, but holds more generally in covariance stationary models with weak dependence.(In the context of the spatial regression model, the asymptotic arguments let n, the number of clusters, grow large.There is no restriction on cluster size or on the number of additional regressors, such as fixed effects, beyond the requirement that u l = x l e l is covariance stationary and weakly dependent.)Third, in finite-sample settings, size control extends to Gaussian models with spectra that are weakly less steep than that of G exp (c min ), and SCPC confidence intervals enjoy a near optimality property in terms of their expected length for confidence intervals that control coverage for this class of spectra.
These results suggest that SCPC inference will perform well in an important class of spatial regression models.Of course, the finite-sample results hold only up to the approximation error in ûl ≈ u l − u, but the large-n sample results-including size control for models outside the G exp (c) class-hold under stationarity and weak dependence.That said, the simulations reported in the introduction suggest non-negligible size distortions in finite-sample settings with stationary, but spatially persistent data (Model 2), and more concerning size distortions when x l is nonstationary (Model 3).As it turns out, these size distortions can be eliminated or mitigated by using an alternative critical value chosen to control size after conditioning on the regressors.We outline that modification in the next section.

Conditional SCPC Inference
The required modification is straightforward: instead of computing the critical value for τ SCPC so that size is controlled in the location model under u ∼ N (0, (c)), c ≥ c min , we additionally impose that size is also controlled in the regression model (2) under a set of conditional distributions of u (and û) given V = (X, Z) (and in both cases, also conditional on the locations s).This amounts to specifying the distribution of e given V, which we assume to be Gaussian, e|V ∼ N (0, e|V ).To motivate our parameterization of e|V , note that randomness in β is driven by the random variable n −1/2 l u l , with conditional variance equal to This highlights two distinct issues: first, how to parameterize the covariance of e between clusters (l = ), and second, how to parameterize within-cluster covariation.
To focus on the first issue, consider the cross-section regression (m l = 1 for all l), so that x l and e l are scalars and betweencluster covariation is the only relevant issue.A straightforward modification of the SCPC formulation assumes that e l |V ∼ G exp (c), so that e l is independent of V. Examination of (6) shows that, using this formulation, realizations with x l x < 0 result in a lower value of var n −1/2 l u l |V than realizations with x's of the same magnitude with x l x > 0. This formulation lacks robustness in this respect.A more robust formulation sets e l = sign(x l ) • a l , where We extend this to the panel-data regression model using where x s l = x l /(x l x l ) 1/2 and a l is scalar with a l ∼ G exp (c).This yields var n −1/2 l u l |V = l, (x l x l ) 1/2 (x x ) 1/2 exp(−c||s l − s ||).Note that this expression for the variance reduces to that of the cross-section regression when m l = 1 for all l.
This formulation has several desirable features.First, it is a tractable extension of SCPC, leading to computations that are no harder than those of the baseline SCPC method irrespective of the cluster sizes m l (we discuss computational issues in Section 5.1).Second, as c → ∞ it yields var n −1/2 l u l |V = n l=1 m l i=1 x 2 i,l which coincides with the variance in the model where e ∼ N (0, I N ).Third, while the conditionally singular within-cluster model for e l might look restrictive at first glance, it plays no role in analysis beyond its effect on the variance u l .Furthermore, a straightforward calculation shows that the conditionally singular model in (7) yields the largest variance of u l among all models with tr[var(e l |V)] = 1, and in this sense maximizes the effect of within-cluster covariation on the variance of β.We note that a potential limitation of the formulation in both the cross section and panel regression is the assumption that e l is mean independent of the entire set of regressors V.
We use (7) to modify the critical value for SCPC as follows.First, we compute cv SCPC as in the last section.We then also compute a new critical value, say cv V , chosen so that the t-test controls size for all values of c ≥ c min (equivalently, ρ ≤ ρmax ) in the regression model with a l |V ∼ G exp (c).The conditionallyrobust critical value for SCPC is then the larger of cv SCPC and cv V , cv C-SCPC = max(cv SCPC , cv V ).
Section 5.1 provides details for computing cv V and cv C-SCPC .We stress that, by conditioning on V, the C-SCPC critical value robustifies τ SCPC inference to allow for arbitrary nonstationary behavior in the regressors.
In what follows we will investigate rejection frequencies of τ SCPC using cv SCPC and using cv C-SCPC .We will refer to these methods as SCPC, and C-SCPC, respectively.for inference on β in regression model y l = x l β + e l with n = 250, s l ∼ iid U(0, 1).c min denotes the lower bound on c used in the construction of the C-SCPC critical value.
Table 3 shows the performance of C-SCPC for the four models considered in Tables 1 and 2, which reported results for SCPC.Recall that these models are cross-section regressions, so m l = 1 for all l, and in Model 2, x l ∼ G exp (c 0.03 /2) and e l ∼ G exp (c 0.03 /2) so that u l = x l e l has a exponential covariance function with parameter c 0.03 .Table 3 shows the rejection for Model 2 with cv C-SCPC computed using c 0.03 /2 (which is the DGP of e l ) and using c 0.03 (which is the correlation of u l ).In both cases, and in the other models, C-SCPC offers improved size control, across the range of x l realizations.Note that in the models in Table 3, e l ∼ G exp (c), which is different than the model used to compute cv C-SCPC , namely ( 7).This provides a hint about the robustness properties of C-SCPC over alternative DGPs, something that we investigate more fully in the following sections.
We find the results summarized in Tables 1-3 both enlightening and encouraging.That said, they are based on what are arguably contrived designs and do not feature panel data.The next section examines the performance of the methods in more empirically relevant designs.

Size Control of Spatial-Correlation Robust t-statistics
The various t-statistics discussed above have probability distributions that depend on two distinct features of the population under study: (i) the spatial process that generates (y(s), x(s), z(s)) for arbitrary locations s ∈ R d , and (ii) the probability distribution that governs which locations are sampled, that is the spatial distribution of s.This section reports results from a variety of experiments in which we choose both features to mimic empirically plausible settings.
In particular, we consider two sets of spatial distributions.The first involves drawing locations from each of the 48 continental U.S. states where the spatial density is proportional to economic activity within the state, and where economic activity is proxied by the intensity of light measured from space.These experiments yield 48 distinct spatial densities with different support (the boundaries for the states) and shapes (the concentration of economic activity within the state), and was used previously in Müller and Watson (2022a).The second set of experiments uses data from countries scattered over the globe.We use two methods for generating realizations of (y(s), x(s), z(s)).The first, used in the U.S. states designs, generates data from parametric models like those used in the experiments already discussed.The second, used in the country designs, uses actual data sampled from the World Bank's World Data Indicators (WDI) dataset.

48 U.S. States
The first set of experiments use data generated from designs that capture the spatial distribution of economic activity in the 48 continental United States.Economic activity is proxied by light intensity which is estimated using the fine grid of light measurements reported in Henderson, Squires, Storeygard, and Weil (2018).States differ by their shape and spatial concentration of economic activity, and Figure 1 shows four examples.Looking across 48 states allows us to study the behavior of the test statistics under a wide range of empirically-relevant spatial distributions.
In the 48-states experiments, realizations of (y(s), x(s), z(s)) are generated from parametric models that focus on stylized characteristics of data used in empirical work.We begin by discussing experiments for cross-section regressions (m l = 1 in Equation ( 2)) and then discuss panel regressions.Table 4 summarizes the experiments for cross-section regressions.The first experiment is a benchmark with x l = 1, e l ∼ G exp (c 0.03 ) and no additional control variables.In Experiments 2-6, x l and e l are generated by independent G exp (c) models, where the experiments differ in the values of c and whether additional control variables are included.In Experiments 7-10, x l is generated by a step function with a shift between southern The regression model is y l = x l β + z l γ + e l with e l = a l or e l = sign(x l ) • a l , with s l ∼ iid from a U.S. state-specific density with examples given in Figure 1.All G exp processes are mutually independent.In Designs 7-12, Step(λ) is a step function that is equal to 0 for the southern most λn observations and 1 otherwise.Random walk x l are approximated by G exp (c) for a small c.G exp (c) 3 in the z l column denotes 3 independent realizations of a G exp process.x l is residualized with respect to z l in all models.All models use n = 250.
and northern locations; these regressors are stylized versions of spatially correlated binary treatments.In Experiments 11 and 12, x l is generated by a G exp (c) model with spatial heteroscedasticity, and in 13 and 14, x l follows a spatial random walk.In all experiments reported in this article, the x l regressor is projected off the control regressors to ensure n l=1 x l z l = 0. We present results for e l ∼ G exp (c) and e l = sign(x l ) • a l with a l ∼ G exp (c), Figure 2 shows the 5th and 95th quantiles, and the mean of these conditional null rejection probabilities.Results are shown for Kernel, SCPC, and C-SCPC.The SCPC critical value is computed using c 0.03 in all experiments.(In Experiments 2-6, x and e follow independent G exp (c) processes so that u l = x l e l has the G exp (2c) covariance function.Thus, the c-value used to compute cv is smaller than the c-value used to generate for u in these experiments.A smaller critical value (and larger rejection frequency) results when 2c is used to compute cv.)The critical value of C-SCPC is computed using the value of c that generated e l .Figure 2 does not show results for the usual heteroscedasticity-robust version of the t-statistic that ignores spatial correlation; suffice it to say the resulting test yields null rejection frequencies that greatly exceed its nominal 5% level.
The results for this rich set of spatial designs is similar to what we found earlier in the simple U(0, 1) spatial design.We highlight four results.
First, despite the optimal choice of bandwidth, the Kernel method suffers from substantial size distortions.This is consistent with a large body of research that finds similar size distortions in time series regressions using HAC standard errors together with standard normal critical values.(These size distor-tions were documented in an early contribution by den Haan and Levin (1997).Lazarus et al. (2018) provide a recent set of simulations and references.) Second, SCPC does a reasonably good job controlling size in several designs, but has uncomfortably large size distortions in some cases.For example, the mean null rejection frequency of SCPC is over 0.15 when x l is generated by a step function in Experiments 8 and 10.
The third result is that C-SCPC has much improved size control, both for models for which the critical value was designed (that is when e l = sign(x l )a l with a l |V ∼ G exp (c) as in panel (b)) but also when e l |V ∼ G exp (c) (panel (a)); this is consistent with the results from the s l ∼ iid U(0, 1) designs previously shown in Table 3.One might wonder about the rejection frequencies below 5% evident in panel (b) for some realizations of {(x l , z l , s l )} n l=1 .These arise when the size constraint is binding for a value of c other than c min , reflecting the fact that size is also controlled for processes that are less spatially correlated than the DGP used in the experiment.
Finally, the fourth result is that the rejection frequencies for Kernel and SCPC are somewhat larger in panel (b) than in panel (a).This reflects the increase in spatial correlation that motived the construction of the C-SCPC critical value, that is replacing x l x (which is sometimes negative) with |x l x |.   4 with φ = 0.9 for within-cluster covariation and the values of c shown in the table for between-cluster covariation (unreported results show little sensitivity to the choice of φ).Results in panel (b) use the same regressors but with e l generated by ( 7), that is e l = x s l a l where a l ∼ G exp (c) is a scalar process.The second difference is that cluster-specific fixed effects are included as controls in all models; this eliminates Experiment 1 in which x l = 1.The third difference concerns the step-function regressor designs in Experiments 8-12.In the panel data models these step functions are used to generate the regressors for the final two observations in each cluster (i = 3 and 4); observations for i = 1 and 2 are set to zero.This produces a differences-in-differences design where treatment occurs in the north during the final two time periods and the observations in the south are untreated throughout.
With these differences, the panel data results in Figure 3 look a lot like its nonpanel counterpart in Figure 2. The most obvious difference are the more pronounced size distortions for Kernel and SCPC in Experiments 8 and 10.These arise because the differences-in-differences design results in a much smaller effective sample size because only 15% of the clusters are treated.
One might wonder how the power of the different tests compare.Since the SCPC and C-SCPC methods are based on the same t-statistic τ SCPC , and only differ in their critical values, the size-adjusted power (or, equivalently, average length of confidence intervals) of SCPC and C-SCPC are identical.The critical value of the C-SCPC method not only depends on the observed locations s l , but also on the observed regressors (x l , z l ), and is "conditional" in this sense.The cost of insisting on such conditional size control is small: An unreported comparison of the C-SCPC method in design (7) used to compute cv C-SCPC with an unconditionally (over (s l , x l , z l )) size-adjusted version of SCPC reveals small differences in average confidence interval lengths.Finally, comparing SCPC (or C-SCPC) with the size-adjusted Kernel method reveals the Kernel method to be somewhat more efficient.This is because σ 2 K is relatively less variable than σ 2 SCPC , so after the size-adjustment that adjusts for the larger bias in σ 2 K , the Kernel method is akin to the oracle t-test that uses the true value of σ 2 in the denominator.In practice, of course, this size adjustment is not feasible, so we do not interpret this as a reason to prefer the Kernel method.

World Development Indicators
In this section we report results from experiments using data that are drawn from the World Bank's World Development Indi-cators (WDI) database.Using these series as regressors and/or error terms allows us to investigate test performance in settings with data that closely matches empirical applications.We begin with a short description of the dataset, and then describe the experiments.
The WDI is a panel dataset containing over 1400 socioeconomic variables for 266 countries.The dataset includes typical economic measurements like GDP, balance of payments and national debt, but also measurements of education, health, and infrastructure.For our purposes, the data present three challenges: they are measured in a variety of units, some variables have missing values for some years and countries, and many series contain large outliers.To address these challenges, we use logarithms for several of the variables, discard series that cover fewer than 100 countries, and treat large outliers as missing data.The supplementary materials describes these steps in detail.We construct two datasets from the WDI.The first contains decadal differences for each variable, w 2015,l − w 2005,l , where w t,l denotes the value of variable w in year t for country l.We use these 2015-2005 differences in cross section regressions.The second dataset is a balanced panel containing 10 years of data, from 2006 through 2015; these data are used for the panel data regressions.There are 749 variables in the cross section dataset and 644 variables in the panel dataset.We compute distances between countries via the great-circle formula.

Mixed Empirical and Parametric Models
The first set of experiments use the WDI variables as regressors with error terms generated by the parametric models used in the 48-states experiments.Six experiments are run.The first two use cross-section datasets, using each the dataset's 749 variables, one at a time, as the x l regressor.The first experiment includes only a constant as a control, and the second adds three additional series chosen at random from the WDI.The other four experiments use the panel dataset, where each of the 644 series is used as x l .The panel regressions include country fixed effects, and the four experiments differ in their inclusion of additional series selected at random from the WDI and time fixed effects.As in the 48states experiments, the error terms are generated as e|V ∼ G exp (c 0.03 ) with within-cluster covariance parameterized by the AR(1) model with φ = 0.9 for the panel data experiments, or as e l = x s l a l where a|V ∼ G exp (c 0.03 ).Table 5 provides a summary.
Results are depicted in Figure 4 and suggest two conclusions.First, size distortions are evident for Kernel and SCPC, although they are not as severe as in some of the 48-states designs.Evidently, the regressors in the WDI exhibit less spatial correlation than those generated for some of the 48-states experiments.Indeed, for the WDI experiments, the mean rejection frequency for SCPC is less than 0.10 in all of the experiments and close to its nominal 0.05 value in the panel regressions.That said, null rejection frequencies exceed 0.10 for more than 5% of the regressors chosen from the WDI datasets.The second conclusion is that C-SCPC has null rejection frequencies close to 0.05 in all experiments and regressors.This is consistent with the earlier experiments and provides reassuring evidence about size control C-SCPC.x l is residualized with respect to z l in all models.e l ∼ G exp (c 0.03 ) (with within cluster AR(1) covariances with φ = 0.9 in the panel data models) or e l = x s l a l with a l ∼ G exp (c 0.03 ).

Empirical Models for both Regressor and Dependent Variable
In a final exercise we generate both regressors and dependent variables from the WDI dataset.This requires some care to ensure that (i) we have a well-defined population with known regression coefficient, and (ii) we control the degree of spatial correlation in the exercise.The idea is to generate simulated data by a discrete Markov chain that more likely selects countries that are geographically close to the previously selected country.With an appropriate restriction of the transition matrix, the stationary distribution is uniform across countries, so that the population regression simply becomes the regression using all countries in the WDI dataset.Furthermore, by varying the degree of preference for conditionally closer countries, we can control the induced average pairwise correlation.
The details are as follows.Suppose we select series labeled y, x, and (if relevant) z from the WDI dataset, and suppose that the resulting (x l , y l , z l ) provide data for l = 1, . . ., M countries.Think of these countries as defining a population.The population regression is then the OLS regression using data for these M countries; let u (l) = x (l) e (l) with e (l) the residual from this population regression.We will compute the various HR, Kernel, SCPC, and C-SCPC tests using random samples drawn from this population.The challenge is to devise a sampling scheme that (i) puts uniform weight (1/M) on each of the locations and (ii) induces a pre-specified average spatial correlation.We do this using a Markov chain, which we now describe.
Let be an M × M Markov transition matrix.As is well known, if is doubly stochastic (so that columns and rows sum to 1), then the stationary distribution is uniform.For a given value of c, let ˜ (c) be the M × M transition matrix such that in row l, the transition probability is proportional to exp(−c||s l − s ||), = l, except that the diagonal of ˜ (c) is zero; thus locations closer to s l are assigned a higher probability than more distant locations.Let (c) be the resulting doubly stochastic matrix after appropriate diagonal scaling, that is, (c) = ˜ (c) for a numerically determined diagonal matrix (which is unique up to scale).We generate a sample of size n of country indices L i ∈ {1, 2, . . ., M}, i = 1, . . ., n from the stationary Markov chain with transition matrix (c), with the first index L 1 drawn uniformly.
The average pairwise correlation of u (L i ) and u (L i+k ) depends on the parameter c and is easily calculated.For all i ≥ 1 and k ≥ 0, we have  5 for a description of the experiments and the notes to Figure 2.
where [ (c) k ] l is the l, th element of (c) k .The average pairwise correlation u (L i ) and u (L i+k ) is therefore given by .
Thus, it is straightforward to numerically obtain the value of c that induces a given value of ρ.
With (c) determined in this fashion, we generate 5000 random samples of countries and corresponding sample regressors and dependent variable using n = 50 and ρ = 0.03.Each set of (y, x, z) from the WDI corresponds to a different population, with associated rejection frequencies for the HR, Kernel, SCPC, and C-SCPC tests.(To ensure that the Markov chain can attain ρ = 0.03 without resampling the same locations frequently, we restrict attention to (y, x, z) variables in the WDI that exhibit spatial correlation.Specifically we use data in which the kernel-estimated variance of u l with a bandwidth of 0.3 of the largest distance in the dataset is at least three times larger than the HR-estimated variance.)We construct two experiments using spatial regressions from the decadal difference dataset; the first includes (y, x), a constant and no additional controls, and the second adds three randomly selected additional series as controls.Similarly we construct two experiments using panel regressions from the 2006 to 2015 dataset, where fixed country effects are included the first, and the second adds three additional randomly selected series.Table 6 shows the 5th, 50th, and 95th quantiles of rejection frequencies across 200 populations generated in this fashion.These experiments differ from those reported earlier in two distinct ways.First, as emphasized above, the values of both e and x correspond to real-world data; so, for example, neither is generated by a Gaussian model as in the earlier experiments.Second, earlier experiments used n = 250, while these use n = 50, where the smaller sample size reflects the fact that many of the series in the WDI are available for as few as M = 100 countries.The rejection frequencies for these experiments share some of the features of the earlier experiments, but there are notable differences.For example, the smaller sample size means that HR rejects less frequently; the rejection frequencies fall from around 50% in Table 1 to half that value in Table 6.Kernel and SCPC improve on HR, but (as in some of the previous experiments) exhibit quantitatively important size distortions.C-SCPC does better, with a median rejection frequency of 8%close the nominal 5% value-although for 5% of the DGPs, the rejection frequency exceeds 0.15 for the spatial (non-panel) regressions.

Conditional Size Control in More General Models
In this section we carry out three exercises to investigate the size properties of C-SCPC under alternative processes for the regression errors.The first considers processes with covariance functions with different decay properties than the exponential G exp (c) processes.The second considers regression errors that are conditionally heteroscedastic.The third considers errors that exhibit excess spatial correlation with ρ > ρ max .

Size Control with Alternative Spatial Covariance Functions
We carry out a straightforward but instructive exercise.Specifically, we carry out C-SCPC inference as in the other experiments-that is using ρmax = 0.03 and using the critical value cv chosen so that conditional size is controlled for e l = x s l a l where a l ∼ G exp (c) for c ≥ c ρmax -however, we now use non-G exp stochastic processes to generate the errors e l or a l .In particular, we generate the errors from five different Matérn processes and study the robustness for all values of ρ ≤ ρmax for each of these five processes.
The Matérn class of stochastic processes is indexed by the parameter θ = (ν, c), where ν and c are positive constants.If a follows a Matérn process, its covariance function σ a (r − s) depends on the locations only through = ||r − s||, with σ a ( ) ∝ (c ) ν K ν (c ), where K ν is modified Bessel function of the second kind.When ν ∈ {1/2, 3/2, 5/2, ∞}, the expression for the covariance simplifies: So ν = 1/2 is the exponential model used throughout the article but the other processes have different decay.We consider these four processes together with the process with ν = 1/4.Gneiting (2013) shows that for ν > 1/2, the Matérn covariance function is not positive definite when distances are measured by the greatcircle formula.So in this section, we instead map each country's location on the surface of the earth into a point in R 3 , and then use the Euclidian norm to compute distances.This effectively also changes the covariance matrices for ν = 1/2, providing an additional degree of separation from the baseline specification.
We rerun the 48-states and WDI experiments from Figures 2-4 using the same regressors but with errors generated by the Matérn processes with ν ∈ {1/4, 1/2, 3/2, 5/2, ∞}.For each process we find a range of values of c to trace out values of ρ ∈ (0.001, ρmax ) with ρmax = 0.03, and generate errors for each value of c.We record the largest rejection frequency over each of the processes and c-values.As in the previous experiments, this generates a distribution of rejection frequencies over realizations of {x l , z l , s l } but now the rejection frequencies represent the largest rejection frequency over this Matérn class with ρ ≤ ρmax .Figure 5 summarizes the results, showing for each experiment the 5th, 50th, and 95th quantiles.
The results are reassuring: the 95th percentile of the rejection frequency across all values of {x l , z l , s l } is less that 0.09 across all five Matérn models, for all values of ρ ∈ (0.001, ρmax ) and for all of the experiments shown in the figure.In most experiments the 95th percentile rejection frequency is less than 0.07.We conclude that C-SCPC inference is robust for this class of DGPs.

Size Control with an Alternative Form of Conditional Heteroscedasticity
The construction of the critical value cv V for C-SCPC was based on a particular form of heteroscedasticity ( 7).Here we investigate the effect of an alternative form of heteroscedasticity.To be specific, let 0 denote the covariance matrix for e from the models summarized in panels (a) of Figures 2-4.We now consider heteroscedastic versions of these error processes with e|X ∼ N (0, D 0 D ) where D is a diagonal matrix whose ith element for the lth cluster is (1 + x 2 i,l ) 1/2 , and where the regressors for each experiment are scaled so that N −1 i,l x 2 i,l = 1.Detailed results are presented in the supplementary materials, but the key takeaways can be gleaned from a few summary statistics from the models in Figure 2(a).For the 13 models with spatial variation in x, the average rejection frequency in the baseline ( 7) models for the 5% level C-SCPC tests is 0.04; this increases only slightly to 0.06.The largest rejection frequency is observed in Model 8, with an average rejection frequency of 0.10.These results indicate that after averaging over realizations of x, size distortions associated with conditional heteroscedasticity of the sort studied here are relatively minor.That said, the results in supplementary materials indicate more dispersion in the rejection frequencies after conditioning on x, so size distortions are larger for some realizations of the regressors.

Size Control and Expected Length of Confidence
Intervals as a Function of ρ max SCPC and C-SCPC are designed to control size for processes with average pairwise correlation bounded above by ρ max .But what if a user misjudges the spatial correlation in the process under study and chooses a value of ρ max that is too small?A simple experiment highlights some of the key issues.In particular, we re-ran the experiments summarized in panel (a) of Figures 2-4, computing the C-SCPC test statistics and critical values as we did previously, that is with ρ max = 0.03, but we generated the regression errors using the exponential model with ρ = 0.10, so they exhibited substantially more spatial correlation than the models in Section 3. Again, detailed results are presented in the supplementary materials, but summary statistics from the models in Figure 2(a) tell much of the story.For the 13 models with spatial variation in x, the average rejection frequency shown in Figure 2(a) is 0.04; this increases only slightly to 0.07 when the errors are generated from the model with ρ = 0.10.This somewhat surprising result occurs because the C-SCPC rejection frequency is determined by the spatial correlation in u = x e, which may exhibit substantially less spatial correlation than e.However, in the location model with x = 1, u = e and the rejection frequency increases to 0.15 in the model with ρ = 0.10 and ρ max = 0.03.
A related question involves the effect of the choice of ρ max on the expected length of confidence intervals.Suppose a researcher constructs a C-SCPC confidence interval using a large value of ρ max (say ρ max = 0.10) to guard against undercoverage, when in fact the data are generated by a model where ρ is much smaller than ρ max .What is the cost of this extra robustness in terms of increased average length of the C-SCPC confidence interval?We consider the extreme version of this where the data is generated by the iid model e ∼ N (0, I n ), and consider ρ max = 0.1, 0.03, 0.01, 0.003.We find that the average ratio of the C-SCPC interval length over the oracle ±1.96σ interval length is approximately 1.55, 1.20, 1.10, and 1.05 in most of the designs, respectively.The cost of the default value ρ max = 0.03 is thus about a 20% increase in confidence interval lengths.See the supplementary materials for further details.

Computing cv SCPC and cv V Critical Values
Let (c) denote the n × n covariance matrix using an exponential covariance function with parameter c evaluated at the sample locations s.Let R denote the n × q matrix whose columns are the eigenvectors of M 1 (c min )M 1 corresponding the largest q eigenvectors.These eigenvectors are used to compute σ 2 SCPC (see Equation ( 5)).The rejection region for the τ SCPC test using a critical value cv corresponds to values of u that satisfy |τ SCPC | > cv, or equivalently The critical values cv SCPC and cv V use this expression to compute rejection frequencies under different probability laws for u.
We discuss these in turn.
The cv SCPC critical value uses u l ∼ G exp (c), so that var(u|s) = (c).It uses the regression model with x l = 1 and z l = 0 so that ûl = u l − ū.Because the columns of R are eigenvectors of M 1 (c min )M 1 , R 1 = 0, so R û = R u; this implies that the values of û in (8) can be replaced with u.
Let h = W u with W = [1, R], and let (c) = W (c)W denote its (q + 1) × (q + 1) covariance matrix.The rejection region can then be written as h D(cv)h > 0, where The rejection frequency under normality is P(h D(cv)h > 0) with h ∼ N (0, (c)).This probability can be efficiently computed using a formula in Bakirov and Székely (2005).(See Lemma 1 in Müller and Watson (2022a).)The critical value cv SCPC is chosen to satisfy where α is the desired size of the test.
The cv V critical value uses u l = x l e l with e l = x s l a l where a l |(X, Z, s) ∼ G exp (c) and computes the probability of (8) conditional on (X, Z, s).Some new notation helps explain the details of the calculation.Let denote the N × n matrix and let s denote the analogous matrix using x s l in place of x l .Then e = s a and u The remaining calculations are identical to those of cv SCPC with ( h, ˜ ) replacing (h, ).

Computing SCPC Eigenvectors for Large n
The SCPC estimator for σ 2 given in (5) relies on the eigenvectors, r i , of the n × n matrix M 1 (c min )M 1 .When n is large (say, n is much larger than 3500) computing these eigenvectors is a challenging computational task.However, it is possible to accurately approximate r i by first computing eigenvectors using only ñ < n randomly selected locations and then extending these to encompass all n locations.This is a version of the so-called Nyström method (see, for instance, Rasmussen and Williams 2005 for discussion and references).
The idea of the extension is most easily explained if we initially ignore the demeaning by M 1 .So for now, suppose we seek the eigenvectors v i of the n If we put the original locations in random order, then the first ñ locations {s l } ñ l=1 are a random subset of size ñ < n of all n observed locations.Let K be the corresponding ñ× ñ matrix computed from the first ñ locations, with ith eigenvalue-eigenvector pair ( λi , ṽi ), where ñ−1 ṽ i ṽi = 1.One natural way to extend the ñ × 1 vector ṽi to the n × 1 vector v i is via v i ≈ vi = λ−1 i K0 ṽi , where the n × ñ matrix K0 has (l, ) element k(s l , s ), l = 1, . . ., n, = 1, . . ., ñ.Note that the first ñ elements of vi are identical to ṽi , and for l > ñ, vi,l = λ−1 so this method approximates the value of v l at the additional locations by a weighted average of the kernel k, with weights proportional to the eigenvector computed from the first ñ locations.
We now show that this approach is asymptotically justified.We assume that the (non-stochastic) sequence of locations {s l } n l=1 is such that the empirical distribution function G n converges in distribution to the continuous distribution G, where G has compact support S ⊂ R d .A calculation shows that under this assumption, also sup s∈S |G ñ(s) − G(s)| p → 0 as ñ, n → ∞, where the probability is induced by the randomness in the subset {s l } ñ l=1 .It therefore suffices to argue that convergence in distribution of G n induces convergence of the eigenvectors in a suitable sense.
To this end, let L 2 G denote the Hilbert space of functions S → R with inner product f 1 , f 2 = f 1 (s)f 2 (s)dG(s).Then by Mercer's Theorem, k has the representation k(s, r) = ∞ i=1 λ i ϕ i (s)ϕ i (r), where (λ i , ϕ i ) ∈ R × L 2 G are eigenvalues and eigenfunctions of k, with eigenvalues ordered from largest to smallest, normalized so that ϕ i (s)ϕ j (s)dG(s) = 1[i = j] and ϕ i (•) = λ −1 i k(•, r)ϕ i (r)dG(r) for λ i > 0. Now proceeding as in Rosasco, Belkin, and Vito (2010), Müller and Watson (2022a), and Müller and Watson (2022b) shows that if the eigenvalue λ i k(s l , s ), so that the (l, ) element of M 1 KM 1 is equal to kn (s l , s ).Let ( ωi ,r i ) with ñ−1 i r i ri = 1 be the eigenvalues and eigenvectors of the corresponding matrix M1 K M1 computed from the first ñ locations.Then in analogy to (9), we extend the ñ × 1 vector ri to ri ≈ r i via ri,l = ω−1 where the second equality exploits ñ l=1 ri,l = 0.The above mentioned convergence results are not directly applicable, since the demeaned kernel kn is a function of the observed locations {s l } n l=1 , and hence not fixed.However, one would expect that as n → ∞, kn is well approximated by the population demeaned kernel This shows that the numerical approximation r i,l ≈ ri,l in (10) is formally justified under n, ñ → ∞ asymptotics, and numerical experiments suggest that results become fairly stable with ñ = 1000.In practice, the approximation (10) can be carried out for several random subsets of ñ locations, followed by a (sample) principle component analysis to extract the best approximation to the space spanned by the first q eigenvectors.This further improves the accuracy of the approximation, and reduces the artificial randomness induced by the selection of ñ locations.The resulting algorithm has O(n) running time, which compares favorably to the O(n 2 ) running time of a basic implementation of a kernel variance estimator.This algorithm is implemented in the STATA and Matlab code for C-SCPC mentioned in Section 1.

SCPC and C-SCPC for IV Regression
Consider a version of (2) in which ζ is the coefficient on a scalar endogenous regressor, say p i,l : y i,l = p i,l ζ + z i,l γ + e i,l where p i,l is potentially correlated with e i,l .Suppose x i,l is a scalar instrument for p i,l , z i,l is a vector of exogenous regressors and v i,l = (x i,l , z i,l ) satisfies the assumptions discussed in the previous sections.Minor modifications of the methods discussed above provide SCPC and C-SCPC inference for ζ .The case is more complicated for conducting C-SCPC inference using the τ SCPC IV t-statistic.The complication arises because the IV residuals are given by êIV = Me with M = I N − [P Z](V [P Z]) −1 V , so that M depends on the (potentially endogenous) regressors P.Under the C-SCPC heteroscedastic model, e l = x s l a l , so e = s a and êIV = M s a.Thus, if a l |(P, X, Z) ∼ G exp (c) for c ≥ c min , then C-SCPC has guaranteed size control conditional on (P, X, Z).That said, if E(e i,l |p i,l , x i,l , z i,l ) = 0, that is, if p is endogenous, then a l is not independent of (P, X, Z) invalidating the small-sample conditional size control of C-SCPC.
This discussion has assumed that x i,l is a scalar.When there are multiple instruments, say x i,l , then SCPC inference based on βIV can be conducted using the scalar instrument x i,l = ŵ xi,l , where ŵ is a vector of weights computed, for example, by 2SLS.

Figure 1 .
Figure 1.Spatial densities for four U.S. states NOTE: These are densities of light, as measured from space, estimated from data provided in Henderson, Squires, Storeygard, and Weil (2018).

Figure 2 .
Figure 2. Null rejection frequencies for 5% nominal tests: spatial regressions, 48-states experiments.NOTE: The bars show the 5th through 95th quantiles of the distribution of rejection frequencies conditional on the regressors for nominal 5% tests.Squares denote the mean of the distribution, which is the unconditional rejection frequency.See Table4and the text for a description of the experiments.

Figure 3
summarizes results from a panel version of these designs with m l = 4 observations in each of n = 250 clusters.The panel data models differ from their non-panel counterparts

Figure 3 .
Figure 3. Rejection frequencies for 5% nominal tests: spatial panel regressions, 48-states experiments.NOTE: See notes to Figure 2. The inclusion of fixed effects eliminates Experiment 1 from the spatial panel regressions.

Figure 4 .
Figure 4. Rejection frequencies for 5% nominal tests: spatial regressions, WDI experiments.NOTE: See Table 5 for a description of the experiments and the notes to Figure 2.

Figure 5 .
Figure 5. Largest rejection frequency across a set of Matérn processes.NOTE: See the text for a description of the experiments and the notes to Figure 2.
is unique, then sup s∈S || φi (s) − ϕ i (s)|| p → 0, where φi (•) = λ−1 i ñ =1 k(•, s )v i, .These authors also develop corresponding results for nonunique eigenvalues.These ideas and results extend to the problem of consideration here, where we seek to approximate the eigenvectors r i of a demeaned version of K, namely M 1 KM 1 .Define kn (r, s) = k(r, s) − n −

Figure 6 .
Figure 6.Spatial density and eigenfunctions for Texas.NOTE: Panel (a) shows the spatial density of light in Texas.Panels (b)-(d) show three of the estimated eigenfunctions of the kernel k using c 0.03 .
Weak-instrument robust inference about ζ uses theAnderson and Rubin (1949) regressiony i,l − p i,l ζ 0 = x i,l β + z i,l γ + e 0 i,l (11)where the null hypothesis ζ = ζ 0 corresponds to β = 0 in the regression (11).Spatial correlation robust SCPC/C-SCPC tests of the β = 0 null proceed as described in Sections 2.2 and 2.3.Note that C-SCPC inference for β = 0 in (11) (equivalently ζ = ζ 0 ) produces valid inference in Section 2.3's heteroscedastic Gaussian model conditional on the instrument and exogenous regressors (X, Z).When x i,l is known to be a strong instrument, spatial correlation robust inference can be carried out using the IV t-statistic constructed with the SCPC estimator for σ .The IV estimator isζ IV = S −1xp S xy (recall that x and z are uncorrelated in the sample, so X Z = 0) soζ IV = ζ + S −1 xp n −1 n l=1x l e l .The resulting IV t-statistic has the same form as (3) with ζ IV replacing the OLS estimator β and S xp replacing S xx .SCPC inference then follows as in Section 2.2 with ûIV l = x l êIV l where êIV i,l = y i,l − p i,l ζ IV − z i,l γ IV .

Table 3 .
Distribution of conditional null rejection frequencies of nominal 5% C-SCPC tests.

Table 4 .
Spatial regression designs using U.S. states spatial distributions of locations.

Table 5 .
Spatial regression designs using the WDI dataset.

Table 6 .
Rejection frequency of nominal 5% level tests using WDI-Markov-Chain design.