Local containment policies and countrywide spread of COVID-19 in the United States: an epidemiologic analysis

ABSTRACT We analyse the spatial diffusion of new COVID-19 cases and the countrywide impact of state-specific containment policies during the early months of the COVID-19 pandemic in the United States. We first use spatial econometric techniques to document direct and indirect spillovers of new infections across county and state lines, as well as the impact of individual states’ lockdown policies on infections in neighbouring states. We find consistent statistical evidence that new cases diffuse across county lines, holding county-level factors constant, and that the diffusion across counties was affected by the closure policies of adjacent states. We then develop a spatial version of the epidemiological susceptible–infected–recovered (SIR) model where new infections arise from interactions between infected people in one state and susceptible people in the same or in neighbouring states. We incorporate lockdown policies into our model and calibrate the model to match both the cumulative and the new infections across the 48 contiguous US states and DC. Our results suggest that had the states with the less restrictive social distancing measures tightened them by one level, the cumulative infections in other states would be about 5% smaller. In our spatial SIR model, the spatial containment policies such as border closures have a bigger impact on flattening the infection curve in the short-run than on the cumulative infections in the long-run.


INTRODUCTION
In this paper we assess the spatial diffusion of COVID-19 in the United States and the effect that state-level lockdown policies had on that diffusion during the early months of the pandemic. Our analysis is motivated by the idea that if there were substantial spillovers of new infections between states, then the uncoordinated responses at the state level may have exacerbated the initial outbreak of the disease. But the magnitude of those interregional spillovers is uncertain, as is the extent to which the relatively lax policies of one state contributed to new infections in surrounding states. Indeed, as is now well documented, the virus spread quickly throughout the United States, with notable variations in state-level policy to follow. By 6 March, a majority of US states had at least one confirmed case of the virus, and by 17 March the last state (West Virginia) reported its first case. While the Centers for Disease Control and Prevention (CDC) and other federal entities issued guidance on appropriate measures to mitigate the spread of the virus, the final decisions regarding the timing and the extent of restrictions were made by individual states, and sometimes even counties. The first state-wide 'shelter-in-place' order was issued in California on 19 March, but ultimately only 24 additional states followed suit over the next two weeks. Compliance with social distancing measures also varied greatly across regions (Painter & Qiu, 2020; Simonov et al., 2020).
The goal of this paper is to assess the impact of such a scattered policy response on the countrywide spread of the virus, focusing on the early months of the pandemic.
Our analysis follows a two-pronged approach. First, we estimate spatial econometric models to measure the extent of the spatial diffusion of new cases across regions in the United States. For each model the dependent variable is the seven-day average growth rate of county-level cases, and our primary covariate is a measure of the number of restrictions put in place in the state in which the county resides. The spatial econometric results provide useful descriptive insight into the regional spread of the virus.
Second, we develop a spatial version of the standard epidemiological susceptible-infected-recovered (SIR) model based on Kermack and McKendrick (1927), which has been popularized in the economics literature by Atkeson (2020b). In our model individuals can be infected by people from their own states and from other states. Those inter-state contacts endogenously create a spatial diffusion of the infections, with the speed of such diffusion depending on the model parameters that measure the relative frequency of connections across state lines, potentially altered by social distancing measures. We calibrate the model parameters by minimizing the distance between the data and the model-generated series. We then use the model to simulate the impact of lockdown policies implemented in the states with the most restrictive and most lax policies. Our main results in that section are twofold. First, if the individual states had the ability to restrict travel across their borders, infections would be lower. Specifically, cutting the value of the calibrated inter-state spillover parameter by 25% reduces countrywide infections by almost 40% in the first three months, and by almost 7% in the long run. Second, if the states with the more lenient lockdown policies tightened them by one level, the cumulative cases in the remaining states would be reduced by 2% in the first three months, and by more than 5% over the 21-month period.
Our analysis contributes to a large literature on the economics of COVID-19. First, we expand the empirical literature that focuses on the spatial aspects of the outbreak. A few studies analysed drivers of spatial heterogeneity in the scope or severity of the COVID-19 pandemic: Desmet and Wacziarg (2020) and Gerritse (2020) looked at US counties, Verwimp (2020) at Belgian municipalities and Ginsburgh et al. (2020) at French regions. Very few papers seem to focus on understanding the geographical spread of the infections. Kuchler et al. (2020) analyse the correlation between the growth in new cases and the degree of social connectedness with the COVID hotspots, using aggregated data from Facebook. Cuñat and Zymek (2020) analyse the geographical spread of the virus in the UK by incorporating an individual's location and mobility decisions into the SIR model. The most closely related studies are those that focus specifically on the spatial spillovers of cases and the cross-regional impact of local policies. Our paper is probably the first attempt to estimate the extent of spatial diffusion of COVID-19 in the United States during the early months of the outbreak. Our main objective is to quantify the extent of inter-state spillovers and the impact of one state's containment measures on outcomes in surrounding states, with close attention paid both to containment measures and to possible non-compliance with them.
Second, our results are important for the discussion of policy coordination. It is well known that in the presence of inter-state spillovers, an uncoordinated policy response may lead to suboptimal outcomes. This may be because restrictions are too lenient and the virus spreads to other regions or countries (Beck & Wagner, 2020; Rothert, 2021, 2022). It may also be because the restrictions are too harsh, and the recession engineered in one region or industry spills over to other regions or industries (Crucini & O'Flaherty, 2020;2020). Our paper offers empirical insights into the actual magnitude of interregional epidemiological spillovers in the United States. Our findings suggest that those spillovers are substantial and therefore emphasize the importance of a coordinated policy response.
Third, the variation in state-level restrictions plays a key role in our analysis. This connects our work to a number of papers that focus on the effectiveness of various social distancing measures or on compliance with the official rules (e.g., Weber, 2020; Deb & Tawk, 2020; Jinjarak et al., 2020). Painter and Qiu (2020) and Simonov et al. (2020) show that compliance with social distancing rules in the United States is correlated with party affiliation, and with exposure to certain opinion-forming programmes on Fox News. Briscese et al. (2020) show that compliance can vary over time and that people can become 'tired of' restrictions. In our analysis we use a measure of state-imposed restrictions and we also allow for imperfect compliance with them. Our results indicate that stricter social distancing measures introduced by individual states, and better compliance with them, limit the spread of the disease not only within those states, but also in the neighbouring states. Conversely, the lack of such restrictions makes it harder for the state's neighbours to contain the virus.
Finally, following Atkeson (2020b), several papers have contributed to modelling the spread of the pandemic. The SIR model has become the standard in that literature with different papers suggesting different modifications, depending on the paper's focus. The closest papers to ours are Bisin and Moro (2020) and Acemoglu et al. (2020). The former builds a theoretical framework that formalizes aspects such as local travel and changes in individuals' behaviour, but their focus is on the local diffusion around the hotspot of the outbreak. The latter develops a multi-group version of the SIR model where infection risks differ across population groups (e.g., nursing homes, schools, etc.) and allows for the transmission of infections between population subgroups. Our main contribution is to develop a spatial version of the benchmark SIR model that allows us to quantify the spillover effects of local infections as well as local lockdown policies on the spread of the virus in other parts of the country.

COVID-19 OUTBREAKS AND POLICY RESPONSES ACROSS THE UNITED STATES
We start by documenting some stylized facts about the time and spatial dimensions of the spread of COVID-19 and containment measures in the United States.

Data
We use three data sources in this paper:
• Daily county-level data on confirmed cases and deaths are from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (Dong et al., 2020).
Local containment policies and countrywide spread of COVID-19 in the United States 31 REGIONAL STUDIES

COVID-19 cases across time and space
In this section we conduct some basic visual analysis to demonstrate the extent of the COVID-19 epidemic in the United States. We do so to provide a descriptive look at the dynamics of the inter-state and inter-county spillover of the virus. Figure 1 displays the spread of COVID-19 across the whole country over time. The solid line plots the proportion of counties with confirmed cases over time, demonstrating the breadth of the epidemic. The dotted line reveals its depth, displaying how the share of the population living in a county with at least one confirmed case quickly rose from close to zero to close to one during the month of March. It is clear that from March to June COVID-19 transitioned from a fairly sparse outbreak to a widespread epidemic among counties along both the extensive and intensive margins.
New York was the first state in the United States to experience a very significant outbreak, with its epicentre in New York City (NYC). A simple inspection of the dynamics of case numbers in and around NYC is suggestive of the presence of inter-state spillovers (within the Connecticut-New Jersey-New York-Pennsylvania areas). Figure 2 shows a rapid expansion of per capita case numbers in New York during the second half of March, followed by all other states surrounding the NYC metropolitan area in late March and the first half of April. In the case of New Jersey, per capita case counts caught up to, and began to outpace, those of New York in mid-April.
Closer inspection at the county level in other metropolitan areas reveals similar patterns. Figure 3 plots confirmed county-level cases per capita over time for four particular metropolitan areas: Chicago, IL (top left); Houston, TX (top right); Miami, FL (bottom left); New Orleans, LA (bottom right). In each case, the county containing the urban centre appears to trigger the area outbreak (respectively: Cook County, IL; Harris County, TX; Miami-Dade County, FL; and Orleans Parish, LA). In the case of Chicago, which lies on the Illinois-Indiana border, the spillover appears to extend to Lake County, IN. Together, these figures provide preliminary evidence of transmission dynamics in which COVID-19 cases emanate from major urban centres into the surrounding areas, and across state lines.

Containment measures across time and space
There is also clear evidence that during the initial shutdown of March and April, government-led containment measures varied geographically and over time. Figure 4 displays the transition from no official containment measures (early March) to universal adoption (of at least some measures) by all states by early April. The most action occurred between mid-March and early April. A total of 60% of the US population lived in a state with no containment measures on 15 March; however, by 1 April nearly 60% of the population lived in a state that had adopted all five factors described by our r-score variable. In the sections below, our spatial-econometric analysis and our spatial model and calibration exercise seek to understand further the geographical spillovers across county and state lines. It is important that there was sufficient geographical variation in the early weeks of the pandemic. Figure 5 shows r-scores in each state from four snapshots spanning mid-March through early April. While it is clear that there are regional trends, most importantly we see that many states have neighbours with different r-scores and those relative differences change over time.
To drill down on the variation in containment measures even further, note that for spillover effects to be meaningful and identifiable, there must be substantial variation in government-led COVID-19 responses, specifically near state borders. Figure 6 displays the share of the US population living in a county with a different level of containment measures (i.e., r-score value) than at least one of its five nearest counties, over time. During the containment action period of mid-March to early April, this share quickly rose to about 23% of the population, fluctuated between 15% and 23% for the next two weeks, and then settled at 15% for the remainder of April.

ESTIMATING THE SPATIAL DIFFUSION OF CASES
In this section we estimate the spatial characteristics of cases. To do so we estimate the spillovers of cases from county to county in the United States using oft-used spatial econometric models. In this section, the variable new cases is measured as the seven-day average of the growth rate of new cases.
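A minimal sketch of how such a dependent variable could be constructed for one county is given below. It assumes, purely for illustration, that the growth rate is the day-over-day relative change in cumulative confirmed cases and that the average is a trailing seven-day mean; the function names and the sample series are hypothetical and are not taken from the paper.

```python
def daily_growth(cumulative):
    """Day-over-day growth rate of cumulative cases; None where it is undefined."""
    rates = [None]  # no previous day for the first observation
    for prev, curr in zip(cumulative, cumulative[1:]):
        rates.append((curr - prev) / prev if prev > 0 else None)
    return rates


def trailing_mean(values, window=7):
    """Trailing mean over `window` days; None until a full window of defined values exists."""
    out = []
    for t in range(len(values)):
        chunk = values[max(0, t - window + 1): t + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out


# Hypothetical cumulative case counts for a single county.
cases = [10, 12, 15, 18, 24, 30, 36, 45, 54]
growth = daily_growth(cases)
smoothed = trailing_mean(growth, window=7)
```

The first seven entries of `smoothed` are undefined because the first growth rate needs a previous day and the average needs seven defined rates.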

Estimation of spatial correlation
We estimate the following standard spatial models: the spatial Durbin model (SDM), the SDM with spatially correlated errors (SDM Error), the spatial autoregressive model (SAR), and the spatial lag of X (SLX) model. We start with the SDM which, in a panel setting, can be written as:

$$Y_t = \rho W Y_t + X_t \beta + W X_t \theta + \delta + \varepsilon_t$$

where $Y_t = (Y_{1,t}, \ldots, Y_{N,t})$ is an NT × 1 vector of the dependent variable; $W Y_t$ is the spatial lag term; $X_t$ is an NT × r matrix of r exogenous variables; $\delta = (\delta_1, \ldots, \delta_N)$ is the vector of region-fixed effects; and $W X_t$ is the spatial lag term for the exogenous variables, which here we refer to as the exogenous spatial interaction term (to distinguish this from the spatial lag variable). The spatial term $W Y_t$ captures the direct effect of the spatial correlation in determining the dependent variable (capturing what otherwise might be an omitted variable). The SDM error model amends the typical SDM to include a spatially correlated error. The SAR model imposes the restriction that $\theta = 0$ (no exogenous spatial interaction term), and the SLX model imposes the restriction that $\rho = 0$ (no spatial lag). In all cases, we employ a row-normalized contiguity matrix, since this version of the spatial-weighting matrix is the most common in the spatial literature. We considered other versions for robustness (such as an inverse-distance-based matrix), but do not report those results here for brevity.
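To make the role of the row-normalized contiguity matrix concrete, the following sketch builds one for four hypothetical regions arranged in a line and computes the spatial lag of an outcome, that is, the average outcome among each region's neighbours. The adjacency pattern and the outcome values are illustrative assumptions only.

```python
def row_normalize(adjacency):
    """Divide each row by its sum so that a region's neighbour weights add up to 1."""
    weights = []
    for row in adjacency:
        total = sum(row)
        weights.append([a / total if total else 0.0 for a in row])
    return weights


def spatial_lag(weights, y):
    """Spatial lag W*y: for each region, the weighted average of y over its neighbours."""
    return [sum(w * v for w, v in zip(row, y)) for row in weights]


# Hypothetical contiguity: region 0 borders 1, 1 borders 0 and 2, and so on.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_normalize(adjacency)
y = [0.10, 0.20, 0.40, 0.80]  # illustrative growth rates of new cases
Wy = spatial_lag(W, y)
```

With row normalization, a region with many neighbours does not mechanically receive a larger spatial lag than a region with few neighbours; each receives a neighbourhood average.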
For the models above, $Y_t$ is the county-level cases, and $X_t$ is the r-score. We include a county-level fixed effect to control for county-level features such as population density, relative industry, employment shares, and so on (variables which, given the data sources and time period covered, are fixed over the time period). We report the baseline spatial results for each model, along with the direct and indirect spatial effects. We also consider additional specifications, including with the r-score lagged 14 days, and with the lag of the dependent variable included as a regressor.
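The sketch below illustrates one common way to summarize direct and indirect effects in a model with a spatial lag, namely the average-effect measures usually attributed to LeSage and Pace. Whether the reported effects were computed exactly this way is an assumption, as are the names `rho`, `beta`, `theta` and the numerical values, which are hypothetical and not estimates from the tables. The inverse of (I - rho*W) is approximated by a truncated power series to keep the example free of external libraries.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_inverse(W, rho, terms=200):
    """Approximate (I - rho*W)^-1 by I + rho*W + (rho*W)^2 + ... (needs |rho| < 1)."""
    n = len(W)
    rho_w = [[rho * w for w in row] for row in W]
    total, power = identity(n), identity(n)
    for _ in range(terms):
        power = matmul(power, rho_w)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


def sdm_effects(W, rho, beta, theta):
    """Average direct, indirect and total effects of one regressor in an SDM."""
    n = len(W)
    inner = [[beta * (i == j) + theta * W[i][j] for j in range(n)] for i in range(n)]
    S = matmul(neumann_inverse(W, rho), inner)
    direct = sum(S[i][i] for i in range(n)) / n   # mean own-region effect
    total = sum(sum(row) for row in S) / n        # mean effect summed over all regions
    return direct, total - direct, total


# Hypothetical row-normalized weights for three regions in a line.
W3 = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
direct, indirect, total = sdm_effects(W3, rho=0.4, beta=-0.02, theta=-0.03)
```

For a row-normalized matrix the total effect equals (beta + theta) / (1 - rho), which gives a quick sanity check on any implementation.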
Tables 1 and 2 report the results for the four models. Table 2 displays results with a lag of the dependent variable included in each model; we refer to this as the dynamic version. We report the baseline results shown in Tables 1 and 2 for purposes of comparison with the traditional approaches to spatial estimation that these models represent. Across the model versions, the most robust result is that the coefficient on the spatial lag (new cases interacted with the spatial weighting matrix) is statistically significant. The effect of the r-score depends on the model. Only in the SAR model is the coefficient on the r-score negative and statistically significant. The interaction of the r-score with the spatial weighting matrix is statistically significant and negative in the SLX, SDM and SDM error models.
In addition to the models shown in the tables referenced above, we included time-fixed effects in each model (in addition to the county-level effects). Chung and Hewing (2015) note that omitted common shocks may bias estimates of spatial correlation. With time-fixed effects included in each model, the spatial lag is still statistically significant, suggesting the spatial correlation of new cases is robust to including both county and time-fixed effects. However, neither the results for the r-score, nor the results for the r-score interacted with the spatial weighting matrix, are robust to the inclusion of time-fixed effects.
Overall, the spatial econometric models provide a general picture of the spatial correlation of the virus. The clearest evidence of the spatial spillover is via the statistical significance of the coefficient on the spatial lag in each model. To further understand the spatial characteristics of the growth rate of new cases across counties, we consider an SIR model.

SPATIAL SIR AND COUNTERFACTUAL EXPERIMENTS
The previous sections provide evidence of a substantial degree of inter-state spillovers of COVID-19 across regions. In this section, motivated by that evidence, we construct and calibrate a structural SIR model, similar to the one described by Atkeson (2020b), Eichenbaum et al. (2020), Glover et al. (2020) and Fernández-Villaverde and Jones (2020), but with a few modifications. Most importantly, the model allows for infections across state boundaries. We then use the model to evaluate the extent to which the presence of such spillovers contributed to the spread of the infections in the United States. We also evaluate how lockdown policies implemented in one state impact the rest of the country.
Note: Spatial models estimated from 1 April to 28 June, controlling for county-level fixed effects. Dynamic refers to the lag of the dependent variable included in the model. The 14-day lag of r-score is the value from 14 days prior. The Wald test for spatial correlation is statistically significant in all models. Standard errors are given in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.
To the extent possible, we try to account for important real-life features, such as the impact of changes in social distancing measures and the state-specific effectiveness of those. Additionally, as pointed out by Fernández-Villaverde and Jones (2020), the identification of parameters in compartmental models such as the SIR model can be challenging (this is mostly discussed by Atkeson, 2020a, in the context of estimating the fatality rate, which is not the central point of our analysis). We partially address this last issue by gradually increasing the complexity of the model and exploring how that increased complexity affects different parts of the model fit. We believe our analysis can still provide useful insights into both the nature and the potential magnitude of inter-state spillovers during the early stages of the outbreak.
While the previous section estimated a spatial model of a daily panel of 3000 or more counties, in this section we will consider a state-level version of the SIR model. The reason is that with 3000 or more cross-sectional units, the calibration of the highly non-linear SIR model would become computationally infeasible.

The model
The model is an extension of the SIR model that allows us to account for the spatial diffusion of infections. We specify the model in discrete, rather than continuous, time. In each period t, the initial population of region n is divided into four disjoint sets: Susceptible (S), Infected (I), Recovered (R), and Dead (D):

$$Pop_{n,0} = S_{n,t} + I_{n,t} + R_{n,t} + D_{n,t}$$

and population at time t is: $Pop_{n,t} = S_{n,t} + I_{n,t} + R_{n,t}$. The new infections in state n result from interactions between susceptible people $S_n$ in that state and infected people in potentially all other states $I_{n'}$, where $n' = 1, \ldots, N$. The new infections in state n at time t are given by:

In the above expression, the whole term describes the average number of close contacts that a person from state n has with a person from state n' in day t. A close contact is defined as one that would result in a transmission of the virus from an infected person to a healthy person. The new infections in state n then occur when an infected person from state n' (counted in $I_{n',t}$) comes into close contact with a susceptible person from state n. The probability that a person we come into close contact with is susceptible is $S_{n,t} / Pop_{n,t}$.
The parameter $b_n$ measures the average number of distinct interpersonal contacts that any person in state n has on a regular day. We allow this parameter to vary across states, given the substantial heterogeneity in the fraction of people living in densely populated areas. We expect, of course, that a typical person in New York will have more distinct interpersonal contacts than a person living in Montana. We assume $b_n$ is constant over time. It is certainly possible that the typical number of interpersonal contacts will vary over time in each state, and it is quite likely that this variation will differ by state (e.g., the value of b would likely plummet during the spring break in college towns but skyrocket in the nightclubs or bars in Florida). Given how specific this time variation would be to individual states, we have decided to assume it away, and only allow the model to have a cross-sectional variation in b, which yields 49 parameters to be calibrated. We also consider a simpler specification, with three rather than 49 parameters, where $b_n$ is a polynomial function of the population density in region n:

$$b_n = b_0 + b_1 \, density_n + b_2 \, density_n^2$$

Next, $k_{n,t}$ measures the degree to which the personal interactions are reduced by the implemented lockdown policies. The actual reduction in the personal interactions results from a combination of two factors: the official lockdown policies and their effectiveness in the particular region. That effectiveness (from the perspective of the model) can capture at least two important factors. The first factor is related to individuals' compliance and the region's enforcement of social distancing measures. The second factor is related to the fact that each social distancing measure (as recorded in our data) comes with exceptions. Those exceptions may be different in different states, or the same exception can have a different coverage in different states. In general, we should not expect the same restriction that we code as a particular value of the r-score variable to have an identical impact in each state. While we cannot speak to the reasons behind that heterogeneity, we can incorporate it in a straightforward fashion into our model. In order to do that we model $k_{n,t}$ as follows:

where i is the value of the r-score variable (0-5), $k_i$ is the benchmark effect of restriction i in the region where restrictions are most effective, $j_n$ is the relative effectiveness of restrictions in region n, and $1_{\{i\}}(\cdot)$ is a characteristic function of a singleton set with element i (essentially, $1_{\{i\}}(\text{r-score}_{n,t})$ equals 1 if $\text{r-score}_{n,t} = i$, and 0 otherwise). When $j_n = 1$, the effectiveness of each restriction in state n is as high as it can be; when $j_n = 0$ the restrictions are completely ineffective. We normalize $j_n = 1$ in one of the regions (determined endogenously; see Appendix B1 in the supplemental data online for details) and calibrate the 48 remaining values. We also normalize $k_0 = 1$ (i = 0 corresponds to no restrictions). We also impose a restriction that $k_{i+1} \le k_i$, so a tighter restriction would never lead to more contacts between people. Overall, this adds 48 + 5 = 53 additional parameters to the calibration.
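The sketch below shows one functional form that satisfies the properties stated above: the multiplier equals the benchmark value k_i when j_n = 1 and equals 1 (no reduction in contacts) when j_n = 0. The linear interpolation between those two cases is an assumption made here for illustration and is not claimed to be the paper's expression; the benchmark values in `k_benchmark` are likewise hypothetical.

```python
def contact_multiplier(r_score, k_benchmark, j_n):
    """Share of normal contacts remaining in a state under a given r-score.

    ASSUMPTION: linear interpolation between no effect (1.0) and the
    benchmark effect k_benchmark[r_score], weighted by effectiveness j_n.
    """
    if not 0.0 <= j_n <= 1.0:
        raise ValueError("j_n must lie in [0, 1]")
    if k_benchmark[0] != 1.0:
        raise ValueError("k_0 is normalized to 1 (no restrictions)")
    if any(b > a for a, b in zip(k_benchmark, k_benchmark[1:])):
        raise ValueError("k must be non-increasing in the r-score")
    return 1.0 - j_n * (1.0 - k_benchmark[r_score])


# Hypothetical benchmark effects for r-score values 0..5.
k_benchmark = [1.0, 0.9, 0.8, 0.6, 0.5, 0.4]

fully_effective = contact_multiplier(5, k_benchmark, j_n=1.0)
ineffective = contact_multiplier(5, k_benchmark, j_n=0.0)
partial = contact_multiplier(5, k_benchmark, j_n=0.5)
```

Under this assumed form, two states with the same r-score can experience different reductions in contacts solely because their values of j_n differ.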
Finally, $r(n', n)$ denotes the spillover parameter from state n' to n. We restrict the possible values for $r(n', n)$ so that the matrix is consistent with the row-normalized weighting matrix used in section 3. First, we normalize $r(n, n) = 1$. Next, we set $r(n', n) = 0$ when two states n' and n are not adjacent and we require it to be positive (even if arbitrarily small) when they are. In that case, we set $r(n', n)$ equal to r divided by the number of states adjacent to n', where $r > 0$ will be the parameter to be calibrated. In words, the spillover from state n' to state n is divided by the total number of states that the state n' is adjacent to. We do that in order to ensure that if Virginia and Maryland were one state, the total spillover from DC would be the same as it is when they are two separate states.
The full dynamics of the model are described by the following equations:

where $p_R$ is the daily recovery rate and $p_D$ is the daily death rate. We set $p_R = 0.03267$ and $p_D = 0.00067$, so that the model implies a 2% mortality and a 30-day duration of an average infection.
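The following sketch puts these pieces together for three hypothetical states. The spillover matrix follows the verbal description above, and the recovery and death rates are the values quoted in the text. The infection rule inside `step` (contacts times susceptible share times spillover-weighted infected population) and the standard SIRD bookkeeping are assumptions made for illustration, since the exact equations are not reproduced here; all names and starting values are hypothetical.

```python
P_R = 0.03267  # daily recovery rate (value from the text)
P_D = 0.00067  # daily death rate (value from the text)


def spillover_matrix(adjacent, r):
    """rho[m][n]: spillover from state m to state n, as described in words above."""
    size = len(adjacent)
    rho = [[0.0] * size for _ in range(size)]
    for m in range(size):
        n_adjacent = sum(adjacent[m])
        for n in range(size):
            if m == n:
                rho[m][n] = 1.0                # own-state normalization
            elif adjacent[m][n]:
                rho[m][n] = r / n_adjacent     # split r across m's neighbours
    return rho


def step(S, I, R, D, b, k, rho):
    """Advance all states by one day under the ASSUMED infection rule."""
    size = len(S)
    new = []
    for n in range(size):
        pop = S[n] + I[n] + R[n]
        exposure = sum(rho[m][n] * I[m] for m in range(size))
        new.append(min(S[n], b[n] * k[n] * exposure * S[n] / pop))
    S2 = [S[n] - new[n] for n in range(size)]
    I2 = [I[n] + new[n] - (P_R + P_D) * I[n] for n in range(size)]
    R2 = [R[n] + P_R * I[n] for n in range(size)]
    D2 = [D[n] + P_D * I[n] for n in range(size)]
    return S2, I2, R2, D2


# State 0 borders states 1 and 2; states 1 and 2 do not border each other.
adjacent = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
rho = spillover_matrix(adjacent, r=0.3)

S, I = [1000.0, 900.0, 1000.0], [0.0, 100.0, 0.0]
R, D = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
S2, I2, R2, D2 = step(S, I, R, D, b=[0.3] * 3, k=[1.0] * 3, rho=rho)

mean_duration_days = 1.0 / (P_R + P_D)
mortality_share = P_D / (P_R + P_D)
```

In this example the outbreak seeded in state 1 produces new infections in adjacent state 0 on the first day but none in non-adjacent state 2, and the quoted rates imply an average infection of about 30 days with roughly 2% mortality.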

Calibration, model fit and parameter values
We calibrate the model by minimizing the sum of squared errors between the data and the model-generated series of both the cumulative and the new confirmed cases per capita in each region and in the entire country. Our vector of parameters has 103 elements (49 values of $b_n$, 48 values of $j_n$, five values of $k_i$, and the spillover parameter r):

In our calibration we assume that the confirmed cases per capita in each state lag the infections by 14 days and we start our analysis on 1 February 2020 under the assumption that the cumulative infections on that day corresponded to confirmed cases on 14 February 2020. Our two main outcome variables are then defined as:

The sum of squared errors between the model and the data is then calculated as:

The vector of calibrated parameters $\hat{u}$ is then given as:

The results of the calibration are reported in Table 3, which displays the overall fit of the model as well as the values of selected parameters, except for the individual regions' values of $b_n$ and $j_n$. The latter are reported in Appendix A in the supplemental data online. Appendix B presents a summary of the results from calibrations that use different time windows: starting two and four weeks later (on 14 February and 1 March), or ending two weeks earlier (on 14 June). The calibrated values are not overly sensitive to those modifications.
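The calibration idea can be illustrated on a deliberately tiny problem: a single region, a single unknown contact parameter, and a grid search that minimizes the sum of squared errors over both cumulative and new infections per capita. Everything here is a simplification assumed for illustration (one parameter instead of 103, equal weights on the two series, no 14-day reporting lag, synthetic "data" generated by the same model), so it shows the mechanics rather than the procedure actually used.

```python
P_R, P_D = 0.03267, 0.00067  # daily recovery and death rates from the text


def simulate(b, days=120, pop=1_000_000.0, i0=10.0):
    """Single-region discrete-time model; returns cumulative and new infections per capita."""
    S, I, R = pop - i0, i0, 0.0
    total = i0
    cumulative, new_series = [], []
    for _ in range(days):
        new = b * I * S / (S + I + R)
        # The right-hand side uses the old values of S, I and R throughout.
        S, I, R = S - new, I + new - (P_R + P_D) * I, R + P_R * I
        total += new
        cumulative.append(total / pop)
        new_series.append(new / pop)
    return cumulative, new_series


def sse(b, data_cum, data_new):
    """Sum of squared errors over both outcome series (equal weights assumed)."""
    model_cum, model_new = simulate(b, days=len(data_cum))
    return (sum((m - d) ** 2 for m, d in zip(model_cum, data_cum))
            + sum((m - d) ** 2 for m, d in zip(model_new, data_new)))


# Synthetic data generated with a known parameter value.
data_cum, data_new = simulate(0.25)

# Grid search over candidate values 0.05, 0.06, ..., 0.50.
grid = [g / 100 for g in range(5, 51)]
best_b = min(grid, key=lambda b: sse(b, data_cum, data_new))
```

With many parameters a grid search is not practical and a numerical optimizer would be needed; the objective function, however, has the same structure.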
The progression of the model fit reported in the bottom half of Table 3 helps explain how the parameters are identified. We start our calibration by assuming that (1) $b_n = b_0 + b_1 \, density_n + b_2 \, density_n^2$, (2) $k_i = 1$ for all i, and (3) $j_n = 1$ for all n (r-score changes make no difference in any state). We then proceed by relaxing one restriction at a time.
First, we relax the assumption that $k_i = 1$ for all i (but we still assume that $j_n = 1$ for all n and that $b_n$ is a quadratic function of the population density). There is a dramatic improvement in the model fit along all dimensions, most notably in countrywide levels ($R^2$ doubles) and first differences ($R^2$ almost triples). This is because the initial growth of infections (countrywide and across states) is being slowed down by the imposition of restrictions. A model without some variation in the frequency of close contacts over time cannot capture that change in the dynamics of new infections.
Second, we relax the assumption that $b_n$ is a quadratic function of the population density in state n and instead model it as a state fixed effect. Not surprisingly, the model fit improves dramatically along one important dimension: the variation in levels of infections across states. The model can now account for 96% of that variation (95% of the variation within and 99% between states). There is also a huge improvement in the model's ability to account for the between-states variation in the growth rate of new infections (the $R^2$ more than triples, from 0.29 to 0.93).
Finally, we let $j_n$ differ by state (normalizing its highest value to 1). The main improvement in the model's fit can be seen in the variation of new infections over time, both countrywide and across states. Intuitively, the model picks up the difference between states that had the same change in their r-score in the data, but experienced a different reduction in the growth rate of infections. It is worth pointing out that the calibrated values of k are much smaller when we allow $j_n$ to vary by state than when we assume identical, perfect effectiveness of each restriction ($j_n = 1$ for all n). This is not surprising; the higher values of k in columns two and three of Table 3 reflect the fact that in the average state the effectiveness of that restriction measure is not perfect.
While it may not seem so, our model has relatively few free parameters (even with 49 state-specific values of b and 48 state-specific values of j), because we have over 7000 observations (49 regions over 150 or more days). Despite that, the model does a remarkably good job in replicating the data. Figure 7 plots the total number of confirmed cases observed in the data and generated by the model for the whole country. The time paths of confirmed cases per capita for individual states and for DC are reported in Appendix B in the supplemental data online. The $R^2$ between the cumulative infections for the whole country in the model and in the data is 0.99. For individual states, the model accounts for 98% of the overall variation in cumulative infections, for 97% of the variation within states and 99% of the variation between states. Naturally, the model does a poorer job in accounting for the dynamics of the new infections. For the whole country, it accounts for 73% of the variation in the data. For individual states, it accounts for 40% of the total variation, 30% of the variation within states, but 97% of the variation between states.

Counterfactual simulations
Given the overall good fit of the structural model, we proceed with using the model to perform counterfactual simulations. In all counterfactual simulations, we use our benchmark parametrization with state-specific values of $b_n$ and $j_n$. Naturally, any counterfactual simulation of a calibrated or estimated model that does not explicitly model people's behaviour has to address the Lucas critique (Lucas, 1976). We want to point out that to some extent we already capture the differences across states in behavioural response to restrictions by calibrating a state-specific parameter $j_n$. Ideally, we would have $j_n$ vary by the level of imposed restriction, but we are not able to identify that given our data. Given these considerations, our results should be interpreted as showing the impact of changes in restrictions under the assumption that compliance with them remains the same as it was before (i.e., not necessarily perfect).
Note: Each column corresponds to calibration results from a different model specification. We start with the first column assuming that $b_n = b_0 + b_1 \, density_n + b_2 \, density_n^2$, $k_i = 1$ for each value i of the r-score, and $j_n = 1$ for each n. We then gradually ease each restriction to allow k to vary across r-scores, and $b_n$ and $j_n$ to be modelled as state-specific fixed parameters.

REGIONAL STUDIES
Any counterfactual experiment is to some extent ad hoc. We present three that, given the focus of our paper, we find the most interesting and informative. First, we investigate what would happen if states changed the maximum level of r-score they implemented by one. Figure 8 illustrates how that counterfactual experiment is conducted. On the day when the r-score in each state reaches its maximum observed level in the data, we increase that level by one among states that chose not to impose the most aggressive policies corresponding to r-score = 5 (there are 22 such states). That way, we are able to get a sense of the epidemiological 'damage'¹² caused by more lenient policies in such states, both within that group of states and outside of it. Conversely, among states in which the maximum r-score in the data equals 5, we reduce that level by one. That way, we are able to get a sense of the epidemiological 'benefit' delivered by the most aggressive policies (within that group of states, and outside of it).
Second, we investigate the role of the spillover parameter, by reducing it by 25%. Third, we combine the two: we repeat the first counterfactual under the assumption that the spillover parameter is 25% lower (of course, we use the simulated series with that lower value of r as a benchmark). In all experiments we assume that the social distancing measures in place on 28 June 2020 remain in place forever.
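The construction of the counterfactual r-score paths in the first experiment can be sketched as follows. This is one reading of the description above, not the authors' code: it assumes that only the days on which a state sits at its own observed maximum are changed, and the function name `counterfactual_path` and the argument `top` are hypothetical.

```python
def counterfactual_path(path, top=5):
    """Shift a state's maximum r-score level by one.

    path: list of daily r-scores observed for one state.
    top:  the most restrictive r-score level (5 in the text).
    """
    peak = max(path)
    # States that never reached the top level are tightened by one;
    # states that did reach it are eased by one.
    shift = 1 if peak < top else -1
    # Assumption: only days spent at the observed maximum are modified,
    # so the path is unchanged before the maximum is first reached.
    return [r + shift if r == peak else r for r in path]
```

For instance, a state whose observed path is `[0, 1, 3, 3, 2]` would follow `[0, 1, 4, 4, 2]` in the counterfactual, while a state with `[0, 2, 5, 5, 4]` would follow `[0, 2, 4, 4, 4]`.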
Table 4 presents the results from the counterfactual experiments. We report both percentage and absolute changes relative to the benchmark for both total cases (in thousands) and deaths. We first divide all states into four distinct groups, based on the maximum level that the r-score reached in the data. The results are reported in rows 1-4. We then combine the states where the r-score never reached 5 into a single group that we call 'lax states'. We do that because of the large discrepancy in the number of states across the four groups (there are 27 states where the r-score reached 5 and only two states where it reached a maximum of 2). That way, we split the whole country into two groups of somewhat similar size. The results for that group are reported in row 5. Next, in row 6 we show the effect of reducing the value of the spillover parameter by 25%. Finally, in row 7, we show the effect of increasing the r-score by 1 under the lower spillover parameter. We consider both short-run effects (on 28 June 2020) and long-run effects (31 December 2021). The first two columns report countrywide effects, columns 3 and 4 report own effects (within each group), and columns 5 and 6 report spillover effects (outside each group):

• Raising/lowering the max r-score by 1: Consider first the effects of increasing the maximum level of restrictions by 1 (or lowering it by 1 among states where max(r-score) = 5). They are reported in rows 1-5 in Table 4. The first thing we notice is that the own effect is an order of magnitude larger than the spillover effect. Second, the spillover effect in the long-run is stronger than in the short-run. Third, even though the relative spillover effect is much smaller than the own effect, it can be quite sizeable. For example, the combined spillover effect among states with max(r-score) < 5 (i.e., the impact on states with max(r-score) = 5) is −2.2% in the short-run and −5.5% in the long-run. While the percentages look small, the total number of confirmed cases among states with max(r-score) = 5 on 28 June 2020 was 1.7 million, so a 2.2% change corresponds to 39,000 fewer people infected and 790 fewer deaths. The −5.5% in the long-run corresponds to reducing the cumulative confirmed cases by more than 4 million, and reducing deaths by over 84,000 towards the end of December 2021. Even the spillover effect from the two least restrictive states (South Dakota and North Dakota) is not trivial: raising the max r-score from 2 to 3 means saving more than 1200 lives in other parts of the country.
• Lower spillover: The effect of reducing the value of the spillover parameter is presented in row 6 in each part of Table 4. Unsurprisingly, if spillovers across states' borders are smaller, the total number of infections and deaths is smaller. Interestingly, the percentage difference is larger in the short-run than in the long-run. In other words, if US states had the ability to restrict travel between them (akin to border closures between countries in the Schengen Zone), the main epidemiological benefit would operate through the flattening of the infection curve.¹³
• Changing r-score with lower spillover: Finally, we consider the countrywide and the spillover effects of changing local restrictions when the spillover parameter is smaller. The results are reported in row 7 in each part of Table 4 (for ease of exposition, we only report the results for a subset of states). Our results indicate that when the spillover parameter is smaller, the impact of changing the restrictions is diminished substantially in the short-run, but only marginally in the long-run. In the short-run, the percentage spillover effect of raising restrictions in the lax states changes from −2.2% to −1.1%. In the long-run that change is much smaller: from −5.5% to −5.3% (we do not compare absolute changes between rows 5 and 7 because they correspond to different benchmarks).

Note (Table 4): The first column lists the group of states in which the experiment takes place, followed by the description of the experiment. For example, 'r-score = i; add 1' means that we only take states in which the maximum score in the data is i = 2-4, and we increase that maximum r-score by 1 (Figure 8), keeping the paths in other states at their benchmark (i.e., taking the empirical path). 'Own effect' refers to the effect within the group of states considered, and 'spillover' to the effects in all other states. All changes are relative to the benchmark paths of infections and deaths. Short-run: all series cut on 28 June 2020; long-run: all series simulated up to 31 December 2021.
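The contrast between the short-run and long-run effects of a lower spillover parameter can be illustrated with a toy two-region SIR simulation. This is a minimal sketch, not the calibrated model of the paper: the functional form of the force of infection, the parameter values, and the names `rho` (spillover), `b` (transmission), `g` (recovery) and `seed` are all assumptions chosen for illustration, and the exponential update of the susceptible share is used only for numerical convenience.

```python
import math

def simulate(days, rho, b=0.3, g=0.1, seed=1e-4, regions=2):
    """Return the cumulative infected share, averaged over regions, after `days`."""
    s = [1.0 - seed] + [1.0] * (regions - 1)   # susceptible shares
    i = [seed] + [0.0] * (regions - 1)         # infected shares (outbreak starts in region 0)
    for _ in range(days):
        total_i = sum(i)
        new_s, new_i = [], []
        for n in range(regions):
            # Force of infection: own infected plus a rho-weighted sum of the others'.
            force = i[n] + rho * (total_i - i[n])
            remaining = s[n] * math.exp(-b * force)
            # Infected stock: a share g recovers, newly infected are added.
            new_i.append(i[n] * (1.0 - g) + (s[n] - remaining))
            new_s.append(remaining)
        s, i = new_s, new_i
    return sum(1.0 - x for x in s) / regions

def pct_gap(days, rho, cut=0.25):
    """Percentage change in cumulative infections when rho is reduced by `cut`."""
    base = simulate(days, rho)
    low = simulate(days, rho * (1.0 - cut))
    return 100.0 * (low - base) / base
```

Comparing `pct_gap(30, 0.1)` with `pct_gap(1000, 0.1)` shows the mechanism in this toy setting: a lower spillover mainly delays the outbreak in the region that is not seeded, so the percentage reduction is larger early on than once the epidemic has run its course.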

CONCLUSIONS
In this paper we estimate the magnitude of interregion diffusion of COVID-19 infections in the early months of the pandemic in the United States. We find evidence that new cases diffuse across county lines, and that the spatial diffusion across counties is affected by the closure policies of adjacent states. Using a spatial version of the SIR model we find that tightening restrictions in states with the less restrictive policies could have reduced the infections in other states by more than 2% in the first three months, corresponding to a reduction in the number of confirmed cases by 40,000. Also, estimates from traditional spatial models show that the spatial correlation between counties is significant, with some evidence that the r-scores of adjacent counties have an effect on the growth rate of a county's new cases.
The presence of inter-state spillovers significantly affected the rate of increase in the number of confirmed cases in the early stages of the outbreak. A unique feature of the United States is that its federal government can neither compel individual states to close their borders nor mandate state-specific lockdown policies. This only emphasizes the importance of other tools that promote coordination between states' authorities and regular citizens. First, uniform and consistent messaging on precautionary measures such as masks, encouraging compliance with social distancing restrictions and discouraging unnecessary inter-state travel are examples of such tools that would impact individual behaviour. Second, the evidence provided in the literature thus far (Piguillem & Shi, 2020;2020) suggests that there are potentially huge benefits from implementing a countrywide testing system as early in the outbreak as possible (aimed at reducing the delay between test and result), thus revealing virus hotspots much sooner to potential travellers. Finally, the literature on fiscal federalism may offer some insights into the role the federal government can play when the jurisdictional boundaries do not overlap with the boundaries of regions affected by local policies.¹⁴

This discussion highlights how our findings reveal the importance of carefully coordinated policy responses during the early stages of an outbreak, before a virus becomes endemic, while there is still a chance to substantially slow the spread of, or even eradicate, the virus altogether. Because any policy implemented or not implemented in response to a viral outbreak by its very nature creates external effects on surrounding regions, we believe this is a very important area for further research to inform policy makers in combating future outbreaks.
until later in 2020. On the other hand, the most prominent containment measures, listed here, were imposed at the state level during the initial weeks of the pandemic.
6. Via Luis Sevillano on GitHub, but originally published in The New York Times and available here.
7. Here the outbreak appears to stem from both Harris County (Houston) and Galveston County, which is also a fairly densely populated area.
8. See Halleck Vega and Elhorst (2015) for a detailed discussion of the SLX model and the other spatial models employed in this paper.
9. There are various versions of spatial models with temporal dynamics. Elhorst (2012) refers to an SAR model augmented with a temporal lag of the dependent variable and a temporal lag of the spatial lag as the time-space dynamic model. Pace et al. (1998) provide an example with their smooth transition autoregressive (STAR) model. Brady (2014) provides a brief overview of some of these models. See also Debarsy et al. (2012) for a discussion.
10. A table of results with time-fixed effects is available in Appendix B in the supplemental data online.
11. Allowing for r(n′, n) to have a distinct value for each pair of states would yield 49 × 24 = 1176 parameters to be calibrated if we assume symmetric spillovers, and double that if we do not.
12. We cannot say anything about suboptimality of the lenient policies since we do not have a properly specified social welfare function that would take into account the economic costs.
13. Eckardt et al. (2020) show that border closures between European regions significantly slowed down the spread of the virus.
14. See Oates (1999) for a literature review of that topic. Rothert (2022) discusses a few examples of federal fiscal tools that could impact local policies.

Figure 1. Affected counties and population over time.

Figure 2. Cases in New York and surrounding states.

Figure 3. County-level cases in major metropolitan areas.

Figure 5. R-scores by state and time.

Figure 6. Share of the US population living in a county with different containment measures than its five nearest counties.
r-score 14-day lag) models estimated from 1 April to 28 June, controlling for county-level fixed effects. Dynamic refers to the lag of the dependent variable included in the model. The 14-day lag of r-score is the value from 14 days prior. The Wald test for spatial correlation is statistically significant in all models. Standard errors are in parentheses. *p < 0.1, **p < 0.05, ***p < 0.01.

Figure 8. Counterfactual paths of r-scores. Note: Each panel plots the actual (data) and counterfactual paths of r-scores in the experiment considered: modifying the maximum value of the r-score by 1. The states are (from top-left, clockwise): South Dakota, Arizona, California and Florida.

Table 1. Baseline results for spatial models of new cases (measured as a growth rate).

Table 2. Dynamic version of spatial models.

Table 3. Model fit and parameter values. Model specification by column: (1) b_n = f(density), k_n = j_n = 1; (2) b_n = f(density), j_n = 1; (3) b_n as fixed effects, j_n = 1; (4) j_n and b_n as fixed effects.

Table 4. Counterfactual simulations: changes relative to the benchmark.