Estimating a spatial autoregressive model with autoregressive disturbances based on the indirect inference principle

ABSTRACT This paper proposes a new estimation procedure for the first-order spatial autoregressive (SAR) model, where the disturbance term also follows a first-order autoregression and its innovations may be heteroscedastic. The estimation procedure is based on the principle of indirect inference that matches the ordinary least squares estimator of the two SAR coefficients (one in the outcome equation and the other in the disturbance equation) with its approximate analytical expectation. The resulting estimator is shown to be consistent, asymptotically normal and robust to unknown heteroscedasticity. Monte Carlo experiments are provided to show its finite-sample performance in comparison with existing estimators that are based on the generalized method of moments. The new estimation procedure is applied to empirical studies on teenage pregnancy rates and Airbnb accommodation prices.


INTRODUCTION
Spatial autoregressive (SAR) models have been widely used in many disciplines of social sciences by extending the notion of autocorrelation from the traditional time domain to space. Spatial correlation may arise from different sources such as strategic interaction, spillover, copycatting and general equilibrium effects, to name just a few. In this framework, space can be defined not only in the geographical sense but also from economic and social perspectives. A classical treatment of this subject is Cliff and Ord (1981) and a more recent study is LeSage and Pace (2009). This paper considers the first-order SAR model with first-order autoregressive disturbances (SARAR(1,1) for short), which extends the popular first-order SAR (SAR(1)) model by allowing for a more general structure of spatial correlation that may originate from both the observable and unobservable. Under the assumption of homoscedastic error innovations, Kelejian and Prucha (1998) proposed a generalized spatial two-stage least squares (GS2SLS) procedure to estimate SARAR(1,1). Lee (2003) proposed the best GS2SLS by replacing the instrumental variables (IV) matrix of the GS2SLS estimator in Kelejian and Prucha (1998) with the asymptotically optimal one. Lee and Liu (2010) discussed the generalized method of moments (GMM) and proposed the best GMM estimator. Burridge (2012) discussed how to solve for the quasimaximum likelihood (QML) estimator for the SARAR(1,1) model by a numerical search algorithm and recently Liu and Lee (2019) derived the asymptotic properties of the QML estimator in SARAR(1,1). Kelejian and Prucha (2010) extended their GS2SLS to allow for heteroscedasticity in error innovations and Jin and Lee (2019) compared the generalized empirical likelihood (GEL) and GMM estimators in this general framework. Taşpınar et al. (2019) considered various ways, robust to heteroscedasticity, to improve the finite-sample properties of the GMM estimator in SARAR(1,1). 1 In comparison with the QML, the IV/GMM approach enjoys not only computational simplicity (in that it does not need to calculate the determinants of matrices involving the spatial weight matrices, which is required for the QML) but also robustness against departure from homoscedasticity.
The existing IV/GMM literature appears to rely on the so-called IV, possibly together with some linear and quadratic moment conditions (associated with the error term), to estimate SARAR models. Different choices of IV and moment conditions can result in different estimation methods with different numerical optimization procedures. They are also directly related to the complexity of the resulting asymptotic variance of the corresponding estimator. This paper takes a different approach that does not rely on IV or moment conditions. In particular, it estimates model parameters by matching the simple ordinary least squares (OLS) estimator of the two SAR coefficients (one in the outcome equation and the other in the disturbance equation) with its approximate analytical expectation. This approach is largely in line with the indirect inference (II) procedure of Gouriéroux et al. (1993) and Smith (1993). However, the original II is simulation based in the sense that the relevant expectation is approximated by the average of simulated estimates and one needs to make distributional assumptions on the pseudo error term in simulations. Kyriacou et al. (2017) studied the SAR(1) model by working out the approximate expectation of the OLS estimator of the SAR coefficient and then matching with the inconsistent OLS estimator to 'solve' for the SAR parameter. Nevertheless, their model does not include exogenous regressors and the disturbance term is serially uncorrelated and homoscedastic. Recently, Kyriacou et al. (2019) and Bao et al. (2020) have extended the SAR(1) to include exogenous regressors with possibly heteroscedastic errors. 2 This paper considers a more general framework where disturbances are spatially correlated and innovations of the error process are heteroscedastic. Just as an ARMA process relative to an AR process in time series, a SARAR model, compared with a SAR specification, is able to describe a richer spectrum of interactions and heterogeneity among cross-sectional units. However, the presence of spatial correlation in the error term introduces non-trivial technical difficulty. First of all, one cannot simply ignore the correlation in the error process to estimate the outcome equation by following the approach of Kyriacou et al. (2019) or Bao et al. (2020) that is robust to error heteroscedasticity. The binding function (pertaining to the SAR parameter in the outcome equation) involves the SAR parameter in the error process, so one cannot solve the binding function to estimate the SAR parameter in the outcome equation. Second, the traditional Cochrane-Orcutt procedure that aims for dealing with error correlation does not work, since the OLS estimator of the SAR parameter in the error process is not consistent even if one knows the SAR parameter in the outcome equation. The novelty of this paper is to design two binding functions, one for each of the SAR parameters such that both are expressed in terms of the observable data. The first binding function related to the SAR parameter in the outcome equation depends on the SAR parameter in the error process. The second binding function, since it is built from a consistent residual vector, which in turn depends on the SAR parameter in the error process, involves both SAR parameters. Given the observable sample data, the two resulting binding functions constitute a system of two equations in terms of the two unknown SAR parameters. Similar to the IV/GMM estimator, the II estimator proposed in this paper is computationally simpler relative to the QML estimator and is robust to heteroscedasticity. In comparison with the IV/GMM estimator, the II estimator possesses three salient features. First, it is free of the choice of IV or moment conditions. This may be relevant when one is unsure about the choices of IV and moment conditions or when one is daunted by the complexity of the optimal weight matrix, as it involves the error innovation variance matrix and this may produce some undesirable consequences in the numerical optimization when the estimated variance matrix is used in the weight matrix. Second, the II procedure may enjoy some degree of computational advantage. It is based on a two-dimensional numerical search since it solves for the two SAR parameters (λ and r, appearing in the outcome equation and the error process, respectively) using two sample binding functions established from the simple OLS procedure. Once the two spatial parameters are estimated consistently, the coefficient vector b associated with exogenous regressors in the outcome equation can be easily estimated by the usual OLS procedure. The GS2SLS of Prucha (1998, 2010) involves two steps that estimate the SAR parameters separately. In the first step, l and b are estimated by 2SLS based on some IV. In the second step, r is estimated by GMM using some quadratic moment conditions. These moment conditions are designed by some careful choices of the relevant matrices appearing in the quadratic forms in the error innovations. (And such careful choices also deal with heteroscedasticity.) The GMM estimator in Lee and Liu (2010) and Jin and Lee (2019) estimates l, r, and b jointly by using some linear and quadratic moment conditions associated with the error innovations. The numerical search in GMM is over a (k + 2)-dimensional parameter space, where k is the dimension of b. The optimal weight matrix (in formulating the quadratic form of the objective function in GMM and in the second step of GS2SLS) involves the error innovation variance matrix and to make it feasible one typically needs to estimate it based on some initial consistently estimated parameters. Third, the II procedure estimates jointly l and r first and then b is estimated by the usual OLS plug-in procedure. Therefore, essentially, it is also a two-step procedure. Recall that the GS2SLS of Prucha (1998, 2010) estimates l and b first by 2SLS and then r by GMM. Lee (2007) and Yang (2015) emphasized that the spatial coefficients are the main source of bias in model estimation and the main cause of difficulty in bias correction in SAR models. In fact, Monte Carlo experiments in this paper show that in the first step of GS2SLS, it can happen that both l and (some elements of) b may be estimated with relatively large magnitudes of biases. This happens because the first step of GS2SLS totally ignores the degree of spatial correlation in the error term. The II procedure on the other hand takes care of the two spatial coefficients jointly in one step.
The plan of this paper is as follows. The next section describes the model specification and the main assumptions used in this paper. The third section discusses the estimation procedure. In particular, the asymptotic behaviour of the (inconsistent) OLS estimator is discussed and then the II estimation procedure is described and its asymptotic properties are provided. The fourth section reports results from Monte Carlo experiments. It shows that the II estimator performs better than the GS2SLS estimator of Prucha (1998, 2010) and the GMM estimator of Jin and Lee (2019) in finite samples when a sparse county contiguity matrix is used. It is found that the GS2SLS-and GMM-based inference procedures can give rise to severe size distortions when the degree of spatial correlation in the error process is high. In contrast, the II-based t-test delivers excellent finite-sample size performance. When the spatial weight matrices are relatively dense, however, the three estimators can perform poorly in small samples. The fifth section contains two empirical studies, one on teenage pregnancy rates and the other on Airbnb listing prices. The sixth section concludes. Technical details and additional simulation results are collected in the Appendix in the supplemental data online.
Throughout, tr denotes matrix trace operator, Dg(a n ) denotes a diagonal matrix with the vector a n spanning the main diagonal, and Dg(A n ) is a diagonal matrix that collects the diagonals elements of the square matrix A n . The subscript 0 is used to signify the true parameter value.

MODEL SPECIFICATION
Consider the following SARAR(1,1) model: y n = X n b + lW n y n + u n , u n = rM n u n + v n , where y n is an n × 1 vector of observations on the dependent variable, X n is an n × k matrix of observations on k exogenous deterministic regressors with coefficient vector b, u n is an n × 1 vector of regression disturbances, v n is an n × 1 vector of innovations, l and r are the SAR coefficients, and W n and M n are n × n matrices of spatial weights. For the ease of presentation, let S n (l) = I n − lW n , R n (r) = I n − rM n , G n (l) = W n S −1 n (l), F n (r) = M n R −1 n (r), and H n (r) = I n − R n (r)X n (X ′ n R ′ n (r)R n (r)X n ) −1 X ′ n R ′ n (r). When a matrix is presented without its argument, it means that it is evaluated at the true parameter value. That is, S n = S n (l 0 ), R n = R n (r 0 ), G n = G n (l 0 ), F n = F n (r 0 ), and H n = H n (r 0 ). With such a set of notation, the equilibrium solution of the process is y n = S −1 n X n b 0 + S −1 n R −1 n v n . Throughout, the following assumptions are made.
Assumption 1. Assumption 3. For 1 ≤ i ≤ n, the innovation terms v i,n in v n = (v 1,n , · · · , v n,n ) ′ are mutually independent with E(v i,n ) = 0, E(v 2 i,n ) = s 2 i,n , and E(|v i,n | 4+d ) , 1 for some positive constant d. Assumption 4. (i) l 0 and r 0 are contained in compact parameter spaces L and P, respectively. (ii) For any admissible l [ L and r [ P, the row and column sums of S −1 n (l) and R −1 n (r) are bounded uniformly in absolute value.
(i) The elements of X n are uniformly bounded. (ii) The limit: exists and is non-singular. Assumption 6. Let S n = Dg(s 2 1,n , . . . , s 2 n,n ). Then: J = j 1 j 12 j 12 j 2 exists and is positive definite, where: in which E n = H n R n G n R −1 n − Dg(H n R n G n R −1 n ) and L n = F n − Dg(F n ). Assumption 1 (ii) is a normalization rule often assumed in the literature to exclude 'self-influence'. Assumptions 1 (i), 2 and 4 limit the degree of spatial dependency and are originated by Kelejian and Prucha (1998). Assumption 3 is the same as in Kelejian and Prucha (2010) and Jin and Lee (2019), which allows for heteroscedasticity in the innovations. If one further assumes that the innovations are i.i.d., then the QML can be used. In Kyriacou et al. (2017), there are no exogenous regressors and the disturbances contain no SAR structure and are i.i.d. Their Monte Carlo experiments showed that their II estimator is comparable to the QML estimator while losing efficiency in some cases. One would expect that the II estimator introduced in this paper may lose efficiency relative to the QML estimator if the innovations are i.i.d. Lee (2002) emphasized that Assumption 5 (ii) is related to an identification condition for estimation in the least squares and IV frameworks and it rules out possible multicollinearities among X n and G n X n b 0 for large n. Assumption 6 is related to the asymptotic variance of the II estimator.

ESTIMATION PROCEDURE
This section provides the main results. The OLS estimator is briefly discussed first and its asymptotic distribution, when properly recentred, is presented. Since the recentring terms involve the unknown model parameters as well as the variance matrix of the error vector, the recentred OLS estimator is not usable in practice. The II estimator solves for the unknown parameters by utilizing two binding functions that do not rely on the unknown variance matrix. It is then shown that the II estimator is consistent and asymptotically normal.

The OLS estimator
If the true value of r is known, the Cochrane-Orcutt-type transformation to (1) yields: The OLS estimator of l 0 for the transformed model (2), depending explicitly on the true value r 0 , is given by:l The probability limit of the ratio y ′ n W ′ n R ′ n H n v n /y ′ n W ′ n R ′ n H n R n W n y n is non-zero so the OLS estimator of l 0 , even if r 0 is given, is not consistent. One cannot follow Kyriacou et al. (2017) to seek a consistent estimator of l 0 by building a binding function that takes the (approximate) expectation of the ratio as (3) depends on the unknown value r 0 , so is the resulting binding function. One cannot solve for l without knowing r.
The strategy in this paper is to build another binding function based on the OLS estimator of r 0 that is constructed from a consistent residual vector, namely: where u n = u n (l 0 , The OLS estimator, as defined in (3) and (4), is not feasible, since it involves the unknown l 0 and r 0 . It is not consistent either. However, one can properly recentrel(r 0 ) andr(l 0 , r 0 ) and the resulting recentred estimator, though still infeasible, achieves consistency. One choice of the recentring term forl(r 0 ) is n v n , one can see that the random parts of y ′ n W ′ n R ′ n H n R n y n are linear and quadratic forms in the random vector v n . Then from Lemma 1 (see the Appendix in the supplemental data online), n √ (l(r 0 ) − l 0 − c l ) converges to a zero-mean normal random variable. Forr(l 0 , r 0 ), the recentring term is not obvious. By , which converges to a zero-mean normal random variable.
The correction terms (c l = tr(S n H n R n G n R −1 n )/y ′ n W ′ n R ′ n H n R n W n y n forl(r 0 ) and c r = tr(S n F n )/y ′ n S ′ n R ′ n H n F ′ n F n H n R n S n y n forr n (l 0 , r 0 )) involve, as usual, the unknown parameters. Moreover, they contain the annoying S n . Since tr(S n H n R n G n R −1 n H n D n H n R n S n y n in the correction term forl(r 0 ) can yield a useful asymptotic distribution result. (And similarly, replace tr(S n F n ) in the correction term forr n (l 0 , r 0 ) with v ′ n K n v n = y ′ n S ′ n R ′ n H n K n H n R n S n y n , where K n = Dg(F n )). It turns out the answer is positive.
Theorem 1. Under Assumptions 1-6, the OLS estimator (l(r 0 ),r(l 0 , r 0 )) ′ , as defined in (3) and (4), has the following asymptotic distribution: Now the recentring terms involve only the sample data and model parameters, but not the nuisance matrix S n . This makes it feasible to design the II estimator that corrects the inconsistency of the original OLS estimator.

The II estimator
The asymptotic distribution result (5) can be used to design an estimator of (l 0 , r 0 ) ′ in the spirit of II by matching (l(r 0 ),r(l 0 , r 0 )) ′ with its (approximate) expectation. Recall l(r 0 ) = y ′ n W ′ n R ′ n (r 0 )H n (r 0 )R n (r 0 )y n /y ′ n W ′ n R ′ n (r 0 )H n (r 0 )R n (r 0 )W n y n and from (5),l(r 0 ) centres around: where the dependency of various matrices on (l 0 , r 0 ) ′ is explicitly expressed. Therefore, a binding function for finding the true parameter value l 0 is: Of course, b 1n (l, r) = 0 alone cannot solve for l since it involves two unknowns. It has to be combined with a second binding function pertaining to r, which follows similarly: The II estimator (l II ,r II ) ′ of (l, r) ′ is thus defined as the root of b n (l, r) = (b 1n (l, r), b 2n (l, r)) ′ . Assumption 7. For (l, r) ′ [ L × P, (i) Pr (lim n 1 b n (l 0 , r 0 ) = 0) = 1 and Pr (lim n 1 b n (l, r) = 0) = 1 for any (l, r) ′ = (l 0 , r 0 ) ′ ; (ii) the Jacobian B n (l, r) of b n (l, r) is non-singular almost surely; and (iii) B n (l 0 , r 0 ) a.s. B, where B is non-singular.
Essentially, Assumption 7 (i) ensures the existence and uniqueness of the root of b n (l, r), at least in large samples. 4 Assumptions 7 (ii) and 7 (iii) are needed to derive the asymptotic distribution of the resulting II estimator.
Theorem 2. For model (1), under Assumptions 1-7, the II estimator (l II ,r II ) ′ of (l, r) ′ , defined as the root of b n (l, r), has the following asymptotic distribution: Given thatl II andr II are consistent,b II is necessarily consistent. Its asymptotic variance, however, is different from the traditional OLS variance formula given the additional uncertainty introduced byl II andr II . The following theorem gives the joint asymptotic distribution of l II ,r II , andb II .
Theorem 3. For model (1), under Assumptions 1-7: where: is assumed to exist and be positive definite, and: In practice, one can estimate the asymptotic variance matrix V by replacing all the unknowns appearing in G, g, and G b with their consistent estimates and the limits with the sample analogues. Further, one may replace S n withŜ n = Dg(v nv ′ n ), wherev n =Ĥ nRnŜn y n witĥ H n = H n (r II ),R n = R n (r II ), andŜ n = S n (l II ). 5 Therefore, the estimated V may be denoted byV n =V n (l II ,r II ,b II , y n , X n ).

SIMULATION RESULTS
In this section, Monte Carlo simulations are conducted to illustrate the finite-sample performance of the proposed II estimator, in comparison with the GMM estimator of Jin and Lee (2019) and the GS2SLS estimator of Kelejian and Prucha (2010). 6 The spatial weight matrix W n is the row-normalized county contiguity matrix used in Lin and Lee (2010) with n = 761 and M n = W n . 7 The exogenous variables include a constant term and two independently distributed random variables, one following a normal distribution with mean 3 and variance 1 and the other  following a uniform distribution on the interval [ − 2, 2]. In the experiment, b 0 is fixed at (0.8, 0.2, 1.5) ′ and l 0 is positive (varying from 0.9 to 0.1) in Table 1, where r 0 takes on a wide range of values (positive, negative, and 0.) 8 (The online supplemental Appendix also reports the results under negative l 0 (varying from −0.9 to −0.1).) These configurations represent different degrees of spatial correlation in the outcome variable and the error term. The innovation term v is simulated as a zero-mean normal random variable with variance following a uniform distribution on the interval [0.5, 4.5]. Table 1 reports the Monte Carlo bias and root mean squared error (RMSE) from 10,000 simulations, as well as empirical rejection probability (P) of the t-test for testing the parameter equal to its true value at 5% for each parameter across the three estimation methods. Four striking observations can be made: (1) The proposed II estimator is almost unbiased in all cases. The GMM estimator is also almost unbiased in all cases (and on some occasions slightly better than the II estimator), but the GS2SLS procedure delivers substantial biases in estimating l, r, and b 1 (the parameter associated with the constant term) under high degree of positive spatial correlation in the disturbance term (r 0 = 0.9), regardless of the value of l 0 . (2) The II estimator achieves the smallest RMSE across the three estimators in the majority of all the cases considered. Under high degree of positive spatial correlation in the disturbance term, the GS2SLS method gives much larger RMSEs (relative to II and GMM) for estimating l and r. This may not be surprising given the substantial biases of the GS2SLS estimator. Also, with r 0 = 0.9, the GMM estimator delivers extremely large RMSEs for estimating b 1 in spite of its small biases. This indicates that in this case there is a huge degree of uncertainty associated with the estimated intercept term from the GMM procedure. (3) Under high degree of positive spatial correlation in the disturbance term, the GMM-and GS2SLS-based t-tests display substantial size distortions for testing l and r and the GS2SLS-based t-test is also severely upward-sized for testing b 1 . In contrast, the II-based t-test delivers very good finitesample size performance in all cases. (4) When r 0 is negatively large, the GMM-based t-test displays non-negligible upward size distortions for testing b 1 and b 2 .
The county contiguity matrix is sparse. One may wonder about the performance of the II estimator under dense spatial weight matrices. 9 Suppose now the elements of the normalized weight matrices are of order O(h −1 n ) such that h n 1 and h n /n 0 as n 1. This corresponds to the scenario when the row and column sums of the (non-normalized) weight matrices might diverge to infinity, as long as the number of cross-sectional units goes to infinity faster. For example, if the inverse distance measure is used in specifying the spatial weight matrix, Elhorst et. al. (2020) showed that this scenario happens when the inverse distance is raised to a positive power. With this modification, one can show that: and Theorems 2 and 3 need to be modified accordingly. While the recentred estimator of l 0 has the typical convergence rate n √ , the recentred estimator of r 0 has a slower convergence rate n/h n √ . It can be shown that the resulting II estimator of r 0 also has the slower convergence rate of n/h n √ and the II estimator of l 0 and b 0 are n √ -consistent. This implies that in finite samples, one may expect poor performance of the II estimator of r 0 under dense spatial weight matrices.  Estimating a spatial autoregressive model with autoregressive disturbances based on the indirect inference principle Table 2 reports results from 10,000 simulations under the circular weight matrices of Kelejian and Prucha (1999), under which each spatial unit has J neighbouring units with J /2 neighbours 'ahead' and J /2 neighbours 'behind'. The exogenous covariates X n (and the corresponding parameter vector b 0 ) and error innovations v n follow the same experimental design as before in Table 1. The SARAR parameters (l 0 , r 0 ) are such that l 0 is fixed at 0.4 and r 0 varies from 0.9 to 0.0. With a sample size of 200, the spatial weight matrices W n and M n (= W n ) display different degrees of density (J = 10, 20, 100, corresponding to 5%, 10%, and 50%, respectively, of the sample size). It can be seen that the GMM procedure produces relatively small biases in estimating l 0 and r 0 , but the corresponding t-test displays substantial upward size distortions. The GMM method has trouble in estimating the intercept term. For given W n and M n , the GS2SLS approach performs worse as the degree of spatial correlation in the error term goes up and for a given u 0 , it performs worse as W n and M n become denser. 10 The GS2SLSbased t-test, similar to that based on GMM, can be severely upward sized in testing the SARAR parameters, especially when W n and M n are dense and/or r 0 is large. The II procedure estimates l 0 reasonably well across different J 's, but can have serious trouble in estimating r 0 as J goes up. This is consistent with the statement earlier that with dense weight matrices the II estimator (of r 0 ) may have a much slower convergence rate. The t-test from the II procedure can also be oversized in small samples, though not as bad as the GMM and GS2SLS procedures. Additional simulation results under other parameter configurations, different degrees of density of the spatial weight matrices, and larger sample sizes are collected in the Appendix in the supplemental data online. 11

EMPIRICAL STUDIES
In this section, two empirical studies are provided. The first one is based on the exercise in Lin and Lee (2010) on county teenage pregnancy rates in 10 Upper Great Plains states in the United States and the second one is on the Airbnb listing prices in the city of Asheville, North Carolina.

Teenage pregnancy rates
Using the data 'Health and Healthcare in the United States -County and Metro Area Data' (Thomas 1999) and the 1990 US Census (US Census Bureau 1992), Lin and Lee (2010) estimated a SAR(1) model by GMM and found strong spatial correlation among county teenage pregnancy rates. The SAR(1) model used in Lin and Lee (2010) is as follows: where Teen i is the teenage pregnancy rate, w ij is the entry from W n (the row-normalized county contiguity matrix), Edu i is the education service expenditure (divided by 100), Inco i is median household income (divided by 1000), FHH i is percentage of female-headed households, Black i is proportion of black population, and Phy i is the number of physicians per 1000 population. 12 As pointed out by Kelejian and Prucha (1998), it is important to test the presence of possible spatial correlation in disturbances. The I 2 (1) of Liu and Prucha (2018) applied to Teen is 317.2698, yielding virtually a p = 0. This indicates strong cross-sectional dependence in the dependent variable. Meanwhile, the I 2 u (1) statistic (I 2 (1) applied to the SAR(1) residuals) is 0.8011 with a p = 0.37, implying that the cross-sectional dependence in disturbances is statistically insignificant. Therefore, the results are consistent with the SAR(1) specification used in Lin and Lee (2010).
Suppose one still proceeds to estimate a SARAR(1,1) model (with M n = W n ), then one would expect that the estimated SAR coefficient in the disturbance should be insignificant. Table 3 reports the estimated parameter values and corresponding t-statistics (absolute values in parentheses) from the GMM, GS2SLS, and II procedures, which yield comparable results. Consistent with the test statistics of Liu and Prucha (2018), the coefficient r is insignificant, while l for the dependent variable is significant. The results support the findings in Hogan and Kitagawa (1985), Jencks and Mayer (1990), Case and Katz (1991), Crane (1991), Evans et al. (1992), and Lin and Lee (2010) regarding the important effect of social interaction on teenage pregnancy. The estimated parameter values of control variables are similar to those reported in Lin and Lee (2010): higher percentage of female-headed households and higher proportion of black population are associated with higher teenage pregnancy rate and factors like education expenditure, median household income and the number of physicians have the opposite effects.

Airbnb listing prices
The new business model of sharing economy has experienced rapid growth in recent years. In a peer-to-peer fashion, individuals rent out underused resources to other individuals in the sharing economy. Airbnb, usually described as a pioneer of the sharing economy, is an online platform that connects individuals seeking to rent accommodation assets with individuals looking for accommodations. The outburst of Airbnb has also attracted attentions from scholars and policy makers. Gutiérrez et al. (2017) and Zervas et al. (2017) studied the impact of Airbnb on the hotel industry. Lee (2016), Barron et al. (2018), and Horn and Merante (2017) investigated how Airbnb affects the housing market. Fang et al. (2016) explored the effect of Airbnb on tourism industry employment.
It is wildly acknowledged that price is one of the most critical factors in the long-term success of the accommodation sector (Hung et al. 2010). Many studies have explored the price determinants of Airbnb's shared accommodations. For example, by examining accommodation offers from 33 cities listed on Airbnb, Wang and Nicolau (2017) found that there are five categories of price determinants: host attributes, site and property attributes, amenities and services, rental rules, and online review ratings. Benítez-Aurioles (2018a, 2018b) explained the role of distance to city centre and flexible cancelation policies in Airbnb's listing prices. Ert et al. (2016) found that the level of host trustworthiness, mainly inferred from listing photos, affects listing prices and the probability of being chosen. However, the aforementioned papers did not take into consideration of spatial correlation in Airbnb's listing prices. By using micro and aggregate data of accommodation prices listed on Airbnb in the urban area of Madrid, López et al. (2020) estimated a spatial seemingly unrelated regressions hedonic model and they found statistically significant spatial correlation.
In this paper, SARAR(1,1) is applied to Airbnb accommodation log prices in Asheville, the largest city in Western North Carolina in the United States. There are in total 2247 accommodation offers in the sample. 13 The set of explanatory variables used are listed and defined in Table  4, corresponding to the five categories of price determinants as in Wang and Nicolau (2017). The weight matrix W n is specified as row-normalized J -nearest neighbour weight matrix with J = 20, 50, 100 and M n = W n . Table 5 shows the estimation results. The estimated parameter values and corresponding t-statistics (in absolute values) are quite similar across three different estimation procedures. 14 One can see that the coefficient l is statistically significant and indicates stronger degree of spatial correlation as the number of nearest neighbours goes up. This result is consistent with López et al. (2020). In contrast to l, while the II method indicates absence of spatial correlation in the disturbance term, the GMM and GS2SLS methods report statistically significantr when J = 20. Given the more reliable performance of the II approach as indicated in the Monte Carlo experiments (when W n and M n are relatively sparse), it is more reasonable to believe that there is little evidence of spatial correlation in the disturbance term. The parameter estimates of coefficients of control variables are similar to those reported in Wang and Nicolau (2017) and López et al. (2020). It is interesting to note that Wi-Fi does not seem to affect prices, suggesting that it is perhaps taken as granted in the sharing economy of Airbnb. The number of bathrooms appears to be far more important than the number of bedrooms in property attributes, hinting that bathroom privacy is valued much more in this market. While a higher review score gives rise to a higher price tag, the number of reviews per month indicates the opposite, consistent with the phenomenon that dissatisfied customers are more likely to leave reviews, usually very critical, than happy guests.

CONCLUSIONS
This paper considers the II estimation method of SARAR(1,1) model by matching the OLS estimator of the two SAR coefficients (one in the outcome equation and the other in the error process) with its approximate analytical expectation. It is shown that the resulting II estimator is consistent, asymptotically normal, and robust to unknown heteroscedasticity. Compared with the existing estimators that rely on IV and some moment conditions associated with the error innovation term, the II estimator is found to perform better in a Monte Carlo study that Estimating a spatial autoregressive model with autoregressive disturbances based on the indirect inference principle uses a sparse county contiguity weight matrix. Moreover, when the degree of spatial correlation in the disturbance is high, inference procedures based on other methods can lead to severe upward size distortions, but the II-based t-test delivers very good size performance. However, when dense spatial weight matrices are employed, the estimators, including II, do not perform so well in small samples. The new estimation procedure is applied to empirical studies on teenage pregnancy rates and Airbnb accommodation prices, showing strong presence of spatial correlation in the outcome variables but little evidence of correlation in disturbances for both cases. For future research, it is of interest to apply the II estimation method to higher order SARAR models as in Badinger and Egger (2011), Lee and Liu (2010) and Jin and Lee (2019), among others. Again, the existing literature is largely rooted in the IV/GMM framework. The II approach aims to rely on no IV or linear and quadratic moment conditions. Another possible extension is to consider spatial panel models as in, among others, Lee and Yu (2010a, 2010b, 2010c, Baltagi et al. (2013), Elhorst (2014), and Catania and Billé (2017).
In this paper, the spatial weight matrices M n and W n are taken as given. In the empirical study of Airbnb accommodation prices, different weight matrices based on nearest neighbours are used and no attempt was made to decide which weight matrix specification gives the best performance. One may follow the approach of Kelejian and Piras (2011) to consider a test that compares the prediction power from a null model and that from an alternative model, where in its first step, one needs to estimate model parameters under each model specification. Another approach may be to follow Lam and Souza's (2020) LASSO strategy in selection of the weight matrices, where the LASSO objective function is based on some distance measure constructed using IV's. It would be interesting to explore testing strategies using the II estimator in the first step of Kelejian and Piras (2011) or the sample binding functions in the LASSO objective function of Lam and Souza (2020) and this is left for future research.