Quantifying Causal Effects of Road Network Capacity Expansions on Traffic Volume and Density via a Mixed Model Propensity Score Estimator

Road network capacity expansions are frequently proposed as solutions to urban traffic congestion but are controversial because it is thought that they can directly “induce” growth in traffic volumes. This article quantifies causal effects of road network capacity expansions on aggregate urban traffic volume and density in U.S. cities using a mixed model propensity score (PS) estimator. The motivation for this approach is that we seek to estimate a dose-response relationship between capacity and volume but suspect confounding from both observed and unobserved characteristics. Analytical results and simulations show that a longitudinal mixed model PS approach can be used to adjust effectively for time-invariant unobserved confounding via random effects (RE). Our empirical results indicate that network capacity expansions can cause substantial increases in aggregate urban traffic volumes such that even major capacity increases can actually lead to little or no reduction in network traffic densities. This result has important implications for optimal urban transportation strategies. Supplementary materials for this article are available online.


INTRODUCTION
Road network congestion, which is characterized by travel times in excess of those under free-flow conditions, is a phenomenon experienced in most major urban areas around the world. Transport engineers generally view congestion as undesirable because it can impose large costs on road users and the economy through delay and reduced traffic flow. A common intervention used to ameliorate the problem is to build more roads in the hope of reducing traffic densities and accommodating future growth in traffic volumes. This strategy is, however, controversial because some believe that network capacity expansions themselves cause a direct rise in aggregate traffic volumes, a phenomenon known as "induced demand." The theory of induced demand is widely known (for key concepts see Small and Verhoef 2007;Kelly 2008). In principle, it can occur via the following causal process. A capacity expansion (i.e., increase in lane miles) is introduced which causes an immediate reduction in traffic density (i.e., traffic volume to capacity ratio) on the network and consequently network congestion falls. With less congestion, average travel times decrease causing the volume of aggregate traffic to rise in response because the cost of travel, in terms of time spent, has effectively fallen. As traffic volumes rise, travel times start to increase as congestion again becomes prevalent. In theory these interactions between demand and cost can yield a new equilibrium with a higher volume of network traffic than prior to the capacity expansion. Depending on the extent of induced demand, undesirable consequences of road use such as congestion, pollution, and accidents could worsen. Many U.S. cities have experienced article, we estimate the relationship between network capacity and aggregate traffic volumes and densities in U.S. cities using a novel mixed model generalized propensity score (GPS) estimator. To our knowledge, this is the first time that induced demand has been studied using a statistical model based on the potential outcomes framework for causal inference. The original GPS approach was introduced by Imbens (2000) and Hirano and Imbens (2004) for estimation of multivalued and continuous treatment effects with measured confounding. Here, we provide a longitudinal mixed model extension of the GPS estimator which can accommodate both measured time-varying confounding and unmeasured time-invariant confounding, as well as bidirectionality between response and treatment assignment. This is attractive for the type of application we present for a number of reasons. First, it addresses the three key estimation issues emphasized in previous literature. Second, it allows us to estimate a flexible semiparametric dose-response function rather than a single summary point estimate, which can reveal heterogeneity in treatment effects by dose. Third, the formal framework for causal inference that we adopt requires us to ensure that the sample for causal comparison is valid, that is, that it comprises units that are genuinely comparable and so focuses the inference problem on the relevant subsample and avoids undue extrapolation.
The article is structured as follows. Section 2 explains the properties of mixed model GPS methods for continuous doseresponse estimation. Section 3 illustrates some properties of the mixed model PS approach through simulation. Section 4 presents results from our case study application. Section 5 concludes and outlines some issues for future research.

A MIXED MODEL PROPENSITY SCORE (PS) APPROACH FOR ESTIMATION OF CONTINUOUS
TREATMENT EFFECTS

The PS for Continuous Treatments in a One-Period Setting
In this article, we are concerned with estimating causal quantities associated with a continuous treatment. We consider first the one-period setting: the observed data are realizations of a random vector, W i = (Y i , D i , X i , U i ), i = 1, . . . , N, where for the ith unit of observation Y i denotes a response, D i the treatment received, X i is a vector of covariates which we assume are sufficient to represent all sources of confounding, and U i is a vector of covariates that are not confounders because they are independent of Y i but not D i . The treatment can take values in some bounded interval in R, D ⊆ R, and we define the indicator for receipt of treatment level (or dose) d as For any treatment level d ∈ D, we assume the existence of a potential outcome for unit i, which we denote Y i (d). The set of all potential outcomes is Y i = {Y i (d), d ∈ D for i = 1, . . . , N} and the full data for estimation of causal effects are taken to be (Y i , D i , X i , U i ). Unobserved potential outcomes are then treated much like missing data.
For continuous treatments, Hirano and Imbens (2004) showed that observance of actual outcomes is sufficient to estimate av-erage potential outcomes (APOs) for a population, even under nonrandom treatment assignments, so long as three key assumptions hold. First, to address confounding we must be able to assume "ignorability," or conditional independence between the response and treatment assignment given the covariates X i : Second, to ensure comparability across potential outcomes there must be common support, or overlap, by treatment status in the covariate distributions within some region of dose C ⊆ D. A sufficient condition is that for any subset of C, say A ⊆ C, Pr(D i ∈ A|X i = x) > 0 for all x. Third, for logical consistency there must be equivalence between the observed response under a given dose d and the potential response under that dose: and for i = 1, . . . , N. One important implication of this is that the stable unit treatment value assumption (SUTVA) (e.g., Rubin 1978) must hold, which requires: (i) that the outcome for each unit be independent of the treatment status of other units, or in other words, there should be no interference in treatment effects across units; and (ii) that there are no different versions of the treatment.
If these assumptions hold, then the APO under a given dose , or the dose-response function, can be derived as where the second equality follows from ignorability, the third from the SUTVA, and the overlap assumption ensures that the APO is estimable since there are comparable units across treatment levels. In the literature, the subsample of observations trimmed to ensure overlap in covariate distributions is referred to as the "common support region." Average treatment effects (ATEs) can then be calculated by comparing the APO at different treatment levels. Note that since observations in the common support region may not be representative of the overall population, and since treatment effects may be heterogeneous across individuals, causal parameters (i.e., ATEs and APOs) estimated under imposition of common support will not in general be equivalent to the corresponding population parameters (see, e.g., Crump et al. 2009). For continuous treatments Hirano and Imbens (2004) showed that we can use a GPS in place of covariates X i to adjust for confounding. Let r(d, x i ) = f D|X (d|x i ) denote the conditional density function for receiving a particular level (dose) of the treatment (d) given pretreatment variables X i = x i . The GPS for dose d is then defined as a random variable, which we denote R d,i = r(d, X i ), a scalar function of X i for fixed d. We may also define the GPS for observed doses, R i = r(D i , X i ), which is evaluated at the level of treatment actually received and is a random variable defined with respect to the joint distribution for observed (D i ,X i ). Clearly when Two key properties of the GPS are (i) that balancing follows directly from its definition: I D i (d) ⊥ ⊥ X i |r(d, X i ); and (ii) given balancing, ignorability can be established conditional on the scalar GPS rather than the potentially high-dimensional covariate vector X i : Y i (d) ⊥ ⊥ I D i (d)|r(d, X i ). Using these two properties Hirano and Imbens (2004) showed that if ignorability holds, and the model is correctly specified, the GPS provides a bias-removal strategy in the context of continuous treatments: In practice, the following algorithm can be used to estimate the dose-response curve via the GPS: 1. Estimate a model for the conditional density of D given X with parameter vector θ : f D|X (d i |x i ; θ ). 2. Use estimated θ, with the assumed density function used for Step 1, to calculate the observed and unobserved GPS values: Repeat for all doses of interest. 6. Use a single bootstrap resampling scheme over Steps 1 to 5 for variance estimation.
Details on each step of the algorithm are included in the online supplementary materials.

PS Construction
The success of PS estimators relies on having an observed covariate vector which is sufficient to represent confounding. Ignorability and the balancing property of the PS will not necessarily hold given only a subset of confounders, and consequently, the PS may not eliminate all sources of bias (e.g., Pearl 2000Pearl , 2009. In the context of binary treatments, Rubin (1997) argues against inclusion of nonconfounding covariates in the PS, even if they have an important influence on assignment, because it can have an adverse effect on efficiency of causal models. In this subsection, we consider this issue in relation to the GPS as it has important implications for the mixed model approach that we propose below.
We refer to a GPS that conditions on confounding and nonconfounding covariates as a fully conditional GPS (FCGPS). Let r * (d, x i , u i ) = f D|X,U (d|x i , u i ) represent a value from the conditional density of D i = d given both X i and U i , which can be used to form the FCGPS random quantities R There are five properties of the FCGPS relevant for dose-response estimation that we state here. Detailed results and proofs for each of these are available in supplementary material.
i. The FCGPS retains the balancing property such that: ii. Ignorability can be established conditional on the FCGPS rather than covariate vector X i : iii. If assignment to the treatment is ignorable given baseline characteristics X i , then conditioning on the FCGPS eliminates biases associated with differences in those confounding covariates. Consequently, the dose-response function can be computed iv. Estimates of the dose-response function based on the true (data generating) GPS will be at least as efficient as those based on the FCGPS. v. The FCGPS is drawn from a conditional density for d that has smaller variance than the conditional density from which the GPS is drawn. The implication of this result is that, at any level of the treatment, the FCGPS conditional density will have greater mass concentrated over a smaller range of units. Consequently, the common support region established using the GPS will be at least as large as that using the FCGPS.
In short, the FCGPS retains the properties of balancing and ignorability and will therefore still provide unbiased, although potentially less efficient, estimates of the dose-response function under correct model specification. However, redundant conditioning may reduce the size of the sample available to estimate causal quantities because it will be more difficult to find common support over the covariate distributions.

Mixed Models for PS Estimation in the Multi-Period Setting
So far we have assumed that the vector of observed covariates X i is sufficient to represent confounding but in practice some confounding covariates may be unobserved, or even unknown. In these cases, and particularly when longitudinal data are available, we propose use of a mixed model approach to adjust for time-invariant unobserved confounding. We focus here on the case of a Gaussian treatment where a linear mixed model (LMM) is appropriate. Similar results for non-Gaussian treatments can be derived using the same principles for generalized linear mixed models (GLMMs).
A LMGPS model for a longitudinal data structure, comprising N units (or subjects), i = 1, . . . , N, each of which has n i measurements made over times t, t = 1, . . . , n i , giving a total of n = N i=1 n i sample observations, takes the form where X T it θ 1 is the fixed effects part of the model with p-dimensional design vector X it and parameter vector θ 1 , Z it is the q-dimensional design matrix for the vector of random effects (RE) θ 2i ∼ N (0, G), and e T i = (e i1 , . . . , e in i ) ∼ N (0, H i ) is a vector of random errors. Using bold to denote matrices, G and H i are (q × q) and (n i × n i ) positive definite covariance matrices of RE θ 2i and error vector e i for unit i respectively, and we assume that θ 2i and e i are independent within subjects and θ 2i and e it are independent across subjects given the covariates (see Diggle et al. 2002, for a discussion of these assumptions and the conditions under which they may be violated).
The RE component of LMMs can essentially be used as a target of inference in one of two ways: in relation to the population from which the RE is drawn, or in relation to the prediction of the realized value of the RE for unit i (for a discussion of this distinction see Ruppert, Wand, and Carroll 2003;Mc-Culloch, Searle, and Neuhaus 2008;Fitzmaurice and Molenberghs 2009;Raudenbush 2009). The former model may be interpreted via marginalization as

A Random Intercepts GPS Model to Address Time-Invariant Confounding
For GPS estimation in the presence of unobserved confounding, we propose a subject-specific interpretation of the mixed model, is a random intercept. By specifying random unit level intercepts within the GPS, rather than the outcome model, they are by construction rendered correlated with treatment assignment and therefore potentially useful in representing sources of time-invariant unobserved confounding. The LMGPS approach can be further generalized in two ways. First, we can relax the implicit assumption that the RE are uncorrelated with the covariates (i.e., E[b i |X it ] = 0) by using a mean centered model, which adds time-averaged values of the covariates (i.e., Second, we can introduce dynamics to allow confounding from lagged doses by specifying a potentially autoregressive error term ε it = ρε it−1 + ν it , with |ρ| < 1 and ν it ∼ N 0, σ 2 . This dynamic model, with either correlated or uncorrelated RE, is equivalent to an autoregressive distributed lag (ADL) model of order (1,1), that is from which we can recover an estimate of dynamic confounding (ρ). If appropriate the response history through time t − 1, which we denote H Y i (t − 1), can be added as an exogenous predetermined covariate in the LMGPS models.
Specification of RE in the GPS model can help capture timeinvariant unobserved confounding and therefore improve the plausibility of the ignorability assumption. Using the GPS algorithm, with a mixed model specification for Step 1, we can estimate two key parameters of interest: the APO, However, since the predicted RE in the LMM cannot distinguish between sources of variation that arise from time-invariant confounding or nonconfounding characteristics, adverse effects on efficiency and evaluation of common support may arise as discussed in Section 2.2. It is also worth noting that in adopting an LMM approach we are obliged to assume a model for the RE, which if misspecified can adversely effect estimation of random and fixed parameters. In applied work, for instance, normality of RE is typically assumed for computational convenience but in some applications could result in a misspecified model. A summary of the consequences of misspecification and some potential solutions via nonparametric methods were reviewed by Huang (2011) while Abad, Litire, and Molenberghs (2010) proposed some useful diagnostics tests.

Comparable Approaches
There are other prominent approaches for estimation of continuous treatment effects in the longitudinal setting that have been discussed in the literature and that could potentially address time-invariant unobserved confounding.
1. Outcome regression (OR)-standard longitudinal OR approaches, using correlated RE or fixed effects, can be used to address time-invariant confounding (see Diggle et al. 2002). If the OR model is correctly specified and there is overlap in the sample such approaches will tend to be more efficient that the multistep GPS approach. Key advantages of the GPS over OR models arise in working with a scalar value rather than a potentially high-dimensional covariate vector. This allows for effective use of flexible approaches in modeling potentially nonlinear dose-response functions, for instance via semiparametric and polynomial regressions with interaction terms. Moreover, it is highly effective in isolating the region of common support, a task that is difficult using multiple covariates (for discussion see Joffe and Rosenbaum 1999). The GPS approach is of course more involved, with multiple modeling steps, but these generate additional statistical summary information which can be highly informative both for the inference procedure and for substantive or policy issues. 2. Difference-in-differences (DID) GPS (DIDGPS)- Flores et al. (2012) implemented a difference-in-differences GPS estimator which transforms the response into before-after outcomes and then applies the GPS approach for confounding adjustment. This is a continuous treatment counterpart to the conditional generalized DID estimator for binary treatments proposed by Heckman, Ichimura, and Todd (1997) and Heckman et al. (1998). Differencing outcomes allows for time-invariant unobservable factors to influence selection and consequently reduces bias when the data are contaminated by temporally invariant components. Since the RE are differenced out rather than explicitly specified, DID approaches do not require an assumed model for RE as in the LMM approach. Unlike the LMGPS, however, this estimator requires that the time-invariant unobserved confounders be related to the response linearly and that there be pretreatment observations available. Furthermore, the extension to longitudinal treatments is not straightforward, and also the approach sacrifices some efficiency via data loss through differencing. (2012) proposed a repeated measures, or multivariate, longitudinal generalization of the GPS approach which analyses the effect of treatments applied over time allowing for confounding from lagged doses and responses. They showed that sequential conditional independence (i.e., by-period conditional independence of treatment and response given response and treatment histories) can be established using the MGPS. The MGPS approach is not explicitly designed to address time-invariant confounding but since the lagged doses and responses are in turn influenced by time-invariant RE it could reduce bias from these sources. 4. Marginal structural models (MSMs)-MSMs (Hernán, Brumback, and Robins 2001) have been used extensively to analyze treatment effects in the longitudinal setting. The approach is explicitly formulated to address the problem of time varying confounding using the underlying principle of inverse probability weighting to create pseudopopulations. Primarily, MSMs have been used to estimate the total effect of a treatment regime on an end-of-study outcome, but they can also be used to estimate the direct treatment effect as an average value over all periods (for a discussion see VanderWeele 2009). Weighting-based approaches such as MSMs, are quite different from the regression-based approaches that we adopt. As we have shown, the mixed model approach can help to account for unmeasured confounding in regression, yet to incorporate such structures into weighting-based approaches requires further methodological investigation.

Multivariate GPS (MGPS)-Moodie and Stephens
In the next section, we present simulations to illustrate the properties of the LMGPS approach in relation to other comparable longitudinal estimators for continuous treatments.

SIMULATIONS
Our simulations are conducted on samples of 1000 observations comprising 100 subjects each observed at 10 time points. We index subjects by i, i = (1, . . . , N) and time points by t, t = (0, . . . , n i ) giving a total of n = N i=1 n i sample observations. There are three normally distributed independent covariates: X 1it ∼ N (0.2, 0.1), X 2i ∼ N (1.0, 0.6), and U i ∼ N (0.2, 0.1). Covariate X 1it varies over time, but is not a function of time, while X 2i and U i are time-invariant.
We specify the following relationships between the covariates and the treatment assignment D and response Y: Thus, X 1 and X 2 are confounders for D in a nonlinear relationship with Y, while U is a nonconfounding covariate. Note that due to the nonlinear nature of the relationship between Y and X 2 , differencing will not eliminate time-invariant confounding. The simulation set up is chosen to demonstrate the dual challenge of addressing time-invariant unobserved confounding with a nonlinear dose-response relationship. We assume that the analyst is ignorant of the data-generating process. The following models for the dose-response are tested 1. μ(d) OR1 -a linear OR model based on an incorrectly specified generalized linear model (GLM), with erroneous exclusion of the time-invariant confounder X 2i : The doseresponse is derived by taking the mean of the predicted values of this model evaluated at D it = d.
2. μ(d) OR2 -as model OR1 but with RE (η i ) specified for each subject: OR3 -as model OR2, but estimated using a flexible generalized additive mixed model (GAMM) specification 4. μ(d) GPSt -a GPS dose-response estimator based on a correctly specified linear GPS model: GPSf -a GPS dose-response estimator based on a incorrectly specified GPS model, with erroneous exclusion of the time-invariant confounder X 2i . 6. μ(d) FCGPS -a GPS dose-response estimator based on a incorrectly specified GPS model, with inclusion of covariates X 1it , X 2i, and U i . 7. μ(d) LMGPS -a linear mixed GPS model dose-response estimator, with exclusion of covariate X 2i , but inclusion of unit level RE. 8. μ(d) MGPS -multivariate GPS Markov model which excludes time-invariant confounder X 2i , but adjusts for lagged values of dose and response : The parameters being estimated are population APOs. We impose common support in the GPS based models to illustrate the effects of conditioning on sample size, but in our particular set up it does not affect the value of the estimated parameters. For GPS models, the conditional density of the treatment given the covariates is estimated using a Gaussian GLM or GLMM, observed and unobserved GPSs are calculated via the normal density, and the conditional expectation of the log of the outcome given D and R is approximated using a GAM approximation, with smooth main effect functions for D and R and a smooth interaction. Using a method detailed in the supplementary materials we selected the largest common support region in which the estimated probability of assignment to each dose ≥ 0.01. All models are specified with a log transformation of the outcome and results are presented on the scale of the (log) response. The simulations are run on 1000 generated datasets of size 1,000. The mean dose is 4.0 and APO estimates are calculated for doses 3, 4, and 5. Table 1 reports average estimates (Av Est), average estimated variances (Av Est Var), empirical variances (Emp Var) calculated via 1000 bootstrap replications, mean squared error (MSE), coverage based on bootstrap bias-corrected adjusted 95% confidence intervals, and the size of common support regions (S).
The first column of numbers in Table 1 confirms that the size of the common support region will tend to reduce with more extensive conditioning (i.e., FCGPS and LMGPS) relative to the true GPS (GPSt), and is underestimated by GPSf and MGPS. Note that average estimated variances and empirical variances are much the same for models OR1, OR2, OR3, and GPSf; but not for the remaining models. This is because the imposition of common support reduces sample size and consequently inflates estimated variances, but it does not affect bias in this particular simulation set up and so the empirical variance of estimates across datasets is unaffected.
The OR dose-response estimates are biased and correspondingly have poor coverage. There are two effects at play here. For all specifications (OR1, OR2, and OR3) the effect of the treatment is confounded due to the erroneous omission of the time-invariant confounder X 2 . The inclusion of RE in OR2 and OR3 fails to adjust for this source of confounding, since by construction, the RE are independent of the covariates and therefore capture time-invariant unobserved variables that are strictly nonconfounding. The second effect, relevant to OR1 and OR2, is that the semi-log-linear model specification fails to provide a good approximation to the true nonlinear nature of the doseresponse function. With a high-dimensional covariate vector this effect may be evident due to the limited potential for use of highly flexible functional forms.
With the exception of model specification GPSf, the GPS models generally perform well in estimating the dose-response and coverage is good, even though these model still involve approximations to the true nonlinear dose-response function. This is because they offer a high degree of flexibility in approximating the unknown function while providing adequate adjustment for confounding covariates. Note that use of a best linear unbiased predictor (BLUP) to calculate the LMGPS gives similar results to those obtained for the FCGPS, even though this model does not include the time-invariant confounder explicitly. This is because the RE in LMGPS are correctly located in the sense that they do adjust for confounding, in addition to other sources of unit level variance. Through inclusion of lags in both treatment and response the MGPS is reasonably robust to the exclusion of time-invariant RE, but this estimator is not designed to address this particular issue directly and consequently we find that it does not perform as well as the LMGPS far from the mean dose. Excluding time-invariant confounding covariates from GPS estimation, as in the GPSf specification, produces poor estimates of the dose-response.

CASE STUDY: QUANTIFYING CAUSAL EFFECTS OF ROAD NETWORK CAPACITY EXPANSIONS ON TRAFFIC VOLUME AND DENSITY
We now apply the method outlined above to our longitudinal case study on the effect of road network capacity expansion on aggregate traffic volumes. The data available for analysis describe mobility and traffic conditions for 101 major U.S. cities over the period 1985 to 2010. They are collated by the Texas Transportation Institute (TTI) at the University of Austin Texas and are publicly available with full documentation at http:// mobility.tamu.edu/ums/. A description of the data and a table of summary statistics is provided in the supplementary materials.
The available longitudinal data allow us to represent three key features of our problem: response (y it ), treatment (d it ), and confounding covariates (x it ). We define our response variable as annual proportional change in aggregate urban traffic volume, with volume measured by annual vehicle miles traveled (vmt) in each city i: y it = vmt it /vmt it−1 . Our "treatment" variable is a measure of the proportional change in urban road lane miles (lms) in each year: d it = lms it /lms it−1 . The advantage of using proportional changes for treatment and response is that, since we can condition on initial scale in the GPS regression, we can obtain APO estimates that have an intuitive and general interpretation as the proportional change in vmt caused by a proportional increase in capacity.
The previous literature offers guidance on likely sources of confounding. To model time-varying confounders we use the available longitudinal data to construct the following covariates, each measured in the year immediately prior to the treatment: (i) lagged-response (to allow for reverse causality), (ii) congestion (annual hours of delay per vmt), (iii) traffic volume (vmt), (iv) network scale (lms), (v) network composition (freeway lms / arterial lms), (vi) traffic composition (arterial vmt / freeway vmt), (vii) mode share (annual passenger miles traveled by public transport), (viii) productivity (metropolitan wage rate in dollars per annum), (ix) economic structure (metropolitan share of manufacturing jobs), (x) employment and population distribution and growth (covariates measuring the levels and proportional growth), (xi) personal income (average metropolitan personal income in dollars per annum), and (xii) state fuel price (dollars per gallon). Further detail on the hypotheses underpinning inclusion of these covariates, and summary statistics for the data, can be found in the supplementary materials.
As discussed in the introduction, the previous literature tends to use model specifications that allow for confounding from unobserved time-invariant city level characteristics. These could include physical layout, topography, climate, geography, cultural factors, and features of network engineering. Thus, in addition to the time-varying confounders we draw on a mixed model framework for the GPS. We use a mean centered LMGPS which, as explained in Section 2, provides the most general specification in that it allows for correlation between the time-invariant RE and the time varying covariates. By virtue of the fact that a lagged response features as a covariate in the GPS model, correlation with the RE should be assumed. Our mean centered LMGPS thus accommodates the key issues surrounding induced demand estimation raised in the literature. For comparison and testing, a number of other GPS based models and conventional longitudinal models are estimated. We summarize these results below and provide full tables and statistics in the appendix.
The LMGPS approach was illustrated through simulation in the previous section. In the application an identical procedure is followed. We first estimate the conditional density of treatment given covariates. To allow flexibility of form, we use a Gaussian penalized spline model with smooth main effect functions for each covariate and automatic knot selection, estimated using the GAMM4 package in R. Estimation is by restricted estimation maximum likelihood (REML) and approximate results on significance are given in Table 2. The p-values, which derive from a Wald test of individual smooth terms for equality to zero, are based on a statistic with an approximate F distribution. As described by Wood (2006) the p-values should be interpreted with caution as they are typically lower than they should be when the null is true creating a tendency toward over-rejection of the null. Using the Bayesian information criterion (BIC) of this model as a guide we found that a more parsimonious specification could be achieved by dropping some covariates related to population and employment distribution and growth. The final set of covariates used is shown in Table 2.
Using the method detailed in the appendix we select a region of common support in which the estimated probability of assignment to dose ≥ 0.01. This yields a region of 1353 comparable observations. The large exclusion of observations (≈ 45%) required to achieve comparability implies that estimated APOs and ATEs will likely differ from the corresponding population parameters. Tables A1 and A2 in the appendix provide detail on differences in characteristics between the common support region and the full sample. Most noticeable is that smaller cities are somewhat under-represented in the common support region and consequently we find that the mean values for traffic volume, network scale, productivity, population, and employment are larger than for the full sample.
We next test for the balancing property of the GPS. In a similar manner to Hirano and Imbens (2004) and Flores et al. (2012), we regress the treatment on the covariates, the GPS, and a set of indicator variables corresponding to the following treatment discretization:  BIC values obtained from the linear regression models with and without covariates, −21371 and −21414, respectively, indicate that the inclusion of covariates leads to a deterioration in model adequacy. We also conduct an F -test between the restricted (without covariates) and unrestricted models giving an F statistic (p-value) of 1.250 (0.264). These results suggest that the balancing property has been achieved for our GPS specification. Next, we approximate the conditional mean of the outcome given treatment level and the estimated GPS. We do so using a Gaussian penalized spline GAM estimated by REML, with automatic knot selection and smooth main effect functions for d and R. To test the validity of this simple specification, rather than use a rectangular grid of knot points in the (d, R) plane, we choose knots for the R component tailored to a range discretization of d into 10 strata. The BIC of this model (-6416) does not support the need for separate smooths, and in fact the model generates very similar dose-response estimates to those obtained with a single smooth. This test provides only a rudimentary check for model specification, but it does also demonstrate the considerable flexibility in form offered via the GPS approach relative to OR approaches with many covariates.
We do not observe the set of potential outcomes across doses of interest. Instead, we construct the dose-response function using the conditional mean-response model by taking the mean of its predicted values on the response scale, evaluated at D = d and R d,it = r(d, x it ). We calculate the dose-response for increments of 0.0025 over the range of doses from 1.0 to 1.05 (i.e., from 0% to 5% capacity expansion). Results for selected doses are shown in Table 3 with bootstrap standard errors. The bootstrap calculations are based on 500 block replications given both the GPS estimation and the averaging required to estimate the dose-response. In addition to the dose-response estimates, we also report ATEs defined as the expected proportional change in response under any given capacity expansion net of the expected proportional change that would occur under no capacity  Table 3 shows our estimated responses and ATEs at a selection of doses. The response value corresponding to dose 1.000 represents average "natural growth" in traffic (i.e., average growth under no treatment), which we estimate to be 1.4% per annum. All other average response estimates shown in the table, which are associated with network capacity increases of various doses, indicate evidence of induced road traffic demand having adjusted for observed time-varying and unobserved time-invariant confounders and having ensured common support. The bootstrapped standard errors indicate statistically significant effects pointwise at all doses. Figure 1 illustrates the ATEs graphically with bootstrapped 95% pointwise confidence intervals. Note that the ATEs are increasing with dose across the range of treatment considered, and the effect is statistically significant at all doses. This implies that city dwellers have tended to experience larger traffic volumes due to network expansions with potentially adverse implications for pollution and the incidence of collisions.
We can transform our ATEs to the elasticity scale typically reported in the existing literature by dividing the ATE estimate by the corresponding proportional change in lane miles. For our LMGPS model, this gives a range from 1.393 (0.539) to 0.435 (0.112) with a mean of 0.772 (0.170). The elasticities decrease from the smallest to largest dose. Existing empirical studies quote single point elasticity estimates, typically under 0.700, although the recent study by Duranton and Turner (2011) which uses IV as a causal method reports a larger elasticity of around 1.000. Our results indicates that there may be considerable heterogeneity in response across different doses, and therefore that there is value in adopting a causal dose-response approach rather than a conventional OR approach for a single point estimate.
The results can also be used to indicate the effect of capacity expansions on traffic densities by taking the ratio of response to doses. Our estimates suggest that, given the combined effects of natural growth and induced demand, increases in network capacity of less then 3.5% may not cause traffic densities to reduce. Furthermore, for our original sample of 101 cities over 26 years, only 533 observation (20%) have experienced annual growth in lane miles greater than 3.5%. So, the induced demand effect suggested by our estimates really is substantial, implying that even major network capacity expansions may not actually cause traffic densities to fall, and by extension, may do little to ameliorate urban congestion.
For illustrative purposes and to provide further insight into the LMGPS methodology, we generate additional results from alternative model specifications proposed in the literature. These results are summarized below and presented in full in the appendix.
Alternative GPS model specifications. We estimate a GPS model with covariates but without adjustment for time-invariant confounding, and an MGPS model as described above. These models, which do not adjust directly for time-invariant confounding, give substantially similar results to the LMGPS estimates, but the estimated induced demand effects is larger. The GPS model gives a mean elasticity of 0.804 (0.163) and the MGPS model 1.128 (0.166).

Models using conventional panel OR approaches.
We estimate the following models: pooled OLS (POLS) with and without covariates, RE, fixed effects (FE), first difference (FD), and dynamic-panel generalized method of moments (DPGMM) (see Hall 2005). In all cases, the treatment is specified in quadratic form. If we assume that the covariate vector correctly represents time-varying confounding, then the POLS model without covariates will give inconsistent estimates, the POLS with covariates and the RE models will give consistent estimates if there is no time-invariant confounding, and the FE and FD models will give consistent estimates in the presence of both timevarying and time-invariant confounding, as will the DPGMM model but it also allows for confounding from lagged values of the response. Note that common support is not imposed in these conventional approaches implying two important points for interpretation and comparison: first, the causal parameter being estimated may differ from the corresponding value for the common support region; second, parameter estimates are calculated without concern for the comparability of units and thus a key requirement of the potential outcome framework for causal inference is not met.
The POLS model without covariates makes no adjustment for confounding and gives a mean elasticity of 1.121 (0.064), somewhat higher than values typically reported in the literature. The POLS model with covariates and the RE model give mean elasticities that are broadly similar to the mean value for the LMGPS model, 0.769 (0.096) and 0.785 (0.086) respectively, but in fact the pattern of ATE estimates from these models is substantially different indicating considerably less heterogeneity in treatment effect by dose. The FE, FD, and DPGMM models, which adjust for time-invariant confounding, produce results that are similar in magnitude to those reported in the literature with mean elasticities of 0.530 (0.084), 0.597 (0.090), and 0.611 (0.296) respectively, but again show less heterogeneity in ATEs than indicated by the GPS based models.
All estimates from the additional models indicate evidence of induced road traffic demand from network expansion, but the magnitude of estimated effects varies. Adjustment for time varying confounding appears to produce lower estimates and additional adjustment for time-invariant confounding appears to reduce the estimates still further. The FD and FE panel data models show improved explanatory power, as measured by adjusted R 2 , and the Hausman test selects the FE over RE model. These diagnostics are consistent with the existence of time-invariant confounding. However, while the panel models provide a useful perspective which is broadly supportive of the LMGPS approach we propose, since they do not address common support and are based on a restrictive a priori assumption about the form of the dose-response, direct comparison of results in inhibited. In summary, under the assumptions of our model, we find evidence of a positive causal relationship between network capacity expansions and traffic volumes. The scale of the effect indicates that network capacity expansions may do little to reduce traffic densities other than in extreme cases (i.e., expansions of 3.5% and above). This implies that as a remedy for congestion, capacity expansion is at best a risky strategy and city authorities should be aware of the potential scale of the induced demand effect.

CONCLUSIONS
In this article, we have studied the effect of road network capacity expansions on aggregate traffic volumes and densities in U.S. cities using a linear mixed model GPS approach for continuous dose-response estimation. Given the assumptions of our model, our results suggest that capacity expansions can give rise to a direct increase in aggregate urban traffic volumes, and we find that the effect may be substantial such that even major capacity increases can actually lead to little or no reduction in network traffic density. One implication is that by building more roads in major urban areas we may create increasing pollution, congestion, collisions, and other negative consequences of urban motoring. On the other hand, we may also permit more mobility for urban dwellers.
GPS based estimators are attractive because they permit the use of flexible approaches that make minimal a priori assumption on the form of the dose-response relationship and they can be used to find a sample comprising observations that are broadly comparable for treatment effect estimation. The methodological key insight in this article is that by specifying RE within a GPS rather than mean-response model, they are rendered correlated with treatment assignment and therefore potentially useful in representing sources of unobserved time-invariant confounding. Our proposed estimator provides a general specification that could be useful in other areas of applied statistics as it accommodates three prevalent features of longitudinal problems: confounding from measured time-varying and unobserved timeinvariant sources and bi-directionality between treatment and response.
Some limitations of our approach should be noted. First, since the predicted RE cannot distinguish between time-invariant unobserved heterogeneity that arise from confounding or nonconfounding characteristics, their use in the PS model can potentially lead to more extensive conditioning than is strictly necessary for causal comparison. We have shown that while the inclusion of non-confounding characteristics in the PS model does not induce bias in the estimation of the dose-response, it does give potentially less efficient estimates and can render the task of finding overlap in the covariate distribution more challenging. Second, in contrast to IV methods, we require "ignorability" to hold given the covariates, and while our approach will help address unmeasured time-invariant confounding, it will not eliminate bias from unmeasured time-varying confounding. Finally, two limitations that could be addressed in future research relate to the need for appropriate test statistics for model fit that are applicable for multistep estimation procedures, and the issue of measurement error and the propagation of error through the model components.

SUPPLEMENTARY MATERIALS
The supplementary material provides detail on the algorithm used for dose-response estimation, proofs for the properties of the mixed model GPS, a description of the data used in our application, and some additional results. [Received April 2013. Revised July 2014