Effects of Urban Sprawl on Obesity

In this paper, we examine the effect of changes in population density (urban sprawl) between 1970 and 2000 on BMI and obesity of residents in metropolitan areas in the U.S. We address the possible endogeneity of population density by using a two-step instrumental variables approach. We exploit the plausibly exogenous variation in population density caused by the expansion of the U.S. Interstate Highway System, which largely followed the original 1947 plan for the Interstate Highway System. We find a negative association between population density and obesity, and estimates are robust across a wide range of specifications. Estimates indicate that if the average metropolitan area had not experienced the decline in the proportion of population living in dense areas over the last 30 years, the rate of obesity would have been reduced by approximately 13%.


Introduction
Over the past forty years, the prevalence of obesity in the U.S. has increased dramatically.
Between the early 1960s and 2004, the percentage of U.S. adults who were obese more than doubled, from 13.4% to 32.2% (Flegal et al., 2002; Ogden et al., 2006). This dramatic increase is disconcerting because obesity has been linked to type II diabetes, high blood pressure, high cholesterol, asthma and other poor health outcomes (Kopelman, 2000; Koplan et al., 1999; Peeters et al., 2003; Wellman et al., 2002). Further, it has been estimated that obesity contributed to between 100,000 and 300,000 deaths in a single year, 2000 (Mokdad et al., 2004 and 2005; Flegal et al., 2005). Finally, obesity-related morbidity has been estimated to account for 9.1% of total annual U.S. medical expenditures in 1998 ($92.6 billion in 2002 dollars). 1
The spatial distribution of the population in the U.S. has also changed markedly over the same period. Between 1950 and 2000, the share of the population living in metropolitan areas grew from 56% to 80% (Transportation Research Board Special Report 282, 2005). While a greater proportion of the population is now living in urban areas broadly defined, all of the growth in metropolitan areas has occurred in suburban areas, as central cities actually declined in population (Baum-Snow, 2007). In 1950, the population of metropolitan areas was roughly evenly divided between the suburban fringe and the central city; currently, approximately two-thirds of the population of metropolitan areas resides in the suburbs, and this proportion has been rising (Pisarski, 2001). Table I shows that between 1970 and 2000 the population-weighted population density of 53 major metropolitan areas fell by more than 19%, with more dramatic declines observed for the densest parts of metropolitan areas.
The growth in obesity and the decline in population density observed over roughly the same period have led researchers to investigate the association between these two trends more fully, and to assess whether the association is causal. Urban sprawl, characterized by low-density development patterns and changes in the built environment, has been found to be positively associated with rates of obesity, although the evidence is not uniform (Ewing, 2003; Frank et al., 2004; Lopez, 2004; Vandegrift and Yoked, 2004; Plantinga and Bernell, 2007; Eid et al., 2008; Black and Macinko, 2008; Moon, 2009). 2 Several explanations of this association have been proposed. Suburban residential location increases the distance between home and destinations (e.g., jobs), increases reliance on automobiles, and minimizes walking. The lack of sidewalks and bicycle trails, and the cul-de-sac street layouts that are typical of suburban areas, may decrease physical activity (Cervero et al., 1995; Handy, 1996; Hess et al., 1999; Crane et al., 1998; Boarnet et al., 2000; Saelens et al., 2003; Frank, 2000; Berrigan et al., 2002). In addition, the greater presence of large retail stores (i.e., big box stores) in the suburbs leads to lower food costs in suburban areas because of economies of scale and relatively cheaper land. Lower food prices have been found to be correlated with dietary intake patterns (French, 2005; French et al., 1997; French et al., 2001; Jeffery et al., 1994; Powell et al., 2007a; Beydoun et al., 2008; Larson et al., 2009). In addition, there is evidence that large food stores are more likely than smaller grocery stores to carry healthy foods (e.g., fresh fruits and vegetables) (Sallis et al., 1986; Horowitz et al., 2004; Jetter et al., 2006; Chung et al., 1999).
Urban sprawl may also affect the weight of central city residents. Land in densely populated cities is not sufficient to accommodate parking and other structures of large grocery stores, which are more likely to carry fresh fruit and vegetables. Thus, access to healthy foods is decreased in central cities. Instead, central cities are more likely to be characterized by small grocery stores and a higher density of fast-food outlets, sometimes resulting in what has been referred to as "food deserts" in which poor urban residents cannot buy affordable, healthy food (Cummins and Macintyre, 2002; Ignami et al., 2006; Feldstein, 2007; Ford et al., 2008). Finally, urban sprawl is associated with a concentration of poverty and higher crime rates in some parts of the central city, which makes outdoor activities more dangerous and may limit opportunities for physical activity (Lumeng et al., 2006; Stafford et al., 2007; Mujahid et al., 2008). All of these changes are hypothesized to result in an increase in obesity.
2 In this paper, urban sprawl and low population density are used interchangeably. Urban sprawl is generally defined as the low density development pattern which changes the built environment in which individuals reside. The built environment, compared with the natural environment, refers to the man-made surroundings that provide the setting for human activity including both physical and social elements.
Results from previous studies of the association between urban sprawl and obesity have led the Centers for Disease Control and Prevention (2003), the World Health Organization (2004) and others to advocate using community (re)design as a tool to curb obesity. 3 Although policy makers are calling for action to combat urban sprawl, a key policy question, which previous research has not adequately addressed, is to what extent urban sprawl causes people to be obese. While several studies have examined the associations between urban sprawl and obesity, these studies have been limited in their ability to provide estimates of a causal relationship. The primary limitation of previous studies is that the relationship between urban sprawl and obesity is likely to be confounded by non-random selection of residents into neighborhoods; factors that affect where a person lives may also affect that person's weight. For example, those who live in the suburbs because of a preference for large houses, which are facilitated by relatively cheaper land prices, may also prefer to eat healthier, exercise and maintain a healthier weight. Moreover, the preferences of these suburbanites will attract providers of healthy foods and cause local governments to provide amenities consistent with such preferences. Therefore, the observed positive association between urban sprawl (and the built environment) and obesity may reflect selection rather than a causal relationship in which sprawl causes obesity. Identifying the causal effect is essential for designing effective public policy. If the association between urban sprawl and obesity is due to selection, then policies aimed at curbing urban sprawl would have little effect on obesity.
Four previous studies have attempted to address the selection issue (Eid et al., 2008; Plantinga and Bernell, 2007a; Plantinga and Bernell, 2007b; Ewing et al., 2006). Three of these studies used fixed-effect methods with longitudinal data on individuals from the National Longitudinal Survey of Youth (1979 and 1997 cohorts). Findings from these studies were mixed; Plantinga and Bernell (2007a) found that sprawl lowered BMI, and Eid et al. (2008) and Ewing et al. (2006) found no statistically significant associations between sprawl and BMI. 4 While these studies arguably improve over cross-sectional analyses that ignore selection, there are some notable limitations. First, these studies use samples with narrow age ranges that limit the applicability of their findings. Eid et al. (2008) used a sample of persons between the ages of 23 and 36; Ewing et al. (2006) used a sample of young adults ages 18 to 23; and Plantinga and Bernell (2007a) used a sample of persons ages 33 to 41. Second, while fixed-effects methods eliminate time-invariant factors, there is the possibility that omitted time-varying factors will bias estimates. It is noteworthy that observed time-varying factors such as marital status and work have relatively large and statistically significant effects on BMI in the Eid et al. (2008) study. This raises the possibility that omitted time-varying factors may also be important confounders. Third, as noted above, urban sprawl may affect the weight of both central city residents and suburban residents, so studying the effects of moving from more to less dense places may obscure the effects of sprawl. Plantinga and Bernell (2007b) use instrumental variables to address selection. The instruments for their measure of urban sprawl are marital status and family size. The validity of these instruments is questionable because it is likely that marital status and family size are correlated with BMI. For example, single people may be less likely to be obese because they are active in the marriage market (Helmchen, 2004).
In this paper, we take a fundamentally different approach to the problem. We examine the effect of changes in population density (urban sprawl) between 1970 and 2000 on BMI and obesity of residents in metropolitan areas in the U.S. The long time period is an advantage because it incorporates large changes in population density and the built environment in both central cities and suburbs. It is these changes that are most directly tied to the debate over the use of community (re)design to combat obesity. We address the possible endogeneity of population density by using a two-step instrumental variables approach. We exploit the plausibly exogenous variation in population density (urban sprawl) caused by the expansion of the U.S. Interstate Highway System, which largely followed the original 1947 plan, and we use that plan to instrument for population density.
We find a negative association between population density and obesity, and estimates are robust across a wide range of specifications. Estimates indicate that if the average metropolitan area had not experienced the decline in the proportion of population living in dense areas over the last 30 years, the rate of obesity would have been reduced by approximately 13%. Sub-group analyses show that those living in the central city seem to experience slightly greater changes in their weight status than their counterparts living in suburbs.

Research Design
The goal of the empirical analysis is to obtain estimates of the association between population density in a metropolitan area, which we use as a proxy for urban sprawl and the built environment, and the weight of residents in that area. Regression methods are used to obtain this association, and we use the following model specification:

$$BMI_{ijt} = \alpha_j + \gamma_t + X_{ijt}\beta + \lambda\,PopDen_{jt} + Z_{jt}\pi + \varepsilon_{ijt} \qquad (1)$$

In equation (1), the body mass index (BMI), or obesity (BMI>30), of person i in MSA j and year t depends on MSA fixed effects ($\alpha_j$); year fixed effects ($\gamma_t$); individual characteristics ($X$) such as age, race, sex, marital status and education; MSA population density ($PopDen$); and time-varying MSA-level characteristics ($Z_{jt}$) such as median family income and employment rates.
Population density is measured as the proportion of the population living in dense areas; different thresholds are used to define dense areas in order to capture the degree to which populations are concentrated in high-density areas.
Ideally, we would like to interpret estimates of the association between population density and weight obtained from equation (1) as causal, and toward this goal, we have included controls for potentially confounding influences (e.g., MSA fixed effects and time-varying MSA characteristics). However, there still may be omitted MSA-level factors that affect population density and weight of residents. Therefore, we use a two-step instrumental variables approach.
Specifically, we use the original 1947 Interstate Highway System plan to predict population density and use this predicted measure of population density in equation (1). The equation used to predict population density is:

$$PopDen_{jt} = \delta_j + \theta_1\,TIME_t + \theta_2\,TIME_t^2 + \theta_3\,[(Year_t - 1947) \times HWPLAN_j] + u_{jt}, \qquad t = 1970, 1980, 1990, 2000 \qquad (2)$$

In equation (2), population density of MSA j in year t depends on MSA fixed effects ($\delta_j$), a quadratic time trend ($TIME$ and $TIME^2$), and the number of planned highway rays from the 1947 Interstate Highway System plan in MSA j ($HWPLAN_j$) multiplied by the number of years since 1947. Note that we use aggregate data for the four decennial census years to estimate equation (2), and that individual characteristics (X in equation 1) are not included in equation (2). 5 We are not using a two-stage least squares procedure, but a two-step instrumental variables procedure. 6 The underlying logic of this approach is that the highway plan originating in 1947 was to be implemented over time, and therefore the population density of an MSA would be affected in a diffused way over time. This diffusion process is captured in equation (2) by the term $[(Year_t - 1947) \times HWPLAN_j]$, which allows the effect of the highway plan to vary over time in a linear fashion. Other specifications (e.g., quadratic) of the diffusion process were tried; statistical tests could not reject the linear specification, nor did estimates of interest differ in these alternative specifications.
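The first step described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the synthetic data, the coefficient values, and the choice to measure TIME in years since 1970 are all assumptions made for illustration. The sketch fits the density equation on the four census years and then predicts density for every survey year from 1976 to 2001, which is the quantity merged to the individual-level data.

```python
import numpy as np


def design_matrix(msa_index, years, rays, n_msa):
    """Columns: MSA dummies, TIME, TIME^2, (Year - 1947) * planned rays."""
    years = np.asarray(years, dtype=float)
    time = years - 1970.0  # assumption: TIME counts years since 1970
    dummies = np.eye(n_msa)[np.asarray(msa_index)]
    exposure = (years - 1947.0) * np.asarray(rays, dtype=float)
    return np.column_stack([dummies, time, time ** 2, exposure])


def fit_first_stage(msa_index, years, rays, popden, n_msa):
    """Least squares fit of the density equation on MSA-by-census-year data."""
    X = design_matrix(msa_index, years, rays, n_msa)
    coef, *_ = np.linalg.lstsq(X, np.asarray(popden, dtype=float), rcond=None)
    return coef


def predict_density(coef, msa_index, years, rays, n_msa):
    """Predicted density for arbitrary (MSA, year) pairs."""
    return design_matrix(msa_index, years, rays, n_msa) @ coef


# Synthetic illustration: three hypothetical MSAs with 2, 4 and 6 planned rays.
N_MSA = 3
RAYS_BY_MSA = np.array([2.0, 4.0, 6.0])
# Hypothetical coefficients: three fixed effects, TIME, TIME^2, highway term.
TRUE_COEF = np.array([0.5, 0.6, 0.7, 0.002, -0.00001, -0.001])

census_years = np.array([1970, 1980, 1990, 2000])
msa_index = np.repeat(np.arange(N_MSA), len(census_years))
years = np.tile(census_years, N_MSA)
rays = RAYS_BY_MSA[msa_index]
popden = design_matrix(msa_index, years, rays, N_MSA) @ TRUE_COEF  # noise-free

coef = fit_first_stage(msa_index, years, rays, popden, N_MSA)

# Predict density for the first MSA in every survey year, 1976 through 2001.
survey_years = np.arange(1976, 2002)
first_msa = np.zeros(len(survey_years), dtype=int)
predicted = predict_density(
    coef, first_msa, survey_years, RAYS_BY_MSA[first_msa], N_MSA
)
```

The quadratic time trend is what makes prediction between census years possible; year dummies for the four census years would leave the intervening survey years without a fitted value.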
Models similar to equation (2) have been estimated by others. Specifically, Baum-Snow (2007) showed that the greater the number of planned highways in a metropolitan area, the greater the loss of central city population over the last half of the 20th century. This paper finds similar effects with respect to population density, which are presented below. Thus, in assessing the validity of the instrumental variables approach, we have a sufficiently strong first stage association. The interstate highway system is associated with a significant change in the spatial distribution of the population and related development patterns.
5 The purpose of equation (2) is to derive a predicted measure of population density for each year between 1976 and 2001. Therefore, we use a quadratic time trend instead of dummy variables for each of the four Census years.
6 To assess whether estimates are sensitive to the method of estimation, we also obtained estimates from a standard two-stage least squares procedure for the matched sample where data are available. Estimates from this analysis are very similar to those obtained using the two-step (not 2SLS) instrumental variables procedure. We present both sets of results.
The next question concerning the validity of the instrumental variables approach is whether the 1947 plan meets the exclusion criterion. We believe it is reasonable to think it does. The original Interstate Highway System plan of 1947 was motivated by concerns related to national defense; it was designed to connect faraway places and was not intended to facilitate local commuting. The Federal-Aid Highway Act of 1944 called for designation of a National System of Interstate Highways, to include up to 40,000 miles "…so located as to connect by routes, as direct as practicable, the principal metropolitan areas, cities, and industrial centers, to serve the national defense, and to connect at suitable border points with routes of continental importance in the Dominion of Canada and the Republic of Mexico." On August 2, 1947, Commissioner MacDonald and Federal Works Administrator Philip B. Fleming announced the selection of the first 37,700 miles. The routes had been proposed by state highway agencies and reviewed by the Department of Defense to meet the needs of national defense. Therefore, it is plausible that, conditional on MSA fixed effects (initial conditions that may have influenced the 1947 plan), the original planned number of highways is exogenous, that is, uncorrelated with unmeasured determinants of changes in obesity within an MSA. Some evidence that this is in fact the case is that the inclusion of (relevant) time-varying MSA characteristics has little effect on instrumental variables estimates. Nevertheless, we recognize that it is difficult to obtain valid instrumental variables estimates and that our findings need to be interpreted in this light.

NHIS
The NHIS for the years 1976 to 2001 was used to obtain demographic information, weight, height and metropolitan area of residence. 7 We limited the sample to persons age 18 and over. The NHIS provided the following information: age; race (four race/ethnicity categories: non-Hispanic white, non-Hispanic black, Hispanic and others); sex; education (five categories describing the highest grade the individual completed: elementary school, some high school, high school graduate, some college and college graduate); income (0-4,999, 5,000-9,999, 10,000-14,999, 15,000-24,999, 25,000 or more, and missing); marital status (four categories: single, married, separated/divorced, and widowed); and metropolitan area of residence.
Definitions, means, and standard deviations of all variables employed in the NHIS are reported in Appendix Table I.
7 We used 1976 as the first year of the sample because 1976 is the first year in which the NHIS has individuals' weight and height information in the core data set. We used 2001 as the ending year because construction of the Interstate Highway System was almost finished by 2001 and it is the last year for which NHIS public-use data have MSA identifiers.
The metropolitan area is the lowest geographic identifier available in the public-use NHIS data. 8 There are two points worth mentioning regarding the metropolitan area identifier. First, the NHIS only identifies large metropolitan areas and the number of metropolitan areas identified in the NHIS increased approximately every ten years. 9 Because of historical changes in geographic definitions, caution must be taken in comparing data for these statistical areas from different years. For example, most metropolitan areas encompass less territory in earlier years than in later years, and the newly included areas are generally less densely populated than the areas already included. If, for some unobserved reasons, people living in less dense areas are more or less likely to be obese, the changing definition alone would lead to a non-zero association between population density and obesity.
8 Respondents who live either in non-MSA areas or in MSAs that are not self-representing are excluded from the estimation sample because their geographic location is not identified.
9 Metropolitan areas are defined by the U.S. Office of Management and Budget, and definitions change approximately every ten years based on Census data. The general concept of a metropolitan area is that of a core area containing a substantial population nucleus, together with adjacent communities having a high degree of economic and social integration with that core. Changes in definitions of metropolitan areas since the 1950 Census have chiefly resulted from the recognition of new metropolitan areas when the population requirement was reached; the addition of counties to existing metropolitan areas; the transfer of counties from one area to another; and the dropping of counties from an area due to changes in population or in economic or social ties to the central counties of metropolitan areas. The large majority of changes have taken place on the basis of Decennial Census data.
10 Those metropolitan areas which are only identified for two years (1995, 1996)
To address this issue, we include an MSA fixed effect for each unique definition of an area. Thus, equation (1) The important aspect of equation (1`

Census Data
We used data from U.S. Censuses to calculate population density in 1970, 1980, 1990, and 2000 for each of the 122 unique MSA definitions; for example, we calculated the population density in each of the four decennial census years for each of the three geographical definitions of the Atlanta MSA. Thus, there were 12 measures of population density related to Atlanta. To construct these figures, we used constant-geography (boundary) census tract data on population obtained from the Neighborhood Change Database (NCDB). We aggregated these census tract population data to the county level because MSAs consist of groups of counties and county definitions rarely change. 11 Then, we created a county-MSA crosswalk file and aggregated the county-level data to the MSA level using the historical metropolitan area definitions.
Population density of the MSA was measured as the proportion of the population living in census tracts at or above a specified threshold of population density, for example, the proportion of persons living in census tracts with a population density of 5,000 or more persons per square mile. Previous studies have used different population density thresholds to measure the degree to which population is concentrated in high-density areas (Lopez, 2004; Ewing, 2003). We use several thresholds to define dense areas 12: 5,000 or more people per square mile (about 50% of census tracts in the constant-geography MSAs have a population density below this), 9,000 or more people per square mile (about 75% of census tracts in the constant-geography MSAs have a population density below this), and 12,500 or more people per square mile, which is the lower limit of density needed to support mass transit (Ewing, 2003). Summing the population of high-density census tracts within each MSA and dividing by the total population of the MSA yields the proportion of the population living in dense areas for each MSA.
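The construction of this measure can be sketched as follows. The function name, the tuple layout, and the toy tracts are hypothetical, and the sketch assumes that a tract exactly at the threshold counts as dense. The same tract data are evaluated under two different county-to-MSA definitions to show how a change in definition alone changes measured density.

```python
from collections import defaultdict


def dense_share_by_msa(tracts, county_to_msa, threshold):
    """Share of each MSA's population living in tracts at or above `threshold`.

    tracts: iterable of (county, population, land_area_sq_miles)
    county_to_msa: dict mapping county -> MSA under one historical definition
    threshold: persons per square mile
    """
    dense = defaultdict(float)
    total = defaultdict(float)
    for county, population, area in tracts:
        msa = county_to_msa.get(county)
        if msa is None:  # county lies outside every MSA under this definition
            continue
        total[msa] += population
        if population / area >= threshold:
            dense[msa] += population
    return {msa: dense[msa] / total[msa] for msa in total}


# Hypothetical tracts in two counties, A and B.
tracts = [
    ("A", 6000, 1.0),   # 6,000 persons per square mile
    ("A", 2000, 1.0),   # 2,000
    ("B", 13000, 1.0),  # 13,000
    ("B", 1000, 2.0),   # 500
]
early_definition = {"A": "M"}           # MSA M contains county A only
later_definition = {"A": "M", "B": "M"}  # county B added later

share_early = dense_share_by_msa(tracts, early_definition, 5000)
share_later = dense_share_by_msa(tracts, later_definition, 5000)
share_later_12500 = dense_share_by_msa(tracts, later_definition, 12500)
```

Because each historical definition yields its own series, a density measure is computed separately for every unique definition of an area.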
The changing geographical definitions of MSAs in the NHIS and their implications for measuring population density require modification of equation (2) to
11 No counties were consolidated and relatively few were broken off. The few counties with changing boundaries were restored to their original form when matching counties to MSAs.

Current Population Survey
The CPS March file was used to calculate (weighted) MSA-level median family income and employment rates. Definitions of MSA identifiers in the CPS also changed over time based on OMB's definitions. CPS data were merged to the appropriate MSA definition.

Results
The first estimates we discuss are from equation (2`), which is used to predict population density using the 1947 highway plan. Table II presents these estimates. Estimates of the effect of the 1947 highway plan are negative and statistically significant. More planned highways in 1947 are significantly associated with a decrease in population density over time. In terms of magnitudes, an additional highway ray is associated with a 5% decrease every 20 years in the proportion of the population in an MSA that lives in census tracts with 5,000 or more people per square mile. Larger effects are found for measures of greater population density; an additional highway ray is associated with a 9% decrease in the proportion of the population in an MSA that lives in census tracts with 9,000 or more people per square mile, and with a 10% decrease in the proportion that lives in census tracts with 12,500 or more people per square mile.
To assess whether estimates are sensitive to the inclusion of time-varying MSA characteristics, we estimated models with and without controls for median family income and employment rates of the MSA. Columns (3), (5) and (7) of Table II present these estimates. The inclusion of these controls has little effect on estimates of the effect of planned highway rays even though median family income has a significant and positive effect on population density. These results are evidence in support of the validity of the instrumental variables approach because it does not appear to be the case that the instrument is correlated with observed time-varying MSA characteristics.
We now turn the discussion to the estimates of the effect of population density on obesity and BMI. Table III presents the estimates for obesity and Table IV the estimates for BMI. Estimates are obtained from the two-step instrumental variables procedure. 14 Standard errors have been constructed using methods that account for the predicted nature of population density and the potential non-independence (clustering) of observations within MSA-year (Murphy and Topel, 1985; Hardin, 2002; Hardin et al., 2003). Estimates in Table III are statistically significant at the 10% level and indicate that for each additional percentage point decrease in the proportion of population living in dense areas, obesity is approximately 0.1 to 0.2 percentage points higher.
Estimates are larger for higher density thresholds. For example, estimates associated with a density of 12,500 are twice as large as estimates associated with a density of 5,000. Notably, estimates are not sensitive to the inclusion of time-varying MSA controls for median family income and employment rates. This provides some evidence supporting the identification assumption underlying the instrumental variables approach, which is that changes in population density caused by the 1947 highway plan are uncorrelated with changes in other attributes of the MSA that are correlated with changes in weight status. Table IV presents estimates of the association between population density and BMI.
Estimates indicate that a one percentage point decrease in the proportion of population living in dense areas increases BMI by about 0.01 units, but parameter estimates are not statistically significant at conventional levels. Interestingly, comparing the effects on BMI and obesity suggests that instead of shifting the whole weight distribution to the right, the decline in population density has a larger effect on the upper tail of the weight distribution. This is notable because previous studies that have addressed the selection problem have not used obesity as an outcome.
14 OLS (non-IV) regression methods are not possible because there are only three Census years that provide population density for each MSA, but there are 26 years in NHIS data. Below, we present a model that uses only the three Census years: 1980, 1990, 2000.
It is worth mentioning that coefficients on individual characteristics generally have the expected signs and are consistent with findings in the previous literature (estimates are not presented). Age has an inverted U-shaped effect on the probability of being obese and BMI.
Black and Hispanic persons are more likely to be obese and have higher values of BMI than whites, while persons of other races are less likely to be obese and have lower values of BMI than whites. Men are less likely than women to be obese, but have a higher average BMI than women. Compared to single (never married) individuals, married and divorced individuals are less likely to be obese and widowed persons are more likely to be obese. Years of formal schooling completed has a negative effect on the probability of being obese and BMI.

Two-stage Least Squares Estimates
To investigate whether estimates are sensitive to the method of estimation, we obtained standard two-stage least squares estimates. To conduct this analysis, we used data from the NHIS in only the census years of 1980, 1990, and 2000. For these years, census data can be directly matched to the individual-level data from the NHIS. Summary statistics for this "matched" NHIS sample are found in Appendix Table I. The matched sample's summary statistics are very similar to those of the full sample. We estimate two models using this matched sample. First, we used OLS to obtain estimates of the association between the actual proportion of the population living in dense areas and obesity (BMI). In this case, we ignore selection. We then obtain estimates of this association using the conventional two-stage least squares approach in which planned highway rays are the excluded instrument.
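The comparison between OLS and two-stage least squares described here can be illustrated with a small simulation. This is a sketch on synthetic data, not a reproduction of the paper's estimates or of the sign pattern it reports: the variable names, the data-generating process, and the true effect of -0.2 are all assumptions chosen so that an unobserved factor biases OLS while a single excluded instrument recovers the true slope.

```python
import numpy as np


def ols(y, X):
    """Least squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, x, z):
    """Just-identified 2SLS with a constant; returns [intercept, slope on x]."""
    const = np.ones_like(z)
    Z = np.column_stack([const, z])
    x_hat = Z @ ols(x, Z)  # first stage: fitted values of x
    return ols(y, np.column_stack([const, x_hat]))  # second stage


rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                 # stand-in for the excluded instrument
u = rng.normal(size=n)                 # unobserved factor affecting x and y
x = z + u + 0.5 * rng.normal(size=n)   # endogenous regressor
TRUE_EFFECT = -0.2
y = TRUE_EFFECT * x + u + rng.normal(size=n)

ols_slope = ols(y, np.column_stack([np.ones(n), x]))[1]
iv_slope = two_stage_least_squares(y, x, z)[1]
```

Regressing the outcome on fitted values by hand gives the 2SLS point estimate, but the standard errors from that second regression ignore the estimated nature of the regressor, which is why corrected standard errors are needed for a two-step procedure.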
The two-stage least squares model consists of a second-stage equation (3) and a first-stage equation (4). First-stage (equation 4) estimates for the two-stage least squares procedure and the two-step IV procedure are reported in Table V. Table VI presents OLS, two-stage least squares and two-step IV estimates of equation (3). Estimates in Tables V and VI obtained by the two-step IV procedure are virtually identical to those obtained using the standard 2SLS procedure.
This provides evidence that the two-step IV model adopted in the full sample produces estimates similar to what would be obtained from a conventional 2SLS model.
A notable result in Table VI is that the 2SLS (two-step) estimates of the effect of population density on obesity exceed the OLS estimates. If the only source of selection is that individuals who are more likely to be obese due to unmeasured factors are also more likely to choose to live in less dense areas, then the 2SLS approach would produce smaller estimates than OLS in absolute value. We find the opposite. This finding is consistent with the existence of an additional bias; for example, low density areas may be the types of areas for which unobservable factors tend to produce thinner people. One potential explanation is the heterogeneous demand for locally provided public goods. For example, if thinner people who invest more in their health also put more emphasis on their children's human capital investment, they may choose to reside in suburban areas in search of better public schools (Mieszkowski et al., 1993; Wassmer et al., 2005).
Estimates of the associations between population density and weight for central city and suburban residents are presented in Table VII. Focusing first on the results for obesity, coefficients on the interaction term between population density and central city residence indicate that city dwellers experience slightly greater changes in their weight status as population density changes, implying a difference of approximately 0.01 to 0.02 percentage points for each additional percentage point change in the proportion of population living in dense areas. In addition, the central city-suburban differences are smaller in both magnitude and significance for higher density thresholds. The estimate associated with a density of 12,500 is half the size of the estimate associated with a density of 5,000 and is not significant at conventional levels, yet the weight status change for people living in central cities is still statistically significant. Considering that the effect of population density on obesity is larger for higher density thresholds, city dwellers are affected more when the shift of population to less dense areas is broader. Turning to the results for BMI, estimates in Table VII show a pattern similar to that of obesity. City dwellers experience slightly greater changes in their BMI than suburbanites, though effects on BMI for both city dwellers and suburbanites are not significantly different from zero.

Population Density Distribution Analysis
The measures of population density in the preceding analyses are based on using different thresholds to define high density areas. In order to provide more detailed information about the effects of the population density distribution on individuals' weight, we re-estimate the model using five mutually exclusive categories of population density: 1) the proportion of population living in areas with density smaller than 3,500 people per square mile; 2) the proportion of population living in areas with density of 3,500 to 5,000 people per square mile; 3) the proportion of population living in areas with density of 5,000 to 9,000 people per square mile; 4) the proportion of population living in areas with density of 9,000 to 12,500 people per square mile; and 5) the proportion of population living in areas with density greater than 12,500 people per square mile.
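A minimal sketch of how tract populations might be allocated to these five categories is shown below. The function names and toy tracts are hypothetical, and the sketch assumes each bin includes its lower boundary (so a tract at exactly 5,000 persons per square mile falls in the third category); the text does not state how boundary values are treated.

```python
from bisect import bisect_right

CUTOFFS = [3500, 5000, 9000, 12500]  # persons per square mile


def density_category(density):
    """Return 0 (least dense) to 4 (most dense); bins are closed on the left."""
    return bisect_right(CUTOFFS, density)


def category_shares(tracts):
    """tracts: iterable of (population, density). Returns five shares summing to 1."""
    totals = [0.0] * 5
    for population, density in tracts:
        totals[density_category(density)] += population
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


# Seven hypothetical tracts of 100 persons each.
example = [(100, d) for d in (1000, 3500, 4999, 5000, 9000, 12500, 20000)]
shares = category_shares(example)
```

Because the five shares sum to one, only four of them can enter a regression freely, which is why one category must serve as the reference group.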
The use of five categories to characterize MSA population density requires that we estimate the first stage model using multinomial logit methods for grouped data. In this model, the dependent variable is the proportion of the MSA population in each of the five population density categories. For clarity, we show the specification used in this analysis:
(5) [equation omitted], t = 1970, 1980, 1990, 2000
In equation (5), g is an index of the five mutually exclusive categories of population density. We use equation (5) to predict the proportion of the MSA that resides in each of these population density categories and merge these predicted measures of population density to the NHIS data. Note that there are now four estimates of the association between population density and weight.
The least dense area is the reference group.
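The first-stage step described above, fitting category shares and then predicting them, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: it uses the log-odds (Berkson-type) least squares approach to a grouped multinomial logit, treats the least dense category as the reference in column 0, and runs on synthetic data. The function names, regressors (a time index and rays interacted with time), and coefficient values are all hypothetical.

```python
import numpy as np

def add_const(X):
    """Prepend an intercept column."""
    return np.column_stack([np.ones(len(X)), X])

def fit_grouped_logit(shares, X):
    """Regress log(share_g / share_reference) on X for each non-reference g.

    shares: (n, G) array of category shares, column 0 is the reference.
    Returns a (k + 1, G - 1) coefficient matrix, intercept first.
    """
    log_odds = np.log(shares[:, 1:] / shares[:, [0]])
    coefs, *_ = np.linalg.lstsq(add_const(X), log_odds, rcond=None)
    return coefs

def predict_shares(coefs, X):
    """Predicted shares via softmax; the reference category has index 0."""
    index = np.column_stack([np.zeros(len(X)), add_const(X) @ coefs])
    expu = np.exp(index - index.max(axis=1, keepdims=True))
    return expu / expu.sum(axis=1, keepdims=True)

# Synthetic panel: 6 hypothetical MSAs observed in 4 census years.
rays = np.repeat(np.arange(1, 7), 4).astype(float)
t = np.tile(np.arange(4), 6).astype(float)  # 1970, ..., 2000 coded 0..3
X = np.column_stack([t, rays * t])

# Hypothetical coefficients: rows are intercept, t, rays*t; columns are
# the four non-reference density categories.
true_coefs = np.array([
    [-0.50, -1.00, -1.50, -2.00],
    [-0.05, -0.05, -0.10, -0.10],
    [-0.01, -0.02, -0.03, -0.05],
])
shares = predict_shares(true_coefs, X)  # noise-free shares

coefs = fit_grouped_logit(shares, X)
predicted = predict_shares(coefs, X)    # these would be merged to micro data
```

Because the predicted shares sum to one within each MSA and year, only four of the five categories can enter the second stage, which is why one category serves as the reference group.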
Table VIII presents the results from estimating equation (5). The estimated coefficients on planned highway rays show the expected pattern, with the largest decline in population shares occurring in areas with the highest density. This is consistent with the previous analyses using threshold-based population density measures and with the predictions of land use theory.
Regression estimates indicate that, conditional on control variables, each additional planned highway ray decreases the proportion of population living in areas with density of more than 12,500 people per square mile and 9,000 to 12,500 people per square mile by 1 percentage point and 0.6 percentage points, respectively, every 20 years. Moreover, each additional planned highway ray increases the proportion of population living in areas with density less than 3,500 people per square mile and 3,500 to 5,000 people per square mile by 1.17 percentage points and 0.54 percentage points, respectively, every 20 years.
We now turn to estimates of the effect of the predicted population density distribution on BMI and obesity. Table IX presents these estimates and shows that, compared to the proportion of people living in low density areas (areas with density of less than 3,500 people per square mile), decreases in the proportion of people living in high density areas (areas with density of more than 12,500 people per square mile) increase obesity rates by approximately 0.24 percentage points, and the effect is significant at the 5% level. Decreases in the proportion of people living in areas with density of 5,000-9,000 and 9,000-12,500 people per square mile are also negatively associated with obesity, though the effects are not significantly different from zero. However, we cannot reject the null hypothesis that the effects of these three population density categories (more than 12,500, 9,000-12,500 and 5,000-9,000 people per square mile) on obesity are equal to each other. Interestingly, increases in the proportion of people living in areas with a density of 3,500-5,000 people per square mile are positively associated with obesity, implying a 1 percentage point increase in obesity rates, though the effects are not statistically significant. This pattern is consistent with the possibility described earlier that urban sprawl can affect the obesity of both central city and suburban residents.
Turning to the effects of the population density distribution on BMI, estimates in Table IX indicate that, compared to the proportion of population living in low density areas (areas with density of less than 3,500 people per square mile), decreases in population density are negatively associated with BMI across all four population density categories. A one percentage point decrease in the proportion of population living in areas with density of 9,000-12,500 people per square mile is associated with a 0.08 increase in BMI. Compared to mean BMI, the magnitude of the effect is quite small, though it is statistically significant. Changes in the other population density categories have no statistically significant effects on BMI.

Conclusion
Previous research has documented a positive association between obesity and urban sprawl. Whether this association represents a causal relationship, however, has not been addressed. In this paper, we address the causality issue by using a plausibly exogenous source of urban sprawl: the decline in population density caused by the Interstate Highway System plan.
Estimates indicate that a one percentage point decrease in the share of population living in dense areas increased the prevalence of obesity by 0.1 to 0.2 percentage points, depending on which threshold is used to define a dense area. Further analyses that examined how changes in the entire population distribution of the MSA affect obesity show that both the loss of population in the central city and growth in the suburbs increase obesity.
To place these results in context, we evaluate the importance of urban sprawl in explaining the rising trend in obesity by examining the counterfactual obesity trend had urban sprawl not occurred. On average, according to the NHIS sample used in this paper, obesity rates increased by 145%, from about 8.3 percent to 20.3 percent, between 1976 and 2001. We take the predicted values of population density from the first stage to determine the "exogenous urban sprawl" by differencing predicted population density between 2001 and 1976 resulting from 4 rays of planned highways (4 is the mean number of planned highway rays). The resulting number is about 10 percentage points. We then multiply this number by 0.0015, the coefficient on population density with the threshold of 9,000 people per square mile, to derive the average percentage point difference in obesity that can be attributed to the exogenous change in population density. We divide the estimated difference in obesity rates due to exogenous urban sprawl (1.5 percentage points) by the observed difference in obesity rates between 2001 and 1976 (12 percentage points). The estimates thus indicate that about 13% of the increase in the obesity rate can be attributed to urban sprawl.
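The counterfactual calculation above can be reproduced step by step. The sketch below uses only the figures quoted in the text; the variable names are illustrative, and it assumes that the coefficient of 0.0015 is expressed as a change in the probability of obesity per one percentage point change in the dense-area share, so that multiplying by 100 converts the result to percentage points.

```python
# Figures quoted in the text (NHIS sample, 1976 and 2001).
obesity_1976 = 8.3   # percent obese
obesity_2001 = 20.3  # percent obese
sprawl_pp = 10.0     # exogenous decline in dense-area share, percentage points
coef = 0.0015        # second-stage coefficient, 9,000 per square mile threshold

# Observed rise in obesity, in percentage points and in relative terms.
observed_rise_pp = obesity_2001 - obesity_1976
relative_rise = observed_rise_pp / obesity_1976

# Rise attributable to exogenous sprawl (assumed unit conversion: x100).
attributable_pp = sprawl_pp * coef * 100

# Share of the observed rise explained by sprawl.
share_explained = attributable_pp / observed_rise_pp

print(f"Observed rise: {observed_rise_pp:.1f} pp ({relative_rise:.0%})")
print(f"Attributable to sprawl: {attributable_pp:.1f} pp")
print(f"Share explained: {share_explained:.1%}")
```

The share explained works out to 12.5%, which the text reports as about 13%.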
Overall, the results of this study suggest that urban sprawl did cause an increase in obesity, but its effect was relatively modest. Thus, policy makers may want to look elsewhere for solutions to the obesity problem, particularly if urban and community redesign is costly. While this study has identified a relationship between population density and obesity, it did not identify the underlying mechanisms that link urban sprawl to weight. Future research is warranted to better understand the mechanisms through which urban sprawl has caused the changes in obesity rates documented in this paper.

Notes: In all the specifications we control for individuals' demographic information: dummy variables for age, race (non-Hispanic Black, Hispanic and others; non-Hispanic White is the reference group), sex, income, education (some high school, high school graduate, some college and college graduate; elementary school is the reference group) and marital status (married, separated/divorced, or widowed; single is the reference group). Table 2 shows the model used to create the predicted population density. Standard errors are corrected using the Murphy-Topel estimate of the variance-covariance matrix. Standard errors are clustered at MSA/year. The regressions are weighted with NHIS sampling weights. * indicates significant at 10% level, ** indicates significant at 5% level, *** indicates significant at 1% level.

Table 5 shows the model used to create the predicted population density using the conventional 2SLS approach and the two-step IV procedure. The standard errors for the two-step estimation are corrected using the Murphy-Topel estimate of the variance-covariance matrix. The regressions are weighted with NHIS sampling weights. * indicates significant at 10% level, ** indicates significant at 5% level, *** indicates significant at 1% level.

Notes: In all the specifications we control for individuals' demographic information. Standard errors are corrected using the Murphy-Topel estimate of the variance-covariance matrix. Standard errors are clustered at MSA/year. * indicates significant at 10% level, ** indicates significant at 5% level, *** indicates significant at 1% level.

Notes: In all the specifications we control for individuals' demographic information. Bootstrapped standard errors clustered at MSA/year are reported in parentheses. * indicates significant at 10% level, ** indicates significant at 5% level, *** indicates significant at 1% level.

Appendix