Does the urban wage premium differ by pre-employment status?

ABSTRACT This paper investigates whether the density of local labour markets in Germany impacts on the wage of new employment relationships and whether corresponding urbanization economies differ significantly across distinct types of transitions to employment. The results suggest rather small static urbanization benefits. Doubling employment density increases the wage of new employment relationships by 1.0–2.6%. Moreover, benefits seem to accrue only to persons experiencing job-to-job transitions and the short-term unemployed, but not to the long-term unemployed. It is supposed these differences point to matching advantages in large urban labour markets from which only some job seekers benefit.


INTRODUCTION
A voluminous literature provides robust evidence of an urban wage premium. In the urban economics literature, these wage disparities are explained by agglomeration economies. The density of the local economy might impact productivity and therefore wages in different ways. Duranton and Puga (2004) distinguish three basic mechanisms that might cause a positive correlation between density and productivity: sharing, matching and learning. While there is comprehensive empirical evidence of a positive impact of agglomeration on worker and firm productivity (Combes, Duranton, & Gobillon, 2008; Glaeser & Maré, 2001), much less is known about the significance of different mechanisms, as noted by Rosenthal and Strange (2004) and Combes and Gobillon (2015). Some studies explicitly differentiate between static and dynamic effects of agglomeration (e.g., Glaeser & Maré, 2001; Lehmer & Möller, 2010; Wheeler, 2006; Yankow, 2006) and also allow for heterogeneous effects across individual characteristics, in particular skill levels (e.g., Andersson, Klaesson, & Larsson, 2014; Bacolod, Blum, & Strange, 2009; Carlsen, Rattso, & Stokke, 2016; De la Roca & Puga, 2017; Matano & Naticchioni, 2016).
This study provides new empirical evidence on the heterogeneity of the benefits of dense urban labour markets. We investigate whether static urbanization effects on wages of new employment relationships differ depending on the pre-employment status of the workers. Using detailed information on more than 1 million transitions to full-time employment in Germany, we are able to distinguish among job-to-job transitions and transitions from short- and long-term non-employment. This is in contrast to previous studies that consider heterogeneous effects, but mainly focus on differences across skill levels of workers and tasks. We focus on static urbanization economies that immediately arise when workers and jobs are matched. Corresponding effects might, for instance, be traced back to a higher quality of matches between job seekers and vacancies in large urban labour markets.
We observe that the wage gap between the former long-term unemployed and workers with job-to-job transitions increases with the employment density of local labour markets in Germany (see Figure A1 in the supplemental data online). Doubling the employment density enlarges the corresponding wage gap by 3%. This suggests that the benefits of large labour markets might differ with respect to the pre-employment status of the worker. Combining a model of matching in urban labour markets with the depreciation of human capital during periods of unemployment provides a theoretical mechanism that explains such differences. Görlich and de Grip (2009) argue that not using or not updating skills during periods of non-employment may result in their significant decline because they may be subject to technical and economic obsolescence. In particular, specific skills might deteriorate, thereby diminishing the urban matching advantage and the corresponding wage premium for the long-term unemployed. Moreover, the influence of referrals on labour market outcomes might result in heterogeneous benefits from agglomeration. Brown, Setren, and Topa (2015) show that referred candidates experience an initial wage advantage compared with non-referred candidates. Spatial proximity in urban labour markets is supposed to facilitate interaction in social networks and, therefore, referrals might be one channel of static matching benefits. However, the long-term unemployed may not profit from referrals by co-workers because they are at the margin of the labour market.

© 2019 Regional Studies Association
CONTACT
a (Corresponding author) silke.hamann2@iab.de, IAB Baden-Württemberg, Regional Research Network of the Institute for Employment Research, Institute for Employment Research, Stuttgart, Germany.
b Annekatrin.Niebuhr@iab.de, IAB Nord, Regional Research Network of the Institute for Employment Research, Institute for Employment Research, Kiel, Germany; and Empirical Labour Economics and Spatial Econometrics, Department of Economics, Christian-Albrechts-Universität zu Kiel, Kiel, Germany.

Our regression results indicate, on average, rather small positive urbanization effects on wages associated with transitions from job search to employment. The estimates suggest that doubling employment density increases the wages of new employment relationships immediately by 1.0-2.6%. This confirms the findings of comparable studies that do not focus on entry wages (see the review by Combes & Gobillon, 2015). Moreover, our findings indicate, in fact, that the density effects are heterogeneous. The static benefits of large urban labour markets accrue only to persons experiencing job-to-job transitions and the short-term unemployed. We do not detect important effects on transitions from long-term unemployment. We suppose that these differences point to matching advantages in large urban labour markets from which only some job seekers benefit.
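The relation between a density elasticity and the wage gain from doubling density can be made explicit. The following is a minimal sketch, assuming a log-log specification in which doubling density raises wages by a factor of 2 to the power of the elasticity; the function names are hypothetical and the text does not state which conversion the authors used. Under this assumption, gains of 1.0% and 2.6% correspond to elasticities of roughly 0.014 and 0.037.

```python
import math


def doubling_effect(elasticity: float) -> float:
    """Proportional wage change implied by doubling density in a log-log model."""
    return 2.0 ** elasticity - 1.0


def implied_elasticity(doubling_gain: float) -> float:
    """Inverse mapping: elasticity consistent with a given gain from doubling density."""
    return math.log2(1.0 + doubling_gain)


# Range of doubling effects reported in the text (1.0% and 2.6%).
for gain in (0.010, 0.026):
    lam = implied_elasticity(gain)
    print(f"gain from doubling = {gain:.1%} -> elasticity = {lam:.4f}")
```

For small elasticities the exact conversion is close to the common approximation of multiplying the elasticity by ln 2.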
The paper is structured as follows. The next section reviews the literature with a focus on studies that consider static agglomeration benefits and the heterogeneity of agglomeration effects. The third and fourth sections describe the empirical strategy and the data. We discuss the main results of the regression analysis in the fifth section. The sixth section concludes.

THEORETICAL ARGUMENTS AND EMPIRICAL LITERATURE
Many studies find evidence of an urban wage premium (Combes & Gobillon, 2015;Melo, Graham, & Noland, 2009). The urban economics literature traces the higher wages back to productivity advantages of urban regions which were already discussed by List (1838), Roscher (1878) and Marshall (1890). Duranton and Puga (2004) combine various theoretical explanations into three main channels, that is, sharing, learning and matching, and provide micro-foundations for distinct mechanisms that generate agglomeration benefits.
With respect to matching, Kim (1990) shows that in large urban regions the proximity of workers and firms promotes specialized labour markets, inter alia, as workers tend to invest more in human capital depth rather than breadth (Kim, 1989). Specialization in turn gives rise to a reduction of the average cost of mismatch between the skills of workers and the requirements of firms. The theoretical models also suggest that workers who experience significant human capital depreciation due to extensive periods of non-employment may not benefit from market size because their specific skills deteriorate. Thus, we might expect significant differences in agglomeration benefits across matches that differ with respect to the preceding length of non-employment.
An important distinction exists between static and dynamic agglomeration effects. For instance, benefits of learning are considered to be dynamic as they increase with the time spent in agglomerations and give rise to wage growth effects. In contrast, static gains which might arise from better matching are instantaneous and associated with wage level effects (Combes, Duranton, Gobillon, & Roux, 2010).
A large body of the empirical literature focuses on the effects of agglomeration on productivity and wages, respectively. The seminal contribution of Ciccone and Hall (1996) addresses static effects of agglomeration on productivity. Using aggregate data for the United States, they find that doubling the employment density increases labour productivity by approximately 6%. A drawback of studies based on aggregate data is, however, that they cannot control for the effects of worker sorting across locations. Glaeser and Maré (2001) deal with this problem by means of individual fixed effects and differentiate between static and dynamic agglomeration economies. If the sorting of workers (and also reverse causality) is addressed, the estimated elasticity of wages with respect to city size typically ranges from 0.01 to 0.03 (Combes & Gobillon, 2015). 1 Some recent studies also consider heterogeneous agglomeration effects, in particular, with respect to the skill level of workers. In some of these studies the importance of different mechanisms that may generate agglomeration benefits is investigated as well. Results by De la Roca and Puga (2017) suggest that especially workers with high (unobserved) ability benefit from static and dynamic agglomeration economies. The heterogeneity of static and dynamic agglomeration gains is confirmed for cities in Sweden by Andersson et al. (2014). They show that workers who do mainly non-routine tasks benefit more than workers with a high fraction of routine tasks. In contrast, Korpi and Clark (2017) find similar wage premiums for differently skilled workers in the three largest metropolitan areas in Sweden. Carlsen et al. (2016) detect that for primary-educated workers the city wage premium in Norway is due to learning effects, while college-educated workers benefit from shifting between firms. 
Likewise, the results of Matano and Naticchioni (2016) indicate that (unskilled) workers at the bottom of the wage distribution in Italy only experience significant wage growth effects due to learning in dense labour markets, while for skilled workers at the top of the wage distribution, the wage premium results primarily from better matching opportunities.
The studies on the relationship between wages and labour market size focus on wages of existing employment relationships, rather than on entry wages. As regards new employment relationships, the focus of the literature is rather on the effects of market size on the probability of non-employed workers finding a job (Di Addario, 2011), the frequency of job changes (Wheeler, 2008; Yankow, 2006), as well as industry and occupation changes of young workers (Bleakley & Lin, 2012; Finney & Kohlhase, 2008, for the United States; Andersson & Thulin, 2013, for Sweden).
Whether (long-term) unemployed workers benefit to the same extent from urbanization effects as do employed workers with regard to wages has not been considered in the empirical literature so far. Since the gap in entry wages between job-to-job transitions and transitions from long-term non-employment increases with region size (see Figure A1 in Appendix A in the supplemental data online), we might expect that the impact of agglomeration significantly differs across these groups of workers. Mincer and Ofek (1982) show that career interruptions due to unemployment, sick leave or other reasons cause a significant decline in wages which is interpreted as evidence of human capital depreciation (see also Görlich & de Grip, 2009). Bacolod et al. (2009) argue that skilled workers in urban regions gain more from both learning externalities and matching advantages due to specialized skills. Therefore, we expect that workers benefit less from static urbanization effects after a significant period of unemployment.

DATA
To determine the impact of labour market density on entry wages and its heterogeneity with respect to the pre-employment status of the workers, we analyze wages in 1,005,316 new employment relationships in Germany between 2005 and 2011. Detailed information on individual labour market biographies and the focus on entry wages enable us to differentiate among types of transitions, that is, job-to-job transitions as well as transitions from short- and long-term non-employment.
The information is drawn from the Integrated Employment Biographies (IEB) of the Institute for Employment Research (IAB), which contain detailed administrative micro-data on employment, job-search status, benefit receipt and participation in active labour market policy measures. We use a 5% random sample of the universe of all employees in Germany with at least one social security notification between 2005 and 2011 and investigate their biographies for the period 2000-11. Using individual employment spells, we are able to identify new employment relationships. A detailed data description is provided in the supplemental data online.
For new employment relationships we observe the gross daily wage and particulars, such as occupation and occupational status, and important worker characteristics, such as age, educational attainment and sex. We use the labour market biographies to generate additional control variables, for example, labour market status before the considered transition to employment, recent (occupation-specific) labour market experience, and the number of different previous employers; see Tables A1-A3 in Appendix A in the supplemental data online for details.
The establishment identifier in the IEB is used to match information on the establishment from the IAB's Establishment History Panel (BHP) 2 with the individual-level data set, for example, industry, establishment size and skill structure of the staff. We use a region identifier to assign each transition to employment to one of 141 German regional labour markets 3 and to enrich our individual data with detailed information on the regions. The raw elasticity of the average entry wage with respect to our pivotal variable, the local employment density, is approximately 0.11 if systematic differences in the wage level between East and West Germany are taken into account (see Figure A2 in the supplemental data online). Labour market density explains more than 30% of the variation in regional wages in a simple model where density is the only regressor.

EMPIRICAL STRATEGY
We apply the two-stage regression approach proposed by Combes et al. (2008) to estimate the impact of employment density on the wages of new employment relationships. We estimate the model using all transitions to a new job and separately for three distinct types of transitions. In order to analyze whether the (long-term) unemployed benefit to the same extent from urbanization effects as do workers with job-to-job transitions, we consider a differentiation by the length of non-employment before the match. 4 This is in contrast to, for example, Bacolod et al. (2009) and Andersson et al. (2014) who investigate heterogeneity with respect to skills and tasks, respectively. Our focus seems more consistent with the specificity of worker skills and the requirements of jobs, as discussed by Kim (1990), and might also provide information on the importance of matching advantages in large urban labour markets. We assume that the specificity of skills declines as the length of non-employment before transition increases because human capital depreciates.
In the first stage, we regress individual entry wages on a set of region fixed effects while controlling for worker, job, firm and region characteristics (see equation 1). In the second stage, we regress the region fixed effects on employment density (equation 2). 5 This gives the elasticity of entry wage with respect to the size of the regional labour market. In the first-stage wage regression (equation 1), $w_{irst}$ is the log entry wage of worker $i$ in region $r$, sector $s$ and year $t$. The vector $x_{it}$ captures time-varying worker characteristics; $\alpha_i$ is a worker fixed effect; and $\varepsilon_{irst}$ is the error term. Individual characteristics include detailed information from the labour market biographies, pre-employment status and participation in active labour market policy programmes.
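The two-stage logic can be illustrated on simulated data. The following is a minimal sketch, not the authors' implementation: it omits worker fixed effects and most controls, uses a single hypothetical worker characteristic (`experience`), and all sample sizes and parameter values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated data (all values hypothetical) ---
n_regions, n_obs = 30, 30_000
true_elasticity = 0.02
log_density = rng.normal(4.0, 1.0, n_regions)                 # log employment density
region_effect = true_elasticity * log_density + rng.normal(0, 0.005, n_regions)
region = rng.integers(0, n_regions, n_obs)                    # region of each new job
experience = rng.normal(10.0, 3.0, n_obs)                     # stand-in worker control
log_wage = region_effect[region] + 0.01 * experience + rng.normal(0, 0.1, n_obs)

# --- Stage 1: log entry wage on region dummies plus controls ---
dummies = np.zeros((n_obs, n_regions))
dummies[np.arange(n_obs), region] = 1.0
X1 = np.column_stack([dummies, experience])                   # full dummy set, no intercept
coef1, *_ = np.linalg.lstsq(X1, log_wage, rcond=None)
region_fe_hat = coef1[:n_regions]                             # estimated region fixed effects

# --- Stage 2: estimated region effects on log employment density ---
X2 = np.column_stack([np.ones(n_regions), log_density])
coef2, *_ = np.linalg.lstsq(X2, region_fe_hat, rcond=None)
elasticity_hat = coef2[1]
print(f"estimated elasticity: {elasticity_hat:.4f}")
```

Because the region effects in the second stage are themselves estimates, conventional second-stage standard errors would understate uncertainty; the sketch reports only the point estimate.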
Workers are likely to accumulate firm-specific human capital, for instance, because employers may offer training and workers can acquire skills via learning-by-doing. As tenure increases, these factors will gain importance for productivity and wages. Possibly, there are also regional differences in the supply of further education and training. However, usually these determinants of wages are unobserved by the econometrician. By focusing on new employment relationships and applying fixed effects models, we reduce the risk of biased estimates and control for unobserved factors.
Apart from worker characteristics, we control for firm characteristics such as sector, firm size and skill structure. The region fixed effect $u_r$ captures the impact of observed and unobserved regional factors on worker productivity and is the dependent variable of the second-stage regression. As a robustness check we also consider an approach with region-time fixed effects ($u_{rt}$). This approach ensures that the identification of the agglomeration effects rests on both mobile and immobile workers, while with (time-invariant) region fixed effects the estimation is based on mobile workers only (e.g., Combes, Duranton, & Gobillon, 2011). $u_{rk(i)t}$ denotes the regional unemployment rate differentiated by skill level $k(i)$; and $z_{rst}$ refers to local characteristics of the sector in which the new employment relationship is established. 6 Thus, in some first-stage regressions, we include variables that are supposed to capture specific agglomeration effects, such as employment share and number of establishments in the local industry in order to isolate the urbanization effect from localization effects.
Even the model with worker fixed effects might suffer from omitted variable bias if learning effects, that is, dynamic benefits due to work experience in dense labour markets, are ignored (De la Roca & Puga, 2017). To address this issue, we include work experience acquired in agglomerations as a control variable. 7 Human capital externalities and complementarities between skill groups are captured by the human capital of the local industry and the qualification structure of the firm's workforce. As Wheeler (2006) shows that job changes positively impact wage growth, the number of job changes over the last five years is also included. Moreover, we account for the effect of the regional industrial structure on entry wages. The different specifications, that is, with and without specific agglomeration controls on the first stage, should provide some information on the potential range of static urbanization effects and matching advantages in the second-stage regression.
On the second stage, we regress the estimated region fixed effects $\hat{u}_r$ on the employment density and some control variables (equation 2), where $D_r$ is the log employment density and $e_r$ is an error term. Time fixed effects $w_t$ are included in equation (2) if we consider estimated region-time fixed effects $\hat{u}_{rt}$ as the dependent variable. Our main interest is to provide an unbiased estimate of $\lambda$, the elasticity of entry wages with respect to labour market size. Control variables $C_r$ in the second stage allow for the impact of amenities that may be capitalized into wages (Combes et al., 2008). We also account for systematic differences between East and West German labour markets (see Figure A2 in Appendix A in the supplemental data online). Agglomeration economies that might spill over the boundaries of regions are captured by a spatial lag of $D_r$ that has been computed using a row-standardized contiguity weight matrix.
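The displayed equations (1) and (2) are not reproduced in this text. One form consistent with the symbol definitions given above is sketched below. The coefficient vectors $\beta$, $\gamma$, $\delta$ and $\kappa$ are hypothetical names introduced here for illustration, $\alpha_i$, $\varepsilon_{irst}$ and $\lambda$ are assumed renderings of the worker fixed effect, the first-stage error term and the density elasticity, and the exact set of terms in the authors' equations may differ (for instance, firm characteristics and the spatial lag of density are not written out separately).

```latex
% Sketch of the first stage (equation 1), under the assumptions stated above
\begin{equation*}
  w_{irst} = \alpha_i + u_r + x_{it}'\beta + z_{rst}'\gamma
             + \delta\, u_{rk(i)t} + \varepsilon_{irst}
\end{equation*}

% Sketch of the second stage (equation 2), with region fixed effects
\begin{equation*}
  \hat{u}_r = \lambda D_r + C_r'\kappa + e_r
\end{equation*}

% Variant with region-time fixed effects and time fixed effects w_t
\begin{equation*}
  \hat{u}_{rt} = \lambda D_{rt} + C_{rt}'\kappa + w_t + e_{rt}
\end{equation*}
```

In the region-time variant, the regressors carry a time index so that density can vary within a region over the observation period.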

IDENTIFICATION
There are two important econometric issues: selection effects and endogeneity. We will discuss these problems very briefly. A more comprehensive discussion is provided in the supplemental data online referring to Combes et al. (2008, 2010, 2011) and Combes and Gobillon (2015). First, the estimated elasticity might be upward biased due to unobserved heterogeneity, that is, more able workers might select into large regions. We apply the standard solution and include worker fixed effects in the regression models. Furthermore, by considering wages associated with new employment relationships, we focus on mechanisms that have instantaneous effects on productivity unlike other channels, such as learning, that take some time to materialize. 8 As tenure increases, other factors, for example, on-the-job and professional development training offered by the firm, will gain importance for productivity. Normally, these effects are unobserved by the econometrician. Hence, by using entry wages, we prevent (unobserved) learning effects associated with a specific job from affecting the wage for that job.
In addition, we include information on the employment status before the new job, which likely impacts entry wages as well. However, we cannot entirely rule out that there are other unobserved time-varying factors that are correlated with the error term in equation (1) and will bias the estimation of the region fixed effects.
Endogeneity might also arise because large regions characterized by high wages will be attractive locations and are thus likely to experience significant in-migration. This will in turn impact the size of the labour market. Therefore, we need to account for reverse causality. To identify a causal effect, we apply instrumental variable estimation techniques. A prevalent approach in the literature on agglomeration effects involves the use of historical population density and soil characteristics to instrument for labour market density (e.g., Ciccone & Hall, 1996; Combes et al., 2008). We use historic population density provided by Rothenbacher (2002) and soil characteristics from the European Soil Database as suggested by Combes et al. (2010). Detailed information is given in the supplemental material online. 9 The instrumental variables $M_{rt}$ are valid if they are relevant, that is, correlated with density, $\mathrm{Cov}(M_{rt}, D_{rt}) \neq 0$, and exogenous, that is, uncorrelated with the error term, $\mathrm{Cov}(M_{rt}, e_{rt}) = 0$. Historical population data (primarily for 1900) are considered to be relevant due to the persistence of the spatial distribution of population and economic activity caused by the locations of housing stock and production sites. Soil characteristics are supposed to be significant because they are important determinants of early agricultural production and, therefore, fundamental drivers of population settlements (Combes et al., 2010). As regards exogeneity of the instruments, we should avoid any simultaneity bias caused by contemporaneous local shocks by using long lags of population density. Following Combes et al. (2010), we assume that contemporaneous determinants of local wages are not related to the factors behind historical agglomeration patterns if we control for some first-nature characteristics of the regions in the second stage. There are good reasons to believe that this assumption is fulfilled. The German economy has changed a lot between 1900 and 2005 due to technological change and the enormous decline in transport costs, among other factors. Furthermore, two world wars gave rise to large temporary shocks that affected the German economy. 10

RESULTS
Table 1 summarizes the regression results with no differentiation across transition types, focusing on the estimates of the second stage and reporting bootstrapped standard errors to account for the two-stage nature of the regression approach. 11 We consider region fixed effects $u_r$ and, as a robustness check, region-time fixed effects $u_{rt}$ as a dependent variable. In the first column, a rather simple model that includes only worker characteristics is estimated in the first stage. In line with previous studies, we detect a highly significant positive effect of density on wages. The coefficient slightly declines if we augment the model by including information on the labour market biography (column 2).
In column (3) we also take into account possible sorting on unobserved worker characteristics by estimating a model with worker fixed effects. In the present setting, fixed effects imply that we can only consider workers with a minimum of two transitions to employment. This approach significantly reduces the number of observations on the first stage (see Table A4 in Appendix A in the supplemental data online). The corresponding estimate of the elasticity reported in column (3) confirms previous findings regarding the importance of sorting because the coefficient of the density significantly declines.
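The sample restriction implied by worker fixed effects can be illustrated with a small example. This is a minimal sketch with invented data, assuming the fixed effect is removed by the within transformation (demeaning per worker); the text does not specify how the authors implemented the estimator.

```python
import numpy as np

# Hypothetical transitions: worker id and log entry wage for each new job.
worker = np.array([1, 1, 2, 3, 3, 3, 4])
log_wage = np.array([4.0, 4.2, 3.9, 4.5, 4.4, 4.6, 4.1])

# Workers observed only once carry no within-worker variation, so they
# drop out once a worker fixed effect is included.
ids, inverse, counts = np.unique(worker, return_inverse=True, return_counts=True)
keep = counts[inverse] >= 2
worker_kept, wage_kept = worker[keep], log_wage[keep]

# Within transformation: subtract each remaining worker's mean log wage.
ids_k, inv_k = np.unique(worker_kept, return_inverse=True)
worker_means = np.bincount(inv_k, weights=wage_kept) / np.bincount(inv_k)
wage_within = wage_kept - worker_means[inv_k]

print(f"observations kept: {keep.sum()} of {worker.size}")
```

In this toy sample, workers 2 and 4 are dropped because each has a single transition, which mirrors the reduction in first-stage observations described above.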
We might interpret the estimate in column (3) as a first indication for the upper bound of static urbanization effects. However, including variables on the first stage that aim at capturing localization economies and human capital externalities (column 4) in fact increases the coefficient of density in the second-stage regression pointing to a downward bias of the static urbanization effects in model (3). Additional regressions suggest that the downward bias is caused by omitting localization effects. 12 First-stage results in Table A4 in Appendix A in the supplemental data online point to a significant positive effect of the employment share of the local industry, while the number of establishments in the local industry has a dampening impact on entry wages pointing to adverse effects of local competition. The positive localization effect tends to emerge outside large urban labour markets as the employment share of the local industry shows a weak negative correlation with density. In contrast, the number of establishments in the local industry increases with the size of the local labour market. Thus, we should rather construe the elasticity in column (4) as the upper bound of static urbanization effects.
In column (5), we control for differences between East and West Germany. As regards the elasticity with respect to labour market size, this constitutes a conservative approach because the employment density of East German regions tends to be relatively low. Moreover, we include additional controls and a spatial lag of the employment density. We still detect a highly significant, though clearly reduced wage effect of agglomeration. The advantages of large labour markets seem to be highly localized, as the coefficient of the spatial lag amounts to only half the local elasticity and does not significantly differ from zero at the 10% level. 13 Overall, our preferred estimate in column (5) can be understood as a lower bound of static urbanization effects.
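The spatial lag of density mentioned above is the average density of a region's neighbours when the contiguity matrix is row-standardized. The following is a minimal sketch with a hypothetical five-region contiguity structure and invented density values; the actual regional delineation and weights used by the authors are not reproduced here.

```python
import numpy as np

# Hypothetical contiguity structure for five regions: 1 = shared border.
adjacency = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

log_density = np.array([5.2, 4.1, 3.8, 4.6, 3.0])  # hypothetical log densities

# Row-standardize: each row sums to one, so the lag is a neighbour average.
# This assumes every region has at least one neighbour.
row_sums = adjacency.sum(axis=1, keepdims=True)
W = adjacency / row_sums

spatial_lag = W @ log_density
print(spatial_lag)
```

Region 0, for example, borders regions 1 and 2, so its spatial lag is the mean of their log densities.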
A drawback of the models in columns (3) to (5) is that the time-invariant region-specific effect is exclusively identified by new employment relationships that involve a change of the regional labour market, that is, migration, because worker fixed effects do not allow estimates of region fixed effects based on workers who are always observed in the same region. This can be a source of concern as migrants might not be representative of the broader worker population (e.g., Combes et al., 2011). To derive more general results, we use region-time effects as the dependent variable in column (7). Correspondingly, the number of observations in the second stage increases from 141 to 987. The impact of agglomeration is now estimated on the basis of both migrants and workers who experience a change in employment density without relocating (stayers). In our sample the share of stayers is 56% for which we observe 53% of the analyzed new employment relationships. This latter share varies between 29% and 73% across regions. Comparing the estimates in the columns (5) and (7) suggests that it is not essential whether the urbanization effect is identified on the basis of all workers or migrants only when wages of new employment relationships are considered.
Columns (6) and (8) provide results of instrumental variable estimation of models (5) and (7). We instrument for both the employment density and the spatial lag of density. Several tests in the lower panel confirm our arguments regarding instrument validity mentioned above, that is, they suggest that the instruments are relevant and uncorrelated with the error term (for a detailed discussion, see the supplemental data online). This applies in particular to model (8) that is based on the region-time effects. The Angrist-Pischke F-statistics of excluded instruments and the Kleibergen-Paap Wald test indicate that the partial correlation between instruments and endogenous regressors is sufficient to ensure unbiased estimates and relatively small standard errors. The Kleibergen-Paap F-statistic is above the thresholds proposed by Stock and Yogo (2005) for a maximum relative bias of 10% and 5%, respectively. 14 The Kleibergen-Paap LM test confirms the relevance of the instruments as we can reject the null that the model is under-identified at the 1% level. Finally, the results of the Sargan test suggest that we cannot reject the hypothesis that the instruments are exogenous. 15 The results of the instrumental variables (IV) regressions indicate that endogeneity due to reverse causality, omitted variables or measurement errors is unlikely to be a major problem in this setting. Comparing the ordinary least squares (OLS) and two-stage least squares (2SLS) estimates points to minor bias as the differences between the coefficients are small. 16 This applies to employment density as well as to the corresponding spatial lag and is in line with previous evidence presented by De la Roca and Puga (2017) and Combes et al. (2010). They conclude that endogeneity of region size is not a crucial issue when estimating the effects of density on individual wages. Moreover, the size of the estimated static agglomeration benefit using entry wages is of the same magnitude as estimates that use the wage at a specific date (compare review by Combes & Gobillon, 2015). 17

Table 1. Notes: a p < 0.1, *p < 0.05, **p < 0.01, ***p < 0.001. Bootstrap standard errors are shown in parentheses (clustered at the regional level, 500 replications). Models (7) and (8) also include time fixed effects. See Table A4 in Appendix A in the supplemental data online for first-stage results for ln(gross daily wage). F-test: Angrist-Pischke multivariate F-test of excluded instruments. Instruments: historic population density, spatial lag of the historic population density, information on soil characteristics from the European Soil Database. First-stage results of the IV estimations are summarized in Table A12 (online).

Table 2 summarizes the results for different groups of workers. Here we focus on the analysis of region fixed effects since, especially in the case of the long-term non-employed, some region-year fixed effects would be estimated on the basis of very few observations in the first stage. Columns (1) to (3) show estimates for job-to-job transitions that clearly confirm the findings displayed in Table 1. In model (1) we do not control for localization economies and human capital externalities on the first stage, while the elasticity in model (2) is net of these effects.
Again, we observe a downward bias of the urbanization effect in column (1) that can be traced back to omitted localization effects. The elasticity of wages with respect to density in column (2) is somewhat larger than the average effect identified for the entire sample of transitions (Table 1, column 5). Furthermore, the relevant spatial scale of the effects seems to be slightly more extensive for job-to-job transitions. The estimate of the spatial lag of the employment density indicates that the size of neighbouring labour markets also matters for entry wages associated with job-to-job transitions. In contrast, for the other groups, we do not find significant spillover effects (see columns 4-9).
Regarding transitions from short periods of non-employment, we obtain (almost) the same elasticity estimate as for job-to-job transitions. However, without specific agglomeration controls in the first stage the effect is not precisely estimated. For the long-term non-employed, in contrast, the regression results suggest that these workers do not benefit from static agglomeration advantages (see columns 7-9). 18 The results of the IV regressions generally confirm the OLS estimates.
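The logic of comparing OLS and 2SLS estimates can be illustrated with a small simulation. The sketch below uses purely synthetic data with a single hypothetical instrument and an endogeneity problem that is built in on purpose, so the gap between the two estimators is large by construction. It is not a reproduction of the estimations reported here. Where the two estimates are close, as in the results above, this points to little endogeneity bias.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, X, Z):
    """2SLS: project the regressors on the instruments, then regress y on the fitted values."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Synthetic data (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                 # instrument, unrelated to the confounder
u = rng.normal(size=n)                 # unobserved confounder
x = z + u + rng.normal(size=n)         # endogenous regressor
y = 0.5 * x + u + rng.normal(size=n)   # true coefficient on x is 0.5

const = np.ones(n)
X = np.column_stack([const, x])
Z = np.column_stack([const, z])

b_ols = ols(y, X)[1]                       # biased upwards by the confounder
b_iv = two_stage_least_squares(y, X, Z)[1]  # close to the true coefficient
```

Point estimates only are computed here; valid 2SLS standard errors require a separate calculation and are not obtained from the second-step regression residuals.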
The differences between transition types might be interpreted as pointing to differentiated matching advantages as discussed in the second section. The long-term unemployed likely suffer from a significant deterioration of their specific skills and, therefore, do not achieve a better match in urban labour markets compared with similar job seekers in rural regions. In addition, the results are consistent with the idea that these workers are not able to take advantage of referrals from current employees and (former) co-workers because they are at the margin of the labour market. Brown et al. (2015) show that referred candidates are predominantly alike friends of employed workers, suggesting that long-term unemployed persons benefit from referrals only on rare occasions. Referred candidates are more likely to be hired, and referred workers who are hired experience an initial wage advantage relative to non-referred workers. Dustmann, Glitz, Schönberg, and Brücker (2016) provide similar evidence on the importance of referral-based job search networks in Germany. As proximity likely impacts interactions in these social networks, referrals might be understood as one channel of static matching benefits. In fact, Dustmann et al. (2016) investigate search networks in a few metropolitan labour markets in Germany.
So far, our discussion of the regression results has focused on the elasticity of wages with respect to labour market density and on static matching benefits. However, the first-stage regressions also provide evidence of other mechanisms that generate agglomeration economies (see Table A4 in the supplemental data online). As regards learning externalities in large labour markets, we detect significant effects for all workers. Interestingly and in contrast to expectations (see the second section), we estimate the strongest impact for transitions after longer periods of non-employment. Thus, while this group does not benefit from static urbanization effects, they can take advantage of learning from working in large cities. Each additional year of work experience in large labour markets increases wages by 1.6% after long-term non-employment, while the effect amounts to 0.5% for transitions after short-term non-employment. The results suggest that human capital that is accumulated via working in large cities might be rather general, that is, it can be transferred between firms and is less affected by depreciation during (long) periods of unemployment than specific skills. Furthermore, the results might reflect a higher importance of such general skills (relative to specific skills) for the long-term unemployed than for workers with a short period of non-employment.
Moreover, there is evidence of important localization economies that impact entry wages. However, only for workers with job-to-job transitions and those with short periods of non-employment do we observe a significant positive effect of the employment share of the local industry. This result is in line with evidence provided by Figueiredo, Guimarães, and Woodward (2014) that the quality of a firm-worker match tends to increase with firm clustering within the same industry. However, the size of the corresponding effect is moderate.
Our results also suggest that specialization per se is not beneficial. While the impact of the employment share on wages is positive, we detect a negative correlation between entry wages and the number of establishments in the local industry, possibly pointing to adverse competition effects as discussed, for example, by Combes (2000). Finally, the regression analysis points to significant knowledge spillovers and complementarities. The share of high-skilled workers in the firm and the local industry tends to increase the wages associated with some transitions to employment. In contrast, evidence on an important impact of industrial diversity of the local economy on wages of new employment relationships is rather weak.
Note: a p < 0.1, *p < 0.05, **p < 0.01, ***p < 0.001. Bootstrap standard errors are shown in parentheses (500 replications). See Table A4 in Appendix A in the supplemental data online for first-stage results for ln(gross daily wage), and
With respect to the interpretation that the estimates and in particular the differences across transition types possibly point to static matching benefits, there are some caveats. We try to control for other static and dynamic effects of agglomeration by considering the wage of newly established employment relationships and by including control variables. However, we cannot rule out that our estimate of the static urbanization effects not only includes the impact of matching but also other mechanisms related to agglomeration. For instance, the productivity effect of sharing a suitable infrastructure endowment likely shows up immediately after the establishment of an employment relationship. We might somehow capture the impact of a specialized infrastructure (localization economies) by including the local size of the industries in the first-stage regression. In contrast, the influence of general infrastructure facilities cannot be differentiated from the static matching effects in this analysis.
The same caveat applies to the local monopsony power of firms. Combes and Gobillon (2015) note that regional wage differences might, to some extent, reflect spatial variation in the degree of competition in local labour markets. If the monopsony power of firms decreases with the size of the local market, higher wages observed in dense urban regions might be partly caused by relatively high competitive pressure in these regions. However, the importance of monopsony effects should decline with increasing labour mobility, that is, workers should, ceteris paribus, move to locations characterized by relatively little monopsony power. Moreover, the relocation of firms is also relevant in this context. Firms might move to regions that offer higher mark-ups of productivity over wages. Thus, firm and worker mobility should decrease the differences in monopsony power and the importance of corresponding wage disparities across regions.
Finally, the pre-employment status might be just a proxy for the ability of workers and, thus, the detected effect heterogeneity might be due to skill differences rather than disparities in pre-employment status. However, we control for observed qualification and unobserved ability in the regression model. Moreover, the correlation between qualification and pre-employment status is rather moderate (see Table A8 in Appendix A in the supplemental data online) and evidence for important differences in elasticities across skill levels is weak for entry wages in Germany (see Table A9 online). Complementary regression results indicate, furthermore, that significant differences in the size of the density effect across pre-employment status are also visible if we simultaneously allow for heterogeneous effects with respect to qualification in a one-stage regression. F-tests on equality of the effect of employment density show that there are significant differences between job-to-job transitions and the other types of transitions in all skill groups (see Table A10 online). However, there is no robust evidence on differences between transitions from short- and long-term unemployment.

CONCLUSIONS
In this paper we investigate the importance of static urbanization effects for regional wage disparities. In contrast to previous studies, we focus on the impact on wages in newly established employment relationships and consider differences with respect to pre-employment status. Descriptive evidence shows that the gap in entry wages of former long-term unemployed and other workers increases with the employment density of local labour markets. One possible explanation is that workers derive heterogeneous benefits from urbanization depending on their pre-employment status.
Altogether, the regression analyses provide robust evidence of a positive effect of employment density on entry wages. Our results are in line with previous findings that the advantage of working in a large urban labour market includes various components such as static urbanization effects that materialize instantaneously and might be caused by matching advantages and localization effects as well as dynamic agglomeration economies, for example, due to learning externalities.
With regard to the static urbanization effects, our preferred estimates indicate, however, that we should not overstate their size (cf. Baum-Snow & Pavan, 2012; Glaeser & Maré, 2001). The results suggest that doubling the employment density increases the entry wage by 1.0–2.6%. Hence, the effect is of similar magnitude to those identified in comparable studies which also consider worker fixed effects and apply instrumental variable estimation techniques (Combes & Gobillon, 2015).
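The link between a doubling effect and the underlying elasticity can be made explicit. Assuming a log-log specification, in which the coefficient on log employment density is an elasticity, doubling density changes the wage by 2^β − 1. The following sketch backs out the elasticities implied by the reported range of 1.0–2.6%; the function names are illustrative and the implied values are derived here rather than quoted from the tables.

```python
import math

def doubling_effect(elasticity):
    """Proportional wage change when density doubles, given a log-log elasticity."""
    return 2.0 ** elasticity - 1.0

def implied_elasticity(effect):
    """Elasticity implied by a proportional wage change from doubling density."""
    return math.log2(1.0 + effect)

# Reported range: doubling density raises entry wages by 1.0% to 2.6%.
low = implied_elasticity(0.010)   # roughly 0.014
high = implied_elasticity(0.026)  # roughly 0.037
```

Under this assumption, the reported range corresponds to density elasticities of roughly 0.014 to 0.037.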
Moreover, the differences across types of transitions to employment show that not all workers benefit from static urbanization advantages. While we detect significant wage effects for job-to-job transitions and after short periods of unemployment, workers do not seem to receive a higher entry wage in a large than in a small labour market after a long spell of non-employment. This result is in line with theoretical arguments proposed by Kim (1989, 1990) that the advantages of a large urban labour market materialize when specialized workers are matched with the heterogeneous skill requirements of firms. The depreciation of human capital after an extensive period of non-employment might inhibit matching benefits. Likewise, there is no indication of important human capital externalities or localization economies for this group of workers. So in this respect, agglomeration economies seem to explain part of the heterogeneous disparities in labour market outcomes between former long-term unemployed and other workers across space. However, this does not imply that agglomeration economies do not matter at all after a significant career interruption. In fact, we find that important
remarks and suggestions on an earlier version of the paper. Furthermore, they benefited from stimulating discussions with seminar participants at the Institute for Employment Research, Kiel University, the 18th Uddevalla Symposium, the 8th Summer Conference in Regional

DISCLOSURE STATEMENT
No potential conflict of interest was reported by the authors.

NOTES
1. For Germany, Lehmer and Möller (2010) find that only dynamic effects seem to matter after firm size and individual fixed effects are taken into account. They distinguish two types of labour markets (urban and rural) when analyzing changes in the wages of workers who move between regions (and firms).
2. Firm units located in different municipalities are considered independent establishments. Unfortunately, it is not possible to identify whether different establishments belong to the same firm.
3. The delineation of these regions is based on commuter flows. See Kosfeld and Werner (2012) for a detailed description.
4. Short-term unemployment is defined to last up to 12 months, while long-term non-employment exceeds a period of 12 months. The latter group is likely the most heterogeneous because it encompasses long-term unemployed workers and those who have been inactive for at least one year, for example, due to parental or medical leave. However, a new employment relationship after non-employment since the year 2000 or even longer is not considered in the analysis; likewise, the first employment spell in a person's working life.
5. We apply a two-stage approach because the computation of standard errors poses a problem in the common one-stage regression model. Owing to a significant number of mobile workers in our data set, the covariance matrix has a complex structure. See Combes et al. (2008) and the supplemental data online for a more thorough discussion of this issue.
6. We control for skill-specific unemployment rates because there is an extensive literature on the wage curve suggesting a robust negative relationship between wages and unemployment (Blanchflower & Oswald, 1990). Baltagi, Blien, and Wolf (2009) provide corresponding evidence for Germany.
7. See Table A1 in Appendix A in the supplemental data online for a detailed description of the variables.
8. This feature contrasts with most previous studies that use information on employment at a reference date.
9. We gratefully thank Malte Reichelt for providing the information from the European Regional Soil Database at the level of the labour market regions.
10. See the supplemental data online for a more detailed description. Moreover, Combes et al. (2010) provide a thorough discussion of the validity of the applied instruments.
11. Robust and clustered standard errors are of similar size. This also applies to the non-parametric covariance matrix estimator introduced by Driscoll and Kraay (1998). The first-stage estimates of different specifications are summarized in Table A4 in Appendix A in the supplemental data online.
12. Corresponding second-stage results are available from the authors upon request.
13. We restrict spillover effects to neighbouring labour markets that share a border. It is noteworthy that this fairly simple model has considerable explanatory power as indicated by the adjusted R².
14. With two endogenous regressors and 38 excluded instruments, the critical values are 20.97 for a maximum bias of 5% of the instrumental variables (IV) estimator relative to the ordinary least squares (OLS) and 11.02 for a maximum bias of 10%.
15. The results of the Sargan test are displayed because we use bootstrapped standard errors. They are confirmed by corresponding Hansen tests if we apply robust standard errors. These results are available from the authors upon request.
16. In order to check whether the results are sensitive to weak instruments, we also apply limited information maximum likelihood (LIML) estimation (see Table A5 in Appendix A in the supplemental data online). Table A6 (online), in addition, summarizes the results where we apply several combinations of historical population density and soil characteristics as instruments in order to see how they affect the pivotal coefficients as suggested by Combes et al. (2010). The parameter estimates are rather stable across the different specifications and estimation techniques, though in some specifications they are less precisely estimated.
17. Additional results based on quantile regressions (Koenker & Bassett, 1978) indicate, furthermore, that there are no important differences in the size of the static agglomeration economies along the wage distribution (see Figure A3 in Appendix A in the supplemental data online).
18. It is interesting to see that the East German wage gap also differs across transition groups. The disadvantage of accepting a job in East Germany deepens as the length of the spell of non-employment increases. The results of a one-stage regression that allows for heterogeneous effects of density on entry wages suggest that the differences in estimated elasticities between long-term non-employed and the other two groups are not only economically meaningful, as indicated in Table 2, but also statistically significant (see Table A7 in Appendix A in the supplemental data online). However, as stressed in the fourth section ('empirical strategy'), the one-stage regression suffers from the fact that it is not possible to cluster standard errors appropriately. This might affect the conclusions with regard to the statistical significance of differences across transition groups.