Trends in Earnings Volatility Using Linked Administrative and Survey Data

Abstract We document trends in earnings volatility separately by gender using unique linked survey data from the CPS ASEC and Social Security earnings records for the tax years spanning 1995–2015. The exact data link permits us to focus on differences in measured volatility from earnings nonresponse, survey attrition, and measurement between survey and administrative earnings data reports, while holding constant the sampling frame. Our results for both men and women suggest that the level and trend in volatility are similar in the survey and administrative data, showing substantial business-cycle sensitivity among men but no overall trend among continuous workers, while women demonstrate no change in earnings volatility over the business cycle but a declining trend. A substantive difference emerges with the inclusion of imputed earnings among survey nonrespondents, suggesting that users of the ASEC should drop earnings nonrespondents.


Introduction
Understanding the level and trend of earnings volatility is important both in its own right, and because of its potential contribution to rising inequality (Gottschalk and Moffitt 2009). Much of what we know about volatility in the United States has come from survey data, which is generally advantageous because it offers a broad collection of variables, a long time series, population representativeness, and widespread availability to the research community. However, survey data suffers from data quality issues such as nonresponse and measurement error, the latter of which may include response error or survey reporting policy such as topcoding (Mellow and Sider 1983; Lillard, Smith, and Welch 1986; Bollinger 1998; Bound, Brown, and Mathiowetz 2001; Roemer 2002; Hirsch and Schumacher 2004; Meijer, Rohwedder, and Wansbeek 2012; Bollinger et al. 2019). More recently, some scholars have turned to administrative data to examine volatility on the belief that it avoids some of the pitfalls of surveys (Sabelhaus and Song 2010; Bloom et al. 2018; Carr and Wiemers 2018). However, the assumption that administrative data serve as a so-called gold standard has been challenged by some (Kapteyn and Ypma 2007; Abowd and Stinson 2013), and the populations covered between the survey and administrative samples are often quite different. Indeed, as discussed in the accompanying Overview paper in this volume, the current literature has reached differing conclusions on the trend in earnings volatility in comparing survey-alone to administrative-alone estimates. It is difficult to know how much of the difference in trends is due to measurement between survey and administrative reports, as opposed to differences in samples.
In this article we offer new estimates and a direct comparison of volatility trends in survey and administrative data by using restricted-access survey data from the Current Population Survey Annual Social and Economic Supplement (CPS ASEC) linked to the same individuals in the Social Security Administration's Detailed Earnings Record (SSA DER) for the period spanning calendar years 1995-2015. The ASEC is a large, nationally representative survey that serves as the source of official statistics on poverty and inequality, and is the workhorse dataset for research on earnings determinants. The DER reflects earnings reports provided by employers and the self-employed for purposes of payroll taxation and eligibility for Social Security retirement and disability programs. While the ASEC is used primarily for repeated cross-sectional analyses, its rotating survey design permits matching a subsample of respondents from one year to the next, and thus can be used to construct simple measures of volatility as utilized in a number of prior studies (Gittleman and Joyce 1996; Cameron and Tracy 1998; Ziliak, Hardy, and Bollinger 2011; Celik et al. 2012; Koo 2016). The consensus of the ASEC-based papers was a strong increase in male earnings volatility in the 1970s, peaking in the 1980s, and stabilizing at that higher level thereafter until the Great Recession. The few papers on women find a very different pattern of a trend decline in volatility since the 1970s. The key advance of this paper over the prior ASEC literature is our exact link to the DER, permitting us to focus on differences in volatility trends emanating from measurement between survey and administrative data while holding constant any differences due to sample frames.
We begin with a baseline sample of men and women who report positive earnings in each of two consecutive years in the ASEC and who have a valid link to the DER in both years. This sample offers the most direct comparison of survey and administrative estimates of volatility. We then sequentially relax a number of assumptions from the baseline sample. First, we test whether volatility estimates differ in survey and administrative data with the inclusion of zero earnings, which could emanate because people are true nonworkers and thus have no earnings in either the ASEC or DER, or because some report zero earnings to the tax authorities but report positive values to the survey representative, or vice versa (Ziliak, Hardy, and Bollinger 2011; Koo 2016). Next, because imputation of missing earnings reports in the ASEC is high and has been rising over time, and previous work has shown that inclusion of such imputations can lead to significant bias in estimates of earnings inequality and regression coefficients (Hirsch and Schumacher 2004; Hokayem, Bollinger, and Ziliak 2015; Bollinger et al. 2019), we relax the requirement that ASEC participants report their earnings on the survey. The third measurement test we conduct is whether requiring the administrative data link leads to a nonrepresentative sample of the underlying population and thus possible biased estimates of volatility. Finally, because the ASEC does not follow movers from one wave to the next, we test for potential attrition bias in our ASEC volatility estimates. The supplementary materials contain further robustness checks beyond those reported herein.
Our results for both men and women show that the level and trend in volatility are similar in the ASEC and DER, suggesting no bias from use of survey reports for earnings volatility research in the ASEC. Qualitatively, we find substantial business-cycle sensitivity among men, especially during the Great Recession, but no cyclical response among women. This corroborates the prior work in Ziliak, Hardy, and Bollinger (2011) and Koo (2016), but with the longer sample period we find that the increase in male earnings volatility in the Great Recession was temporary and the level fully returned to that from two decades prior, while women continue their secular decline in earnings volatility. We do find rising earnings volatility among men when we include persons with zero earnings, but again no substantive difference between survey and administrative data. Moreover, the volatility levels of attriters exceed those of nonattriters, but there are no differences in trends. The one area where survey volatility estimates depart from the administrative data is when we include earnings imputations in the ASEC. This results in substantially higher levels of volatility, and an increasing trend among men, adding further evidence on the need to drop imputed values from earnings research in the ASEC.

Measuring Volatility
We adopt a summary measure of earnings volatility as the variance of the arc percent change, defined as

$$\text{Volatility} = \operatorname{Var}\left(\frac{y_{it} - y_{it-1}}{\bar{y}_i}\right), \qquad (1)$$

where $\bar{y}_i$ is the average (absolute value) earnings across adjacent years, $\bar{y}_i = \frac{y_{it} + y_{it-1}}{2}$ (Ziliak, Hardy, and Bollinger 2011; Dynan, Elmendorf, and Sichel 2012; Koo 2016). Because earnings volatility can be affected by life-cycle factors (Gottschalk et al. 1994), we first regress the arc percent change on a quadratic in age year-by-year and then use the estimated residuals in Equation (1) prior to constructing the variance.
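The two steps described above (arc percent change, then age adjustment before taking the variance) can be sketched as follows. This is a minimal illustration, not the authors' Stata code: the function names, the use of the population variance, the centering of age, and the example records are all assumptions made here, and survey weights are ignored.

```python
from statistics import pvariance


def arc_percent_change(y_prev, y_curr):
    """Change in earnings divided by mean absolute earnings of the two years."""
    mean_abs = (abs(y_prev) + abs(y_curr)) / 2.0
    if mean_abs == 0:
        raise ValueError("arc percent change is undefined when both years are zero")
    return (y_curr - y_prev) / mean_abs


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def age_adjusted_residuals(changes, ages):
    """Residuals from an OLS regression of the changes on a quadratic in age."""
    mean_age = sum(ages) / len(ages)  # centering leaves the residuals unchanged
    X = [[1.0, a - mean_age, (a - mean_age) ** 2] for a in ages]
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(X, changes)) for i in range(3)]
    beta = solve_linear(xtx, xty)
    return [y - sum(b * x for b, x in zip(beta, r)) for r, y in zip(X, changes)]


def volatility(records):
    """records: (age, earnings in t-1, earnings in t) for one pair of years."""
    changes = [arc_percent_change(prev, curr) for _, prev, curr in records]
    ages = [age for age, _, _ in records]
    return pvariance(age_adjusted_residuals(changes, ages))


# Hypothetical records for a single year pair (assumed values, not real data).
records = [(25, 30000, 33000), (35, 50000, 45000), (45, 60000, 60000),
           (55, 40000, 0), (30, 20000, 26000)]
vol = volatility(records)
```

Running the regression separately for each pair of years, as the text describes, amounts to calling `volatility` once per year pair.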
The advantages of the arc percent measure are twofold. First, it is bounded between ± 200%, easing interpretation. Second, the arc percent change can also be calculated if one of the earnings observations is zero, which is not possible using other common volatility measures such as the variance of the change in log earnings (Shin and Solon 2011; Moffitt and Zhang 2018). The latter restriction could be important because as highlighted recently in Blundell et al. (2018) and Abraham and Kearney (2020), employment rates have declined for men for the past 40 years, especially for low skilled males, while employment for women has declined since the peak in the late 1990s. Hence, a larger proportion of earners will have zero earnings in some years, and removing these earners likely understates true earnings volatility levels. Whether movements in and out of the labor force contribute to trends in volatility depends on whether those transitions are trending upward. Moreover, a loose attachment to the labor force may lead to misreporting of earnings in survey data, or may lead to missing earnings from uncovered or informal labor markets. Both of these factors could contribute to differences in the earnings volatility measures between survey and administrative data. However, as we demonstrate in the supplementary materials, using the difference in logs yields estimates similar to those from the arc percent change measure once we omit those with zero or negative earnings from the arc percent.
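A short derivation, not taken from the original text, may help explain both the bound and the closeness to the log difference. It assumes nonnegative earnings with at least one positive value, and writes $a_{it}$ (a symbol introduced here for illustration) for the arc percent change inside Equation (1).

```latex
% Bound: with y_{it}, y_{it-1} >= 0 and not both zero
|a_{it}| = \frac{|y_{it} - y_{it-1}|}{(y_{it} + y_{it-1})/2}
         \le \frac{y_{it} + y_{it-1}}{(y_{it} + y_{it-1})/2} = 2,
% with equality exactly when one of the two earnings values is zero.

% Relation to the log difference when both years are positive,
% with r = y_{it} / y_{it-1}:
a_{it} = \frac{2(r - 1)}{r + 1}, \qquad
\ln r = 2\,\operatorname{artanh}\!\left(\frac{a_{it}}{2}\right)
      = a_{it} + \frac{a_{it}^{3}}{12} + O\!\left(a_{it}^{5}\right).
```

For moderate earnings changes the cubic term is small, so the two measures nearly coincide; they diverge as one of the two earnings values approaches zero, where the log difference is unbounded and the arc percent change approaches ±2.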
The data used in estimating Equation (1) are restricted-access ASEC person records linked to the DER for survey years 1996-2016 (reporting earnings for tax years 1995-2015). Our sample consists of men and women between the ages of 25 and 59, excluding those who are full-time students in any year or who have their entire ASEC supplement allocated. Some individuals respond to the monthly core of the CPS, but are unwilling or unable to provide a response to the ASEC supplement. For these cases, Census uses a sequential hot-deck procedure to replace the individual's entire ASEC supplement with a donor's supplement (called a whole imputation). During our sample period, roughly 12% of individuals had their entire ASEC imputed and so we drop these individuals, though we explicitly adjust for this in some of our analyses below. Following the practice of the other volatility papers in this volume, we trim the top and bottom 1% of the real annual cross-sectional ASEC and DER earnings distributions prior to estimating the age-adjusted arc percent change. In the online supplement, we also present baseline estimates with a 5% trim, without trimming as advocated in Bollinger and Chandra (2005), and using volatility not adjusted for age, and find that none of these alternatives affects our estimated trends in volatility. That supplement also provides additional details on the ASEC-DER linkage process, and how we construct the two-year panels.
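The trimming step can be illustrated with a minimal sketch. It assumes that trimming means dropping observations outside the 1st and 99th percentile cut points of one year's cross-section; the function names, the inclusive interpolation rule for percentiles, and the example values are illustrative choices made here, not details from the paper.

```python
from statistics import quantiles


def trim_tails(earnings, lower_pct=1, upper_pct=99):
    """Drop values below the lower or above the upper percentile cut point."""
    cuts = quantiles(earnings, n=100, method="inclusive")  # 99 cut points
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [y for y in earnings if low <= y <= high]


def trim_by_year(earnings_by_year, lower_pct=1, upper_pct=99):
    """Apply the trim separately to each annual cross-section."""
    return {year: trim_tails(values, lower_pct, upper_pct)
            for year, values in earnings_by_year.items()}


# Hypothetical cross-section: 1,000 distinct real earnings values.
example = list(range(1, 1001))
kept_1pct = trim_tails(example)          # baseline 1% trim
kept_5pct = trim_tails(example, 5, 95)   # the 5% alternative
```

In the linked setting the same function would be applied once to the ASEC earnings and once to the DER earnings for each year.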

Results
The baseline sample consists of those men and women who have positive earnings in both years and in both the ASEC and DER, are respondents to the ASEC earnings questions and thus do not have imputed earnings, and have a link to the DER. We refer to this group as the linked respondent sample. The linked respondent sample is intentionally restrictive because we wish to conduct a direct comparison of volatility estimates from survey data against administrative data with a sample and measure as similar as possible. We then broaden the sample in stages. First, we expand the sample to include those who have zero earnings in one of the two years. Second, we include those who did not respond to the ASEC earnings questions in one or both years and thus have imputed earnings, but still require the DER linkage in both years. Third, we expand the sample further by including those who did not have a link. Finally, we include those who were missing from the ASEC in the second year because of attrition and examine DER volatility in that sample. The online supplement contains summary statistics for the baseline sample as well as for the other samples used in the analysis.

Figure 1 presents the baseline series of earnings volatility, with men on the left panel and women on the right panel. In addition to the first year and last year depicted on the x-axis of each panel, we also highlight the recessionary year of 2001 and the Great Recession years of 2007-2009. There is a notable uptick in male earnings volatility in the years surrounding recessions, especially the Great Recession, but there was a return to prerecession levels in the subsequent recovery. Thus, male earnings volatility among continuous workers over the last two decades is largely a business-cycle effect with no trend increase or decrease.
Moreover, while there is a somewhat heightened cyclical sensitivity in the DER compared to the ASEC, there is no substantive discrepancy between the survey and administrative data in the overall level and trend.
The right panel of Figure 1 shows that women's earnings volatility differs from men's in the level, trend, and cyclicality. Women have higher levels of earnings volatility in each corresponding year compared to men, but because there is a trend decline, women's volatility is converging toward those levels found among men. The other important contrast with men is the lack of business-cycle induced volatility of women's earnings. Importantly, though, similar to men we find no substantive difference in women's earnings volatility whether we measure it in the ASEC or the DER.

Volatility with Zero Earnings
One of the aims of this research is to capture a broad measure of volatility in the labor market, including the impact of movements in and out of employment across years. The arc percent measure of volatility accommodates zero earnings in one of the two years, and as such our first robustness check on the baseline volatility estimates in Figure 1 is to relax the requirement of positive earnings in both years. The ASEC records zero earnings based on self-reports, but if the person does not work in a given year they do not receive a W-2 or 1099 tax form and do not show up in the DER. Thus, for those persons who have a link to the DER in one year, but are missing the DER in the year before or after, we set that missing DER value to zero prior to constructing the arc percent volatility. This treats the ASEC and DER symmetrically. Figure 2 repeats the analysis of Figure 1 but now includes those periods with zero earnings. There are several notable differences. First, for both men and women the level of volatility in any given year is at least double that in Figure 1 with zeros excluded. Second, the cyclical sensitivity of male earnings volatility is much more pronounced, especially in the years surrounding the Great Recession, and there is now some evidence of cyclicality in women's volatility. Third, male earnings volatility is trending upward when we include zero earnings, increasing about 20% over the sample period, which suggests that requiring positive earnings in both years yields a selected sample, at least for volatility measurement. Despite these differences, similar to Figure 1 we find that inclusion of zeros in the ASEC yields the same outcome as in the DER, suggesting no discernible distinction in earnings volatility for men and women in survey and administrative reports even with the inclusion of zeros.
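The treatment of a one-sided missing DER record can be sketched as below. This is an illustration under stated assumptions: `None` stands for "no DER record that year", pairs with no record in either year are treated as unlinked and skipped, and the function names and example values are hypothetical.

```python
def arc_percent_change(y_prev, y_curr):
    """Arc percent change; returns None when both years are zero."""
    mean_abs = (abs(y_prev) + abs(y_curr)) / 2.0
    return None if mean_abs == 0 else (y_curr - y_prev) / mean_abs


def fill_missing_der(der_prev, der_curr):
    """Set a missing DER value to zero when the other year has a record."""
    if der_prev is None and der_curr is None:
        return None  # no DER record in either year: not part of this sample
    return (0.0 if der_prev is None else der_prev,
            0.0 if der_curr is None else der_curr)


def der_changes_with_zeros(pairs):
    """Arc percent changes for DER pairs after zero-filling missing years."""
    changes = []
    for der_prev, der_curr in pairs:
        filled = fill_missing_der(der_prev, der_curr)
        if filled is None:
            continue
        change = arc_percent_change(*filled)
        if change is not None:
            changes.append(change)
    return changes


# Hypothetical DER pairs: exit from work, entry into work, unlinked, continuous.
example_pairs = [(40000.0, None), (None, 25000.0), (None, None), (30000.0, 33000.0)]
example_changes = der_changes_with_zeros(example_pairs)
```

Exits and entries land at the bounds of -2 and 2, which is why including zeros raises the measured level of volatility so sharply.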
Notably, an earnings report of zero in the ASEC could be from nonwork, or it could be from misreporting by the respondent. That is, they could self-report zero earnings in the ASEC, but the firm could submit a positive earnings W-2 that is included in the DER. There are reports of zero earnings in the DER, although this is rare, and likely reflects misreports on the part of the firm or self-employed worker. But it is possible that a worker could report earnings to the Census surveyor and not have those earnings reported to the tax authorities by the firm or self (for those self-employed). In the supplementary materials we expand the sample from Figure 2 by replacing reports of zero earnings in the ASEC with the positive values from the DER, and we replace missing DER values with earnings values from the ASEC. This change places the respective male and female earnings volatility series in between those found in Figures 1 and 2, but again we obtain qualitatively similar conclusions in both the ASEC and DER.
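The cross-replacement rule described for the supplementary analysis can be written as a small function. This is a sketch of the rule as stated in the text, with `None` again standing for a missing DER record; the function name and the example values are hypothetical.

```python
def cross_fill(asec, der):
    """Return (ASEC, DER) earnings for one person-year after cross-replacement.

    asec: self-reported earnings, where 0 means the respondent reported none.
    der:  administrative earnings, or None when there is no DER record.
    """
    # A zero ASEC report is replaced by a positive DER value when one exists.
    asec_filled = der if (asec == 0 and der is not None and der > 0) else asec
    # A missing DER value is replaced by the ASEC report.
    der_filled = asec if der is None else der
    return asec_filled, der_filled


# Hypothetical person-years.
filled_examples = [cross_fill(0, 12000), cross_fill(8000, None),
                   cross_fill(0, None), cross_fill(5000, 5200)]
```

Only person-years in which both sources agree on nonwork remain at zero, which fits with the resulting series lying between those in Figures 1 and 2.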

The Role of Nonresponse and Nonlink on Volatility
The ASEC sample is much broader than the linked respondent sample, and thus in this section we expand our analysis to a sample of individuals who may be an ASEC earnings nonrespondent in one or both years (and thus have earnings imputed) or who may not have a link to the DER in either or both years (but like Figure 1 we require positive earnings in both years). Similar to the whole imputes discussed above, Census also uses a sequential hot-deck procedure to impute earnings for individuals who otherwise responded to the ASEC, but did not provide a response to the earnings questions. The key assumption in the hot-deck procedure is missing at random (MAR). Bollinger et al. (2019) show that the economic consequences of the MAR assumption for earnings levels is primarily in the tails of the distribution, and here we extend that earlier analysis to earnings volatility.
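For readers unfamiliar with hot-deck imputation, the following is a generic sketch of a sequential hot deck, not the Census Bureau's actual procedure. The cell definition, the file order, the field names, and the example records are all assumptions made for illustration.

```python
def sequential_hot_deck(records, cell_key):
    """Fill missing earnings with the most recent reported value in the same cell.

    records:  list of dicts with an 'earnings' field (None when missing),
              processed in file order.
    cell_key: function mapping a record to its imputation cell.
    """
    last_donor = {}
    completed = []
    for rec in records:
        cell = cell_key(rec)
        filled = dict(rec, imputed=False)
        if rec["earnings"] is None:
            if cell in last_donor:
                filled["earnings"] = last_donor[cell]
                filled["imputed"] = True
        else:
            last_donor[cell] = rec["earnings"]  # this record becomes the donor
        completed.append(filled)
    return completed


# Hypothetical file: cells defined by sex and a broad education group.
people = [
    {"sex": "F", "educ": "college", "earnings": 52000},
    {"sex": "M", "educ": "hs", "earnings": 31000},
    {"sex": "F", "educ": "college", "earnings": None},
    {"sex": "M", "educ": "hs", "earnings": None},
]
completed = sequential_hot_deck(people, cell_key=lambda r: (r["sex"], r["educ"]))
```

Because the donor is matched on cell characteristics rather than on the recipient's own earnings in the adjacent year, an imputed value can differ sharply from that person's earnings in the other year. That is one plausible channel, offered here as an interpretation, through which imputations could raise measured volatility.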
In Figure 3 we estimate the effect of response and link status on earnings volatility of men in the top frame and women in the bottom frame. For each gender, the leftmost panel consists of the full ASEC, including both respondents and nonrespondents to the ASEC earnings questions and both those linked and those not linked to the DER. The ASEC and DER samples in the panel are not the same, because the ASEC lines include both linked and unlinked DER individuals, and the DER lines include those individuals who were linked in at least one year. In the middle panel we restrict the sample to two-year respondents regardless of whether they have a DER link (the ASEC and DER samples are therefore again not the same), while in the rightmost panel we impose the requirement that sample members be linked to the DER both years, but still include earnings respondents and nonrespondents. The figure makes clear that compared to Figure 1 including nonrespondents has a substantive effect on the level and trends of earnings volatility for both men and women in the ASEC. Volatility levels are double with nonrespondents included, and for men it results in an upward trend in volatility and for women no trend, which is distinct from the results in Figure 1 where men had no trend in volatility (see middle panel of Figure 3) and women had a negative trend. The Census hot-deck method imparts bias much like Hirsch and Schumacher (2004) and Bollinger et al. (2019) show for wage levels, but because the imputation procedure has not changed since the late 1980s, the trend increase reflects the higher share of workers with imputed earnings. Failing to link to the DER has no effect on ASEC volatility.

Sample Attrition and Volatility
A possible concern with the matched ASEC is that sample attrition affects our earnings series. Moves are more likely among low-income families whose earnings are more volatile, which means we could understate the level and trends in volatility with our sample. Under the assumption that the probability of attrition is unobserved and time invariant (i.e., a fixed effect), or trending very slowly over time, first differencing earnings as used in the volatility measures based on log-differences will remove the latent probability of attrition and purge estimates of possible attrition bias (Wooldridge 2001). However, because the arc percent includes mean earnings in the denominator, potential attrition bias could remain in the estimates. A conservative interpretation is that data from the matched ASEC provide estimates of earnings volatility among the population of nonmovers.
To examine the potential role of attrition on volatility, we expand our dataset to include not only those matched across years in the ASEC, but also those individuals observed in year 1 of the ASEC but not year 2. The online supplement reports the year 1 socioeconomic characteristics of attriters and nonattriters, showing that attriters are younger, more likely to be a member of a minority racial group, have fewer years of school, are less likely to be married (though with a higher percentage of married but with spouse absent), work fewer weeks and hours per week, have lower earnings in both the ASEC and DER, and have higher rates of earnings (item) nonresponse. These patterns hold for both men and women, and suggest that volatility is likely to differ between attriters and nonattriters.
Because we have DER reports for both ASEC attriters and nonattriters, in Figure 4 we depict the volatility series for each group of men and women. The figure makes abundantly clear that volatility among attriters is substantively elevated compared to nonattriters, but the trends are similar: stable for men and declining for women. This suggests that volatility levels among ASEC stayers are too low, consistent with the results reported by Fitzgerald, Gottschalk, and Moffitt (1998) for the PSID, but the trends are unaffected by attrition.
One potential solution to address sample attrition in the ASEC is to reweight the data using inverse probability weighting (IPW). IPW is a general solution to attrition and nonresponse when the data are missing at random (Wooldridge 2007). Although there is evidence that the MAR assumption is violated in earnings levels (Bollinger et al. 2019), this does not mean it is violated for higher moments, though it is beyond the scope of this paper to formally test the MAR assumption. We proceed by estimating probit models each survey year of the probability that the person is (i) not a whole impute, (ii) is linked to the DER, (iii) is an earnings respondent, and (iv) is matched across ASEC waves as a function of a rich set of socioeconomic characteristics in both levels and interactions. We then divide the ASEC supplement weight by the fitted probability of response + link + match and estimate the IPW volatility series. The results of reweighting the ASEC are reported in Figure 5, along with original series from Figure 1. The figure shows that reweighting the ASEC does result in a higher level of volatility in each year, but likely does not fully adjust given the wide divergence between attriters and nonattriters in the DER shown in Figure 4. However, it is important to once again emphasize that attrition does not affect volatility trends of men and women.
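The reweighting step can be illustrated with a simplified sketch. As a loudly labeled substitution, weighted retention rates within cells stand in for the fitted probabilities from the year-by-year probit models described above; the `kept` flag (not a whole impute, linked, respondent, and matched), the cell definition, the field names, and the example values are all hypothetical.

```python
from collections import defaultdict


def ipw_weights(records, cell_key):
    """Divide each retained record's weight by its cell's retention probability."""
    total = defaultdict(float)
    kept = defaultdict(float)
    for rec in records:
        cell = cell_key(rec)
        total[cell] += rec["weight"]
        if rec["kept"]:
            kept[cell] += rec["weight"]
    reweighted = []
    for rec in records:
        if rec["kept"]:
            cell = cell_key(rec)
            prob = kept[cell] / total[cell]  # stand-in for the probit fitted value
            reweighted.append({**rec, "ipw_weight": rec["weight"] / prob})
    return reweighted


def weighted_variance(values, weights):
    """Weighted variance around the weighted mean."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total


# Hypothetical year-pair file; 'arc' is the age-adjusted arc percent change.
people = [
    {"cell": "young", "weight": 1500.0, "kept": True, "arc": 0.40},
    {"cell": "young", "weight": 1500.0, "kept": False, "arc": None},
    {"cell": "older", "weight": 2000.0, "kept": True, "arc": -0.10},
    {"cell": "older", "weight": 2000.0, "kept": True, "arc": 0.05},
]
sample = ipw_weights(people, cell_key=lambda r: r["cell"])
ipw_volatility = weighted_variance([r["arc"] for r in sample],
                                   [r["ipw_weight"] for r in sample])
```

Retained records from groups with low retention are weighted up, so the reweighted sample restores the original weighted population total. A probit on individual characteristics does the same thing with a smoother estimate of the retention probability.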

Comparison to Common Measures and Samples in the Literature
The supplementary materials contain a number of robustness checks to the baseline estimates from the linked respondent sample depicted in Figure 1. This includes the frequently used measure of volatility in the literature of the variance of log earnings growth, comparisons to the PSID sample by restricting the analysis to household heads, nonimmigrants, not self-employed, and private sector workers, and alternative approaches to trimming the data to mitigate the influence of outliers. The key takeaway from these alternative specifications is that the volatility levels and trends in the ASEC and DER align.

Conclusion
This article presented new estimates of earnings volatility of men and women using unique restricted-access survey and administrative tax data for the tax years spanning 1995-2015.
The linked survey-administrative sample eliminated potential differences due to overall sampling frame issues. As we varied the samples based on survey responses, we consistently found no significant trend in male earnings volatility over the last two decades, and a negative trend among women. The exception among men was when we include periods of zero earnings, where we find an upward trend in earnings volatility. However, even with zeros included, the levels and trends of volatility were qualitatively, and usually quantitatively, the same in both survey and administrative data. The one departure from this latter result was when we included Census-imputed earnings in our survey samples, which resulted in an upward trend in volatility among men and a stable trend among women. Thus, differences between survey and administrative data are dominated by earnings item response issues. Our recommendation for users of the public versions of the ASEC for volatility research is to drop both those observations whose entire supplement is imputed, as well as those whose earnings are imputed. The remaining sample will yield estimates that align with administrative tax records.

Supplementary Materials
The supplementary materials included in the zip file ZHB_programs_supplement.zip include a PDF file as a supplementary appendix to the published paper entitled ZHB_JBES_unblinded_Supplement_Final.pdf. This supplement contains a description of the data, along with a number of robustness checks. In addition the zip file contains a series of Stata DO files for estimation of our results. The files FiguresX_clean.do (for X = 1-5) produce each of the respective Figure 1