
Machine learning derived daily PM2.5 concentration estimates by county, ZIP code, and census tract in 11 western states 2008-2018

dataset
posted on 2021-02-04, 17:11 authored by Colleen Reid, Melissa Maestas, Ellen Considine, Gina Li

We created daily concentration estimates for fine particulate matter (PM2.5) at the centroids of each county, ZIP code, and census tract across the western US from 2008 to 2018. These estimates are predictions from ensemble machine learning models trained on 24-hour PM2.5 measurements from monitoring stations across 11 states in the western US. Predictor variables were derived from satellite, land cover, chemical transport model (2008-2016 model only), and meteorological data. Ten-fold spatial and random cross-validation (CV) R² were 0.66 and 0.73, respectively, for the 2008-2016 model and 0.58 and 0.72, respectively, for the 2008-2018 model. Comparing areal predictions to nearby monitored observations yielded an overall R² of 0.68 for the 2008-2016 model and 0.58 for the 2008-2018 model, but we observed higher R² (> 0.80) in many urban areas. These data can be used to understand spatiotemporal patterns of, exposures to, and health impacts of PM2.5 in the western US, where PM2.5 levels have been heavily impacted by wildfire smoke over this time period.
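The description does not say how the spatial and random folds were built, so the following is only an illustrative sketch of the general distinction. It assumes that "spatial" CV holds out whole monitoring stations while "random" CV holds out individual observations, and it uses one common definition of R² (1 minus the ratio of residual to total sum of squares). The function names and the `monitor_ids` input are hypothetical and are not taken from the dataset or the authors' code.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (one common definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def random_folds(n_rows, k=10, seed=0):
    """Assign each observation to one of k roughly equal folds at random."""
    rng = random.Random(seed)
    folds = [i % k for i in range(n_rows)]
    rng.shuffle(folds)
    return folds


def spatial_folds(monitor_ids, k=10, seed=0):
    """Assign whole monitors to folds (assumption: grouping by monitor ID),
    so a held-out monitor never contributes rows to the training set."""
    rng = random.Random(seed)
    unique_ids = sorted(set(monitor_ids))
    rng.shuffle(unique_ids)
    fold_of_monitor = {m: i % k for i, m in enumerate(unique_ids)}
    return [fold_of_monitor[m] for m in monitor_ids]
```

Under these assumptions, a spatial CV R² that is lower than the random CV R² (as reported above) is the expected pattern, because predicting at a location with no training observations is harder than predicting a held-out day at a monitor the model has already seen.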


Funding

This work was supported by Earth Lab through the University of Colorado Boulder’s Grand Challenge Initiative.
