RETRACTED DATASET: Paired watershed study data and related statistical model predictions to investigate the impact of forest removal and planting on water yield

2019-06-17T14:52:13Z (GMT) by Jaivime Evaristo, Jeffrey J. McDonnell
The data files described in this data record were retracted on 3 March 2020, in association with the retraction of the related article. The article retraction note can be found here: For full transparency, we leave the retracted data in place. The data file errors fall into three categories: (1) incorrect calculation of water yield from the values reported in the source literature; (2) disparate study designs that limited the categorical binning of forest treatment type or ground cover change; (3) epistemic uncertainty in the source papers. As an example of the last category, after our paper was published we were informed that the underlying data in one of the source papers showed trends completely opposite to those reported in that paper. We caution against use of these data for any further analysis. To that end, we are working on a new data compilation together with all parties associated with the Matters Arising and Retraction, and will alert the community here when it becomes available.

The description below remains unchanged.

This dataset contains two .xlsx spreadsheets and two .txt files relating to the prediction of streamflow response to forest cover management.

The two .xlsx spreadsheets comprise a Paired Watershed Studies (PWS) database for 502 catchments, tabulated as 251 treatment-control catchment pairs, as follows:
- pws data planting.xlsx: data compiled from 90 paired watershed studies in which the intervention schemes involved planting (conversion, regrowth, afforestation/forestation). References to the original studies are provided, along with pertinent data such as site location, catchment area and water yield response.
- pws data removal.xlsx: data compiled from 161 paired watershed studies in which the intervention schemes involved removal (deforestation). The spreadsheet layout is identical to pws data planting.xlsx.
The two .txt files contain outputs of statistical models aimed at predicting water yield response. These are also spreadsheets, but are stored as .txt due to their large size. Contained data are as follows:
- pws model complete.txt: model predictions for >400 K catchments worldwide where data for all predictor variables are available. Predictor variables were: potential storage, PET (potential evapotranspiration), AET (actual evapotranspiration), rootzone storage, runoff coefficient, permeability, catchment area.
- pws model complete_incomplete.txt: model predictions for >2 million catchments worldwide. This includes both catchments where data for all predictor variables are available ('complete') and catchments where they are not ('incomplete').
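
Because the two .txt outputs are described as large spreadsheets stored as text, it may help to look at the first few rows before opening a whole file. The following is a minimal sketch only: it assumes the files are tab-delimited with a single header row, which this record does not state, and `preview_rows` is a hypothetical helper name rather than anything supplied with the dataset.

```python
import csv
from itertools import islice


def preview_rows(path, delimiter="\t", limit=5):
    """Return (header, first `limit` data rows) without reading the whole file.

    Assumption: the file is delimited text with one header row.
    The tab default is a guess; pass another delimiter if needed.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, limit))  # stops after `limit` rows
    return header, rows
```

A call such as `preview_rows("pws model complete.txt")` would return the column names and five rows as lists of strings; if the header comes back as a single long string, the delimiter assumption is probably wrong.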

The related study was a global synthesis of PWS, which are watershed studies in which one watershed serves as a reference while the adjacent watershed(s) are treated by various forest management approaches, such as forest harvesting, conversion and afforestation. The authors aimed to assess the factors controlling streamflow response to forest planting and removal. They introduced a vegetation-to-bedrock model to explain the impacts of forest removal and planting on water yield.

Acronyms: PWS=paired watershed studies; AET=actual evapotranspiration; PET=potential evapotranspiration; P=precipitation; SDG=sustainable development goal; BRIC+US=Brazil, Russia, India, China and the United States; IQR=interquartile range; SFRA=streamflow reduction activities; RASE=root average squared error; AAE=average absolute error; RC=runoff coefficient
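
The two error measures in the acronym list, RASE and AAE, can be illustrated with a short sketch. It assumes the conventional definitions (RASE as the square root of the mean squared difference between predicted and observed values, AAE as the mean absolute difference); the record does not spell out the formulas, and the function names below are hypothetical.

```python
import math


def _check(predicted, observed):
    # Both sequences must be non-empty and of equal length.
    if not predicted or len(predicted) != len(observed):
        raise ValueError("predicted and observed must be non-empty and equal in length")


def rase(predicted, observed):
    """Root average squared error, assuming the conventional definition."""
    _check(predicted, observed)
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def aae(predicted, observed):
    """Average absolute error, assuming the conventional definition."""
    _check(predicted, observed)
    absolute = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(absolute) / len(absolute)
```

For two predictions that miss by 1 and 7 units, `rase` gives 5 and `aae` gives 4, which shows that RASE weights large errors more heavily than AAE does.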