Insufficient pollinator visitation often limits yield in crop systems worldwide: Datafiles, R code, and custom Bayesian scripts

Version 2 2024-06-24, 18:35
Version 1 2023-12-20, 17:15
software
posted on 2024-06-24, 18:35 authored by Katherine Turo, James Reilly, Thijs Fijen, Ainhoa Magrach, Rachael Winfree

This item provides the data files, R code, and custom Bayesian scripts for the manuscript "Insufficient pollinator visitation often limits yield in crop systems worldwide".

All raw data on insect visitation and crop yield were compiled as part of the publicly available CropPol database. The data creators/owners are the listed authors of the published dataset (Allen-Perkins et al. 2022), available at https://bio.tools/croppol. For this study, we filtered the CropPol dataset to exclude studies that did not report our variables of interest. We excluded study systems that: 1) contained fewer than 5 fields, 2) did not record yield or insect visitation, or 3) did not record visits from a pollinator group likely to be an important contributor to pollination (e.g., the survey excluded honeybees).
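The three exclusion criteria above can be sketched in code. This is a minimal illustration in Python, not the authors' R implementation. The record keys (`study_id`, `field_id`, `crop_yield`, `visitation`, `key_pollinators_surveyed`) are hypothetical and do not correspond to CropPol column names. Dropping incomplete fields before counting fields is also an assumption about the order of the steps.

```python
from collections import defaultdict

MIN_FIELDS = 5  # criterion 1: a study system needs at least 5 fields


def filter_study_systems(records, min_fields=MIN_FIELDS):
    """Keep records from study systems that pass the three criteria.

    Each record is a dict with hypothetical keys: study_id, field_id,
    crop_yield and visitation (numbers or None), and
    key_pollinators_surveyed (a manually assigned bool per study).
    """
    by_study = defaultdict(list)
    for rec in records:
        by_study[rec["study_id"]].append(rec)

    kept = []
    for recs in by_study.values():
        # Criterion 2: both yield and visitation must be recorded.
        complete = [
            r for r in recs
            if r["crop_yield"] is not None and r["visitation"] is not None
        ]
        # Criterion 1: count distinct fields that have usable data.
        if len({r["field_id"] for r in complete}) < min_fields:
            continue
        # Criterion 3: the survey must cover the likely key pollinators.
        if not all(r["key_pollinators_surveyed"] for r in complete):
            continue
        kept.extend(complete)
    return kept


def make_study(study_id, n_fields, surveyed=True):
    """Build toy records for one study system (illustrative values only)."""
    return [
        {
            "study_id": study_id,
            "field_id": i,
            "crop_yield": 1.0 + i,
            "visitation": 2.0 * i,
            "key_pollinators_surveyed": surveyed,
        }
        for i in range(n_fields)
    ]


demo_records = (
    make_study("A", 6)                     # passes all criteria
    + make_study("B", 3)                   # too few fields
    + make_study("C", 6, surveyed=False)   # key pollinators not surveyed
)
demo_kept = filter_study_systems(demo_records)
print(sorted({r["study_id"] for r in demo_kept}))
```

In this toy example only study system "A" survives: "B" has fewer than 5 fields and "C" is flagged as having missed an important pollinator group.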

Survey methods varied among the individual studies included in the CropPol database. Yield was often measured in kg per unit area but was sometimes crop-specific (e.g., kg per plant, fruit per branch, or fruit/seed set). Insect visitation was generally measured by observing or collecting pollinators on an inflorescence per unit time, but sampling intensity varied across study systems. In our analyses, yield and insect visitation data were z-scored to facilitate comparisons across studies.
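To show why z-scoring makes studies with different units comparable, here is a minimal Python sketch. It is not the authors' R code. It assumes that standardization is applied separately within each study system, and it uses the sample standard deviation (the convention of R's `sd()`). The record keys and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_within_study(records, column):
    """Return copies of records with an added '<column>_z' value.

    Values are centred and scaled within each study_id, so the result
    is unitless. Assumes every study has at least two records, because
    statistics.stdev raises an error for a single value.
    """
    values = defaultdict(list)
    for rec in records:
        values[rec["study_id"]].append(rec[column])
    stats = {sid: (mean(v), stdev(v)) for sid, v in values.items()}

    out = []
    for rec in records:
        mu, sd = stats[rec["study_id"]]
        scored = dict(rec)
        # Guard against zero variance (all fields identical).
        scored[column + "_z"] = (rec[column] - mu) / sd if sd > 0 else 0.0
        out.append(scored)
    return out


# Two toy studies reporting yield in different units.
demo_yield = [
    {"study_id": "kg_per_ha", "crop_yield": 1000.0},
    {"study_id": "kg_per_ha", "crop_yield": 2000.0},
    {"study_id": "kg_per_ha", "crop_yield": 3000.0},
    {"study_id": "fruit_set", "crop_yield": 0.1},
    {"study_id": "fruit_set", "crop_yield": 0.2},
    {"study_id": "fruit_set", "crop_yield": 0.3},
]
demo_scored = zscore_within_study(demo_yield, "crop_yield")
for row in demo_scored:
    print(row["study_id"], round(row["crop_yield_z"], 3))
```

After standardization, the lowest, middle, and highest fields in each toy study sit at the same relative positions, even though one study reports kg per hectare and the other reports fruit set.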

Analyses were performed in R (version 4.0.4). Dataset_S1.csv contains all collated crop study systems that report insect visitation at the field level. Dataset_S2.csv contains all crop study systems that report insect visitation and yield for several transects within a crop field. These datasets correspond to the Bayesian analysis conducted for research question 1. Bayesian models were constructed and fit using the RStan package (version 2.32.3) and a custom script; see the pollen_lim_bayesian_code.R script and associated Stan files.

The lim_out.csv dataset reports the results of our Bayesian analysis and is used by the code file pollen_lim_main_code.R. This code file and dataset further interpret the results of our Bayesian analyses and perform the subsequent analyses associated with questions 2 and 3 in the main manuscript. Logistic regression models were fit with the lme4 package (version 1.1-34), and their AICc scores were compared with the MuMIn package (version 1.47.5).
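For readers unfamiliar with AICc, the small-sample correction can be written out directly. The Python sketch below uses the standard formula AICc = AIC + 2k(k+1)/(n - k - 1), with AIC = 2k - 2 log L. It is an illustration only: the function names, model names, and numbers are hypothetical and are not taken from pollen_lim_main_code.R or from the study's results.

```python
def aicc(log_lik, k, n):
    """Small-sample corrected AIC.

    log_lik: maximized log-likelihood of the fitted model
    k: number of estimated parameters
    n: number of observations
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_lik
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def delta_aicc(models, n):
    """Map each model name to its AICc difference from the best model.

    models: dict of name -> (log_lik, k); all models are assumed to be
    fitted to the same n observations.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical candidate models (made-up log-likelihoods).
demo_models = {"null": (-60.0, 2), "landscape": (-50.0, 3)}
demo_deltas = delta_aicc(demo_models, n=30)
for name, delta in sorted(demo_deltas.items(), key=lambda item: item[1]):
    print(name, round(delta, 2))
```

The model with a delta of 0 has the lowest AICc; larger deltas indicate weaker support relative to that model.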

We also collected pre-existing raster data from the European Space Agency’s Climate Change Initiative (ESA-CCI) Land Cover maps and extracted landscape composition data from the GeoTIFF files using the raster (version 3.6-26) and rgdal (version 1.6-3) packages. Likewise, we collected pre-existing data on each crop field’s distance to semi-natural habitat within a 4 km radius, derived from the Global Copernicus Database (100 m resolution) and Giménez-García et al. 2023 (https://doi.org/10.5194/we-23-99-2023). These land cover data are available in the landcover.csv file and are used in the pollen_lim_main_code.R file when addressing research question 2.

All other details of data analyses are provided in the 'Methods' section of the main manuscript.

Funding

USDA-NIFA 2021-67012-35153

BiodivScen ERA-Net COFUND program

History