The dataset accompanying this study provides the files needed to replicate and build upon the image classification and ecological analyses described in the manuscript. It consists of the following files:
metadata.csv: This file describes every column in the accompanying datasets, including its meaning, units, and data format. It serves as the guide to the structure and content of the other files.
parnassius.csv: This dataset includes annual population estimates for the alpine butterfly Parnassius spanning the years 2013 to 2023. It also contains corresponding snow cover averages at both the meadow and quadrat levels, derived from satellite imagery. These data were used to assess how snow cover influences butterfly population dynamics.
sedum.csv: This dataset provides annual population estimates for Sedum, the host plant of Parnassius, over the same 2013–2023 period. Like the butterfly dataset, it includes satellite-derived snow cover averages at the meadow and quadrat levels, supporting analyses of how snow dynamics affect host-plant densities.
randomforest_snowClassifier.Rdata: This file contains the trained Random Forest model used for snow classification. The model was built using a supervised machine-learning approach, with snow/no-snow labels derived from Landsat 8, Landsat 9, and Sentinel-2 imagery processed via Google Earth Engine. The classifier achieved 98% accuracy and was validated against ground-truthed snow measurements.
All datasets were generated to support the study’s workflow, which integrates image classifiers with ecological research to test hypotheses on species-environment interactions. Satellite images were preprocessed to extract snow cover metrics, and population estimates were derived using established ecological survey methods. By combining these datasets, we demonstrate how image classification outputs can be linked to ecological variables to investigate the impacts of snow cover on species dynamics.
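As a rough illustration of how the two population files might be combined by year, the following minimal Python sketch joins butterfly estimates, host-plant estimates, and meadow-level snow cover into one table. The column names (year, population_estimate, snow_cover_meadow) are assumptions made for this example and may not match the actual files; metadata.csv gives the real names. The inline rows are placeholders for illustration only, not values from the dataset.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT values from the dataset.
# Column names are hypothetical; check metadata.csv for the real ones.
PARNASSIUS_CSV = """year,population_estimate,snow_cover_meadow
2013,100,0.50
2014,120,0.40
2015,90,0.60
"""

SEDUM_CSV = """year,population_estimate,snow_cover_meadow
2013,1000,0.50
2014,1100,0.40
2015,950,0.60
"""


def read_by_year(handle, value_column):
    """Return {year: float(value)} for one column of a CSV file-like object."""
    reader = csv.DictReader(handle)
    return {int(row["year"]): float(row[value_column]) for row in reader}


def join_on_year(*tables):
    """Inner-join several {year: value} dicts, keeping years present in all."""
    shared_years = sorted(set.intersection(*(set(t) for t in tables)))
    return [(year, *(t[year] for t in tables)) for year in shared_years]


butterflies = read_by_year(io.StringIO(PARNASSIUS_CSV), "population_estimate")
host_plants = read_by_year(io.StringIO(SEDUM_CSV), "population_estimate")
snow = read_by_year(io.StringIO(PARNASSIUS_CSV), "snow_cover_meadow")

# One row per year: (year, butterfly estimate, host-plant estimate, snow cover)
joined = join_on_year(butterflies, host_plants, snow)
for year, n_butterflies, n_plants, snow_cover in joined:
    print(year, n_butterflies, n_plants, snow_cover)
```

To run this against the real files, replace each io.StringIO(...) argument with an opened file, for example open("parnassius.csv", newline=""), and adjust the column names to those listed in metadata.csv. The join keeps only years present in every table, so a missing survey year in one file drops that year from the combined output rather than producing a partial row.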