Photographs, datasets and code supporting ‘An accurate and efficient semiautomated approach to counting birds: estimating Northern Gannet colony size in Canada’

Posted on 2025-01-17, 19:40. Authored by Jacob Walker, Trevor S. Avery, Francis St-Pierre, Jean-François Rail, Danielle E. A. Quinn, Matthew English, Stephanie Avery-Gomm

ABSTRACT

Improving the efficiency of population monitoring and conservation programs is beneficial, so long as the accuracy of the information collected is not diminished. The need to expeditiously estimate the population size of seabird colonies is especially acute during mass mortality events when aerial surveys can provide information quickly on the extent of effects and total mortality. In 2022, the Highly Pathogenic Avian Influenza virus caused outbreaks at most Northern Gannet Morus bassanus colonies worldwide, killing tens of thousands of gannets in eastern Canada. In this study, we evaluated the accuracy and efficiency of a semiautomated method using the free software CountEm for counting Northern Gannet nests by reanalysing thirteen years of aerial photographs from past population surveys (2009–2020 and 2022). We developed a protocol that generated population estimates that are accurate enough to support population management objectives (i.e., within 2–5% of manual counts) and outline additional ways to improve CountEm accuracy. Additionally, using CountEm was 1100% more efficient than manually counting based on counting time. Since CountEm relies on human identification of objects to be counted, our methods, results, and conclusions are transferable to any taxa that form large aggregations and can be identified and counted in photographs.

About this repository

This repository contains photographs, datasets and code supporting ‘An accurate and efficient semiautomated approach to counting birds: estimating Northern Gannet colony size in Canada’, which is published in Ecosphere. The repository can be cited as follows:

Walker, Jacob, Trevor S. Avery, Francis St-Pierre, Jean-François Rail, Danielle E. A. Quinn, Matthew English, and Stephanie Avery-Gomm. 2024. “Photographs, datasets and code supporting ‘An accurate and efficient semiautomated approach to counting birds: estimating Northern Gannet colony size in Canada’.” Figshare. https://doi.org/10.6084/m9.figshare.25483174

Photographs

This repository contains data associated with 52 composite photographs of Northern Gannet colonies at Ile Bonaventure and Rochers aux Oiseaux taken between 2009 and 2022. See the manuscript above for details.

Datasets

The raw data are provided in alldata.csv. Results of repeated CountEm runs (n = 11 photographs) are found in multipleruns.csv. The provided variable key describes variables in both data files (VariableKey.xlsx). To reproduce the analyses performed in this study, use the code provided in reproducible_analysis.R.

Code

The code required to reproduce the double count analysis is in reproducible_analysis.R, using the input files data/rawdata.csv and data/multipleruns.csv. The code, including detailed comments, is organized into 8 sections:

  1. Load Packages: loads the required packages (see Software requirements, below)
  2. Import, Restructure, and Subset Data: five subsets of the available data are created to facilitate the analyses in sections 3-8
     • df: requires data/rawdata.csv; results from the first CountEm run for each photo using 300 quadrats, only considering AOTs (52 rows, 22 columns)
     • df_500: requires data/rawdata.csv; CountEm results from a subset of 12 photos using 300 and 500 quadrats, only considering AOTs (12 rows, 10 columns)
     • df_dead: requires data/rawdata.csv; results from the first CountEm run for each photo using 300 quadrats, only considering dead birds (4 rows, 22 columns)
     • mdf: requires data/multipleruns.csv; results from ten CountEm runs for a subset of 11 photos, only considering AOTs (110 rows, 22 columns)
     • mdf_dead: requires data/multipleruns.csv; results from ten CountEm runs for a subset of 11 photos, only considering dead birds (30 rows, 22 columns)
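
The kind of subsetting described in section 2 can be sketched as follows. This is only an illustration on a toy data frame, not the code in reproducible_analysis.R: the column names (photo_id, run, quadrats, category, estimate) and every value are hypothetical stand-ins, and the real variable names are the ones documented in VariableKey.xlsx.

```r
# Toy stand-in for data/rawdata.csv. All column names and values are
# hypothetical and are NOT taken from the repository's files.
raw <- data.frame(
  photo_id = c("A", "A", "A", "B", "B"),
  run      = c(1, 1, 1, 1, 1),
  quadrats = c(300, 500, 300, 300, 300),
  category = c("AOT", "AOT", "dead", "AOT", "dead"),
  estimate = c(1020, 1005, 12, 2310, 0)
)

# First run, 300 quadrats, AOTs only (analogous in spirit to `df`)
df_sketch <- subset(raw, run == 1 & quadrats == 300 & category == "AOT")

# Same filter for dead birds (analogous in spirit to `df_dead`)
df_dead_sketch <- subset(raw, run == 1 & quadrats == 300 & category == "dead")

# Comparing dim() with the documented sizes is a quick sanity check
# after running the real import step.
dim(df_sketch)
```

With the real data, the analogous check would compare the dimensions of each object against the row and column counts listed above (for example, 52 rows and 22 columns for df).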

Sections 3-8 are used to generate the results found in the corresponding Results subheaders of the text:

  3. Results: CountEm Accuracy: uses the data object df to assess the use of CountEm to estimate the number of AOTs; produces Figures 3 and 4
  4. Results: Increasing CountEm Quadrats: uses the data object df_500 to assess the impact of increasing the number of CountEm quadrats from 300 to 500
  5. Results: Estimating the Number of Dead Birds: uses the data object df_dead to assess the use of CountEm to estimate the number of dead birds
  6. Results: Accuracy of Multiple CountEm Runs: uses the data object mdf to run a resampling routine and assess the use of multiple CountEm runs to estimate the number of AOTs
  7. Results: CV to Inform Number of CountEm Runs: uses the results of the simulation in section 6 to determine if the coefficient of variation (CV) can be used to determine the number of CountEm runs that could be summarised to generate estimates within 5% of the manual count of AOTs
  8. Results: Efficiency: uses the data objects df and mdf to summarise the user time required to apply CountEm to generate estimates of the number of AOTs
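
To make the ideas behind the multiple-run and CV sections more concrete, here is a minimal sketch in base R. It is not the authors' resampling routine: the manual count, the ten run estimates, the number of resamples (1000) and the choice to sample without replacement are all assumptions made for illustration.

```r
set.seed(1)  # make the resampling reproducible

# Hypothetical values for ONE photo; not taken from the repository's data.
manual_count  <- 1000
run_estimates <- c(980, 1015, 1002, 960, 1030, 995, 1010, 975, 1020, 990)

# Coefficient of variation (%) across runs: SD relative to the mean.
cv_percent <- 100 * sd(run_estimates) / mean(run_estimates)

# For k = 1..10 runs: repeatedly draw k runs without replacement, average
# them, and record the share of draws whose absolute percent error against
# the manual count is at most 5%.
prop_within_5 <- sapply(seq_along(run_estimates), function(k) {
  errs <- replicate(1000, {
    picked <- sample(run_estimates, size = k)
    100 * abs(mean(picked) - manual_count) / manual_count
  })
  mean(errs <= 5)
})

# One row per k, so the effect of averaging more runs is easy to read off.
data.frame(k = seq_along(run_estimates), prop_within_5 = prop_within_5)
```

In a sketch like this, a low CV across runs suggests that few runs need to be averaged to stay within the 5% target, which is the relationship section 7 examines with the real data.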

Software requirements

Scripts are written for R v4.3.1. See scripts and manuscript for packages and software citations.

Required R packages can be installed in R with:

install.packages(c("tidyverse", 
"readxl",
"BSDA",
"boot",
"ggdist",
"ggeffects",
"emmeans",
"marginaleffects"))

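Before installing, it can be useful to see which of these packages are already available. The snippet below is a small convenience sketch and not part of the repository's scripts; it only uses base R functions (getRversion, requireNamespace), and the variable names are illustrative.

```r
# Packages listed in the install.packages() call above.
pkgs <- c("tidyverse", "readxl", "BSDA", "boot",
          "ggdist", "ggeffects", "emmeans", "marginaleffects")

# TRUE if the running R is at least the version the scripts were written for.
r_ok <- getRversion() >= "4.3.1"

# requireNamespace() returns TRUE/FALSE without attaching the package.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

# Uncomment to install only what is missing:
# install.packages(missing_pkgs)
missing_pkgs
```
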