
Acoustic Censusing and Individual Identification of Birds in the Wild - Supplementary Information (Data)

posted on 2021-10-27, 03:14 authored by Carol L. Bedoya, Laura E. Molles

This dataset contains 849 individualized calls (1-min length, .wav) from 30 roroa individuals (Great spotted kiwi, Apteryx maxima). It is the Supplementary Information for "CL Bedoya and LE Molles. (2021). Acoustic censusing and individual identification of birds in the wild."


This .rar file contains everything needed to replicate the results of our manuscript directly from the acoustic data. All recordings are 1-min .wav files from 30 roroa (Great spotted kiwi) individuals.

Put this folder in “C:\” if you do not want to change the folder locations in the algorithms.

1) The folder “Censusing” contains the data used for testing the acoustic censusing capabilities of our framework, i.e., 10 individuals (6 males and 4 females).

2) The folder “All” contains all data together (known individuals + censusing, in the subfolder “Raw”). It also contains the training data (augmented) needed to replicate Figure 2 of the manuscript.

3) The folder “Unseen Individual Discovery” contains the validation data of the known individuals (10 males and 10 females) and novel (unseen) information from six individuals (3 males and 3 females).

4) The folder “Known Individuals” contains five subfolders:

a) The folder “Raw” contains all the data from the known individuals (10 males and 10 females).

b) The folder “Raw_Separated_in_training_and_validation” contains all the calls divided into training (70%) and validation (30%) datasets. The split is done with an algorithm provided in SI Code (GitHub link in references).

c) The folder “Augmented_training” contains the training data from the known individuals (70% of all known) and thousands of noisy segments from different sites. The algorithm to perform the data augmentation is provided in SI Code (GitHub link in references). The folder also contains the data already augmented in the subfolder “Outputs”.

d) The folder “Data_used_for_training_CDCN” contains the augmented training data (70%) and the validation data (30%).

5) The folders “Generated Images” in “Known Individuals” and “Data” are purposely empty so that they can be filled once the algorithms start running.
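
For readers who want to see what a 70%/30% split like the one in “Raw_Separated_in_training_and_validation” might look like, here is a minimal Python sketch. It is not the algorithm from SI Code. The function name split_calls, the seed, and the assumed layout (one subfolder per individual, each holding that bird's .wav files) are illustrative assumptions and may not match the actual archive.

```python
import random
from pathlib import Path


def split_calls(raw_dir, train_fraction=0.7, seed=0):
    """Return {individual: (training_files, validation_files)}.

    Assumption (hypothetical layout): raw_dir holds one subfolder per
    individual, and each subfolder contains that bird's .wav files.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    splits = {}
    for individual_dir in sorted(Path(raw_dir).iterdir()):
        if not individual_dir.is_dir():
            continue
        calls = sorted(individual_dir.glob("*.wav"))
        rng.shuffle(calls)
        # Split within each individual so every bird appears in both sets.
        n_train = round(len(calls) * train_fraction)
        splits[individual_dir.name] = (calls[:n_train], calls[n_train:])
    return splits
```

Splitting per individual rather than over the pooled files keeps every bird represented in both the training and the validation set, which is one reasonable design choice for an individual-identification task.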

Please note that there is some redundancy in the data. We did this deliberately so that the user can follow the data flow. The folder “Raw” in “All” contains all the data. You can replicate all the processes described above from scratch using these data and the algorithms provided in the supplementary information of our manuscript (GitHub link in references). Combine it, play with it, and draw your own conclusions.
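
Before replicating anything, it can help to check that the extracted archive is complete. The sketch below counts .wav files per immediate subfolder of a directory. The function name count_calls and the assumption that recordings are grouped into subfolders are hypothetical, so adapt it to the layout you actually find after extraction.

```python
from pathlib import Path


def count_calls(raw_dir):
    """Count .wav files under each immediate subfolder of raw_dir.

    Assumption (hypothetical layout): recordings are grouped into
    subfolders, for example one per individual.
    """
    counts = {}
    for sub in sorted(Path(raw_dir).iterdir()):
        if sub.is_dir():
            # rglob also picks up files in nested subfolders.
            counts[sub.name] = sum(1 for _ in sub.rglob("*.wav"))
    return counts
```

Pass it the path to “Raw” in “All” wherever you extracted the archive. If the layout matches the assumption, sum(counts.values()) should presumably agree with the 849 calls stated in the dataset description.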


Please cite this data as: CL Bedoya and LE Molles. (2021) "Acoustic censusing and individual identification of birds in the wild".

This work is based on/includes the Department of Conservation’s information which is licensed by the Department of Conservation for re-use under a Creative Commons Attribution CC BY 4.0 International License.


This research was supported by Verum Group.
