The data includes songs of Bengalese finches, manual annotations of the song elements, and expected validation errors.
Songs were collected from eleven birds (Bird0 through Bird10). Data is located in the directory “Wave E The sound format is 16-bit linear PCM with a sampling rate of 32 kHz.
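Before processing, it can be useful to confirm that a file matches the stated format. The following is a minimal sketch using Python's standard wave module. The function name check_wave_format is hypothetical, and the number of channels is reported but not checked because it is not stated here.

```python
import wave

EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit linear PCM
EXPECTED_SAMPLE_RATE_HZ = 32000   # 32 kHz


def check_wave_format(path):
    """Return (ok, details) for the wave file at `path`.

    `ok` is True when the file is 16-bit with a 32 kHz sampling rate.
    `details` holds the values read from the file header.
    """
    with wave.open(path, "rb") as wf:
        details = {
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": wf.getframerate(),
            "channels": wf.getnchannels(),  # reported only, not checked
            "frames": wf.getnframes(),
        }
    ok = (
        details["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH_BYTES
        and details["sample_rate_hz"] == EXPECTED_SAMPLE_RATE_HZ
    )
    return ok, details
```

Calling check_wave_format on a path to one of the wave files returns a boolean and a dictionary of header values, which makes mismatches easy to log.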
Annotations are defined in XML format. The annotation of each sequence includes the name of the wave file, the position and length of the sequence in the wave file, and the annotations of the sound elements in the sequence. The annotation of each sound element includes its position and length in the sequence and the label of the element. The schema for the XML file is in birdsong-recognition/xsd/AnnotationSchema.xsd or at http://marler.c.u-tokyo.ac.jp/files/koumura-okanoya-2016-songs/xsd/AnnotationSchema.xsd
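The nested structure described above (sequences containing sound elements) could be read with Python's standard xml.etree.ElementTree, as in the sketch below. The element names (Sequence, WaveFileName, Position, Length, Note, Label), the sample values, and the treatment of positions and lengths as integers are all assumptions made for illustration. Consult AnnotationSchema.xsd for the actual names and types. If the real files declare an XML namespace, the tag names would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Element names and values are assumptions,
# not copied from AnnotationSchema.xsd.
SAMPLE = """<Sequences>
  <Sequence>
    <WaveFileName>0.wav</WaveFileName>
    <Position>32000</Position>
    <Length>64000</Length>
    <Note><Position>100</Position><Length>2000</Length><Label>a</Label></Note>
    <Note><Position>2500</Position><Length>1800</Length><Label>b</Label></Note>
  </Sequence>
</Sequences>"""


def parse_annotations(xml_text):
    """Parse annotation XML into a list of sequence dictionaries."""
    root = ET.fromstring(xml_text)
    sequences = []
    for seq in root.iter("Sequence"):
        # Each sound element carries a position, a length, and a label.
        notes = [
            {
                "position": int(n.findtext("Position")),
                "length": int(n.findtext("Length")),
                "label": n.findtext("Label"),
            }
            for n in seq.findall("Note")
        ]
        sequences.append(
            {
                "wave_file": seq.findtext("WaveFileName"),
                "position": int(seq.findtext("Position")),
                "length": int(seq.findtext("Length")),
                "notes": notes,
            }
        )
    return sequences


if __name__ == "__main__":
    for s in parse_annotations(SAMPLE):
        labels = "".join(n["label"] for n in s["notes"])
        print(s["wave_file"], s["position"], s["length"], labels)
```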
Validation errors are the expected values that would be obtained by running the program at https://github.com/takuya-koumura/birdsong-recognition with the hyperparameters provided in the source code. The program contains three algorithms for automatic recognition: "BD -> LC -> GS", "LC -> BD & GS", and "LC & GS -> BD & GS". Expected errors for each algorithm are provided in "ErrorBdLcGs.xml", "ErrorLcBdGs.xml", and "ErrorLcGsBdGs.xml", respectively. Each file contains two types of validation errors: "Levenshtein error" and "matching error". Please read the manuscript for the description of the algorithms and errors. The schema for the XML file is in birdsong-recognition/xsd/ErrorSchema.xsd or at http://marler.c.u-tokyo.ac.jp/files/koumura-okanoya-2016-songs/xsd/ErrorSchema.xsd
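For orientation, the sketch below computes the standard Levenshtein (edit) distance between two label sequences, which is the general notion the name "Levenshtein error" refers to. It is not the program's implementation. How the manuscript turns the distance into an error value (for example, any normalization) is not described here and is not reproduced. The function name levenshtein is chosen for illustration.

```python
def levenshtein(a, b):
    """Standard edit distance between sequences `a` and `b`.

    Insertions, deletions, and substitutions each cost 1.
    Works on strings or on lists of labels.
    """
    # prev[j] is the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,             # delete x
                    curr[j - 1] + 1,         # insert y
                    prev[j - 1] + (x != y),  # substitute, free if equal
                )
            )
        prev = curr
    return prev[-1]
```

For example, levenshtein("abcd", "abd") is 1, because deleting the single label "c" turns the first sequence into the second.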
Funding
Grant-in-Aid for Scientific Research (A) (#26240019), MEXT/JSPS, Japan, to KO. Grant-in-Aid for JSPS Fellows, MEXT/JSPS, Japan (#15J09948) to TK.