ABC2018 dataset

Version 2 2019-05-03, 01:18
Version 1 2019-04-03, 06:44
dataset
posted on 2019-05-03, 01:18 authored by Toru Tamaki, Ken Yoda
This is the dataset for the competition ABC2018. The competition website is:
https://competitions.codalab.org/competitions/16283

File

The zip file contains all the training and test trajectories, plus a ground truth label file for the training set.

When you unzip the dataset, you will find:

./test/***.csv
./train/***.csv
./train_labels.csv
where *** is the trajectory number (000, 001, ..., 630 for train, 000, 001, ..., 274 for test).
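
The layout above can be enumerated without touching the disk. The following is a minimal sketch, assuming the three-digit zero padding shown in the file names; the extraction directory `ABC2018` and the helper `trajectory_paths` are hypothetical names chosen for illustration.

```python
from pathlib import Path

# File counts taken from the description above: 000-630 (train), 000-274 (test).
SPLIT_SIZES = {"train": 631, "test": 275}


def trajectory_paths(root, split):
    """Return the expected CSV paths for one split, in trajectory-number order.

    `root` is a hypothetical directory where the zip file was extracted.
    """
    count = SPLIT_SIZES[split]
    # Trajectory numbers are zero-padded to three digits: 000, 001, ...
    return [Path(root) / split / f"{i:03d}.csv" for i in range(count)]


train_files = trajectory_paths("ABC2018", "train")
print(train_files[0].name, train_files[-1].name, len(train_files))
```

Building the list from the numbering, rather than from a directory listing, keeps the order aligned with the line order of the label file.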


Task
Classify GPS trajectories of birds as male or female.

Trajectory file format

A single CSV file (000.csv, 001.csv, ...) contains the trajectory of one trip, and each line holds the information for one GPS location of a shearwater. In addition to longitude and latitude, each line provides the solar azimuth and elevation angles, a daytime flag, the elapsed time, the local clock time, and a day counter. The fields appear in the following order:

- float: longitude
- float: latitude
- float: sun azimuth [degree] clockwise from the North
- float: sun elevation [degree] upward from the horizon
- int: (1) daytime (between sunrise and sunset), or (0) nighttime
- int: elapsed time [second] after starting the trip
- clock: local time (hh:mm:ss)
- int: days (starts from 0, and increments by 1 when the local time passes 23:59:59)

Float values are of the format %.5f, and fields are separated by a single comma.

Here is an example:

=================
139.29220,38.56632,76.42170,-4.45122,0,0,04:54:03,0
139.29300,38.56763,76.58196,-4.25726,0,60,04:55:03,0
139.29400,38.57053,76.73674,-4.06880,0,118,04:56:01,0
139.29620,38.57563,76.89729,-3.87201,0,178,04:57:01,0
...
=================
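
One way to read this format is sketched below, using the example rows above as input. The `GpsFix` class, its field names, and the `parse_trajectory` helper are hypothetical; the sketch assumes the files have no header row, as in the example.

```python
import csv
import io
from dataclasses import dataclass
from datetime import time


@dataclass
class GpsFix:
    """One line of a trajectory file (field names are illustrative)."""
    longitude: float
    latitude: float
    sun_azimuth: float    # degrees, clockwise from North
    sun_elevation: float  # degrees, upward from the horizon
    daytime: bool         # True between sunrise and sunset
    elapsed_s: int        # seconds since the start of the trip
    local_time: time      # local clock time (hh:mm:ss)
    day: int              # day counter, starting from 0


def parse_trajectory(text):
    """Parse the CSV text of one trajectory into a list of GpsFix records."""
    fixes = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        lon, lat, azimuth, elevation, daytime, elapsed, clock, day = row
        fixes.append(GpsFix(
            float(lon), float(lat), float(azimuth), float(elevation),
            daytime == "1", int(elapsed), time.fromisoformat(clock), int(day),
        ))
    return fixes


sample = """139.29220,38.56632,76.42170,-4.45122,0,0,04:54:03,0
139.29300,38.56763,76.58196,-4.25726,0,60,04:55:03,0
139.29400,38.57053,76.73674,-4.06880,0,118,04:56:01,0
139.29620,38.57563,76.89729,-3.87201,0,178,04:57:01,0
"""
fixes = parse_trajectory(sample)
print(len(fixes), fixes[2].elapsed_s, fixes[2].local_time)
```

To read a real file, the same function could be given the contents of, for example, `train/000.csv`.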

- Different trajectories have different numbers of GPS locations.
- The time interval between two successive GPS locations is approximately one minute (60 seconds) when the GPS works well; otherwise the interval may vary from one to several minutes, or even hours or days.
- Trajectories in the training and test sets are in the same format.
- Ground truth labels for the training set are given in a separate file.
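
Because the sampling interval is irregular, it can help to locate the gaps before computing speeds or resampling. The following is a minimal sketch that works on the elapsed-time column; the 90-second threshold and the `find_gaps` helper are assumptions for illustration, not part of the dataset specification.

```python
def find_gaps(elapsed, max_interval=90):
    """Return (index, interval) pairs where the step to fix `index`
    is longer than `max_interval` seconds."""
    gaps = []
    for i in range(1, len(elapsed)):
        interval = elapsed[i] - elapsed[i - 1]
        if interval > max_interval:
            gaps.append((i, interval))
    return gaps


# The first four values come from the example above; the last one is
# invented here to simulate a one-hour GPS outage.
elapsed = [0, 60, 118, 178, 3778]
print(find_gaps(elapsed))  # [(4, 3600)]
```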

Labels: gender, or male/female

A single text file of ground truth labels for the training set (train_labels.csv) is provided. Each line holds the label of the corresponding training trajectory, counting lines from 0; that is, line 0 (the first line) is the label of the training trajectory file 000.csv.

The label is a binary character:

- male: 0
- female: 1

Here is an example:

=================
1
1
1
0
1
0
0
...
=================
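
The line-number convention can be sketched as follows, pairing each label with its training file name. The `load_labels` helper is hypothetical, and the sketch assumes one label per line with no header row, as in the example above.

```python
def load_labels(text):
    """Map training file names to labels (0 = male, 1 = female).

    Line i of the label file, counted from 0, belongs to the file
    whose three-digit number is i.
    """
    labels = {}
    for index, line in enumerate(text.splitlines()):
        value = line.strip()
        if value not in ("0", "1"):
            raise ValueError(f"unexpected label {value!r} on line {index}")
        labels[f"{index:03d}.csv"] = int(value)
    return labels


# The seven labels shown in the example above.
sample_labels = "1\n1\n1\n0\n1\n0\n0\n"
labels = load_labels(sample_labels)
print(labels["000.csv"], labels["003.csv"])  # 1 0
```

Rejecting anything other than `0` or `1` keeps a stray blank line from silently shifting every later label onto the wrong trajectory.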


Stats: Numbers of the dataset

Training set
- 326 male trajectories
- 305 female trajectories
- 631 in total

Test set
- 275 trajectories
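
These counts give a quick sanity check after loading the labels, and they also fix the accuracy of a trivial classifier that always predicts the majority class. The following is a small sketch; the `class_balance` helper is hypothetical, and the label list is rebuilt from the counts above rather than read from the file.

```python
from collections import Counter


def class_balance(labels):
    """Return (male count, female count, majority-class accuracy)."""
    counts = Counter(labels)
    majority_accuracy = max(counts.values()) / len(labels)
    return counts[0], counts[1], majority_accuracy


# Reconstructed from the stated counts: 326 male (0) and 305 female (1).
labels = [0] * 326 + [1] * 305
males, females, baseline = class_balance(labels)
print(males, females, round(baseline, 3))  # 326 305 0.517
```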



Disclaimer
The procedures used in the field study for collecting the data were approved by the Animal Experimental Committee of Nagoya University.

License of the dataset
The dataset was collected by scientific teams for scientific purposes. If you use the dataset for any scientific purpose other than this competition, please cite the following paper:

Sakiko MATSUMOTO, Takashi YAMAMOTO, Maki YAMAMOTO, Carlos B ZAVALAGA and Ken YODA (2017) Sex-related differences in the foraging movement of Streaked Shearwaters Calonectris leucomelas breeding on Awashima Island in the Sea of Japan. Ornithological Science 16(1):23-32. doi: http://dx.doi.org/10.2326/osj.16.23

Please contact the corresponding researcher, Ken Yoda (http://yoda-ken.sakura.ne.jp/yoda_lab/English.html), if you would like to use the dataset for any other purpose, or if you need access to the original, unprocessed raw data.

Funding

JSPS KAKENHI Grant Number JP16H06535

JSPS KAKENHI Grant Number JP16K21735

JSPS KAKENHI Grant Number JP16H06541

JSPS KAKENHI Grant Number JP16H06540

JSPS KAKENHI Grant Number JP16H06539

JSPS KAKENHI Grant Number JP16H06538
