Supplementary Material for "Evaluating Integration Strategies for Visuo-Haptic Object Recognition"

This directory contains the supplementary material for the following journal paper:

Toprak, S., Navarro-Guerrero, N., & Wermter, S. (2018). Evaluating Integration Strategies for Visuo-Haptic Object Recognition. Cognitive Computation, 10(3), 408–425. https://doi.org/10.1007/s12559-017-9536-7.

Data Collection

The visual and haptic data was collected from 11 everyday objects with a NAO robot. The collected data was used to compare three different strategies for integrating the information from the visual and haptic modalities: two that are commonly used in existing work on visuo-haptic object recognition, and a third, brain-inspired strategy that we propose.

The data was collected under two conditions: 10 observations per object were collected under ideal lab conditions, and another 3 observations per object were collected under uncontrolled real-world conditions.

The data for each object observation comprises the following:

The visual data consists of two images from NAO's head camera: one (background.png) shows the background of the scene, while the other (foreground.png) shows the same scene with the object in it. We used these images to extract the shape, color and texture features of the object.

The haptic data was collected from two different sources. The kinesthetics.pickle file contains motor data retrieved from the robot's arms and stored in a Python dictionary object: the joint angles retrieved after having NAO fully grasp the object were used as shape information, whereas the electric currents measured after making NAO lift the object were additionally used as weight information. For the object texture and hardness, sound data was recorded with four contact microphones that NAO was equipped with and that served as tactile sensors. For each of these object properties and for each microphone, there are two one-second sound snippets: noise.wav contains the background noise recorded right before NAO rubbed or hit the object against the table surface, and noisy_signal.wav contains the sound caused in the process.

This raw data can be found in the raw_data.tar.gz archive.
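
For illustration, the following Python sketch shows one possible way to load the raw data for a single observation after unpacking raw_data.tar.gz. The directory layout (one subfolder per object and observation) and the microphone subfolder names are assumptions made for this example; only the file names background.png, foreground.png, kinesthetics.pickle, noise.wav and noisy_signal.wav come from the description above.

import pickle
from pathlib import Path

import cv2
from scipy.io import wavfile

obs_dir = Path("raw_data/object_01/observation_01")  # hypothetical layout

# Visual data: the scene without and with the object.
background = cv2.imread(str(obs_dir / "background.png"))
foreground = cv2.imread(str(obs_dir / "foreground.png"))

# Kinesthetic data: joint angles (shape) and electric currents (weight),
# stored as a Python dictionary. If the file was written with Python 2,
# loading it under Python 3 may require encoding="latin1".
with open(obs_dir / "kinesthetics.pickle", "rb") as f:
    kinesthetics = pickle.load(f)

# Sound data: per property (texture/hardness) and per microphone, a background
# noise snippet and the recorded signal (subfolder names are assumptions).
rate, noise = wavfile.read(obs_dir / "haptic_texture" / "mic_0" / "noise.wav")
rate, signal = wavfile.read(obs_dir / "haptic_texture" / "mic_0" / "noisy_signal.wav")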

Feature Extraction

The collected data was processed as follows to extract features: Hu moments, a flattened color histogram and a histogram of local binary patterns were used as feature descriptors for the visual object properties shape, color and texture, respectively. For both haptic texture and hardness, the one-sided magnitude spectrum of the cleaned sound signal was computed.

Each of these features was stored in a separate text file. The total dimensionality of the extracted features is 44 949, with the dimensionality of each one being as follows:
- visual_shape.txt : 7
- color.txt : 768
- visual_texture.txt : 26
- haptic_shape.txt : 12
- weight.txt : 36
- haptic_texture.txt : 22 050
- hardness.txt : 22 050
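
As an illustration of the descriptors listed above, the sketch below computes them with OpenCV, scikit-image and NumPy. The parameter choices (a simple background-subtraction mask, 256 bins per color channel, a 24-point uniform local binary pattern, which yield 7, 768 and 26 values, matching the dimensionalities above) are assumptions made for this example and are not necessarily the exact settings used in the paper; the cleaning of the sound signal is likewise not shown here.

import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def visual_features(foreground, background):
    # Shape: 7 Hu moments of an object mask obtained by background subtraction
    # (the thresholding here is a simplification of the actual segmentation).
    diff = cv2.cvtColor(cv2.absdiff(foreground, background), cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
    shape = cv2.HuMoments(cv2.moments(mask)).flatten()            # 7 values

    # Color: per-channel 256-bin histograms over the object region, flattened (768 values).
    color = np.concatenate([
        cv2.calcHist([foreground], [c], mask, [256], [0, 256]).flatten()
        for c in range(3)
    ])

    # Texture: histogram of uniform local binary patterns (24 points, radius 3 -> 26 bins).
    gray = cv2.cvtColor(foreground, cv2.COLOR_BGR2GRAY)
    lbp = local_binary_pattern(gray, P=24, R=3, method="uniform")
    texture, _ = np.histogram(lbp, bins=26, range=(0, 26))

    return shape, color, texture

def magnitude_spectrum(cleaned_signal):
    # One-sided magnitude spectrum used for haptic texture and hardness.
    return np.abs(np.fft.rfft(cleaned_signal))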

Train-Test Split

The dataset contains a total of 143 observations. For each object, the ideal observations were split randomly in a 70:30 ratio and then added to the training and test sets, respectively. The observations made under real-world conditions were split and added in the same way. As a result, 99 observations are part of the training set and the remaining 44 are in the test set.
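
The minimal sketch below reproduces this per-object 70:30 split with scikit-learn's train_test_split. The observation indices and the random seed are placeholders; the actual assignment shipped with the dataset is fixed, so this only illustrates how the 99/44 counts arise.

from sklearn.model_selection import train_test_split

def split_object(ideal_obs, real_world_obs, seed=0):
    # Split one object's ideal and real-world observations 70:30 each.
    ideal_train, ideal_test = train_test_split(ideal_obs, test_size=0.3, random_state=seed)
    real_train, real_test = train_test_split(real_world_obs, test_size=0.3, random_state=seed)
    return ideal_train + real_train, ideal_test + real_test

# With 10 ideal and 3 real-world observations per object this gives
# 7 + 2 = 9 training and 3 + 1 = 4 test observations,
# i.e. 99 training and 44 test observations over all 11 objects.
train_ids, test_ids = split_object(list(range(10)), list(range(10, 13)))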

The information for each object property (and for every sensor placement in the case of haptic texture and hardness) is contained in separate text (.txt) files in the training and test subfolders. The labels for the observations are also provided in a separate text file. All of these files can be found in the dataset.tar.gz archive.
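
A hedged sketch for reading such a split into memory is given below. The folder names, the label file name (labels.txt) and the assumption that each feature file holds one row per observation are illustrative only; in particular, haptic texture and hardness may be split further into one file per microphone, so the file list may need adjusting.

from pathlib import Path
import numpy as np

FEATURE_FILES = ["visual_shape.txt", "color.txt", "visual_texture.txt",
                 "haptic_shape.txt", "weight.txt",
                 "haptic_texture.txt", "hardness.txt"]

def load_split(split_dir):
    split_dir = Path(split_dir)
    # Stack the per-property feature matrices column-wise into one design matrix.
    features = [np.atleast_2d(np.loadtxt(split_dir / name)) for name in FEATURE_FILES]
    X = np.hstack(features)                              # (n_observations, n_features)
    y = np.loadtxt(split_dir / "labels.txt", dtype=str)  # hypothetical label file name
    return X, y

X_train, y_train = load_split("dataset/training")        # hypothetical folder names
X_test, y_test = load_split("dataset/test")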

Contact

For more information, please refer to the paper. For specific questions regarding the paper, please contact Sibel Toprak.