Image and sound data from film Fantasia produced by Walt Disney

Version 3 2018-03-19, 11:22
Version 2 2018-03-19, 11:00
Version 1 2018-03-19, 10:01
dataset
posted on 2018-03-19, 11:22 authored by Lucía Martín-Gómez, Javier Pérez-Marcos
This repository contains the data used in the article "Convolutional neural networks and transfer learning applied to automatic composition of descriptive music", published in the 15th International Conference on Distributed Computing and Artificial Intelligence (DCAI). The data structure is explained in detail in the article. This proposal continues an earlier work whose data are available in a GitHub repository.

Abstract
The visual and musical arts have been strongly interconnected throughout history. The aim of this work is to compose music on the basis of the visual characteristics of a video. For this purpose, descriptive music is used as a link between image and sound, and a video fragment of the film Fantasia is analyzed in depth. Specifically, convolutional neural networks in combination with transfer learning are applied to extract image descriptors. To establish a relationship between the visual and musical information, Naive Bayes, Support Vector Machine and Random Forest classifiers are applied. The resulting model is subsequently used to compose descriptive music from a new video. The results of this proposal are compared with those of a previous work in order to evaluate the performance of the classifiers and the quality of the descriptive musical composition.

DATA
  1. train_data.arff: Image descriptors, obtained by means of CNNs, and the most important sound of each frame from the fragment "The Nutcracker Suite" in the film Fantasia. Data stored in ARFF format.
  2. test_data.arff: Image descriptors, obtained by means of CNNs, of each frame from the fragment "The Firebird" in the film Fantasia 2000. Data stored in ARFF format.
  3. midi.csv: Frame number of the fragment "The Firebird" in the film Fantasia 2000 and the sound predicted by the system, encoded in MIDI. Data stored in CSV format.
  4. firebird_prediction.mp3: Audio file with the synthesis of the predicted data for the fragment "The Firebird" in the film Fantasia 2000.
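
As a rough illustration of how the ARFF and CSV files listed above might be read, the following Python sketch uses only the standard library. It is not the authors' code. The exact layout of the files is documented in the article, not here, so the attribute names (`f0`, `f1`, `note`), the CSV column names (`frame`, `midi`), the presence of a CSV header row, and the assumption that sounds are stored as MIDI note numbers are all hypothetical. The sample data is invented for the demonstration.

```python
import csv
import io


def read_arff(lines):
    """Parse a dense ARFF stream into (attribute_names, rows).

    Handles only a basic subset: @attribute declarations and
    comma-separated @data rows, without quoting or sparse notation.
    """
    names, rows, in_data = [], [], False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([cell.strip() for cell in line.split(",")])
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type>": keep the name only
            names.append(line.split(None, 2)[1])
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(number):
    """Convert a MIDI note number to a name, taking note 60 as C4."""
    return f"{NOTE_NAMES[number % 12]}{number // 12 - 1}"


def midi_to_hz(number):
    """Equal-tempered frequency in Hz, with A4 (note 69) at 440 Hz."""
    return 440.0 * 2 ** ((number - 69) / 12)


# Invented sample standing in for train_data.arff (hypothetical attributes).
SAMPLE_ARFF = """\
@relation demo
@attribute f0 numeric
@attribute f1 numeric
@attribute note {60,62}
@data
0.1,0.5,60
0.3,0.2,62
"""
names, rows = read_arff(io.StringIO(SAMPLE_ARFF))

# Invented sample standing in for midi.csv (hypothetical header and columns).
SAMPLE_CSV = "frame,midi\n1,60\n2,69\n"
notes = [
    (int(record["frame"]), midi_to_name(int(record["midi"])))
    for record in csv.DictReader(io.StringIO(SAMPLE_CSV))
]
```

To read the real files, pass an open file handle instead of the `io.StringIO` samples and adjust the column names to match. For anything beyond this basic subset of ARFF, an established reader such as `scipy.io.arff.loadarff` is likely a safer choice than this hand-written parser.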
LICENSE
The data are available under the MIT License. To use the data, the article must be cited.
