Main Dataset for "Evolution of Popular Music: USA 1960–2010"

2015-04-06T20:15:48Z (GMT) by Matthias Mauch

This is a large file (~20MB) called EvolutionPopUSA_MainData.csv, in comma-separated values (CSV) format with a header row of column names. Each row corresponds to one recording. The file can be viewed in any text editor, opened in Excel, or imported into other data processing programs.
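As a quick check after downloading, the header row and the number of recordings can be read with Python's standard csv module. This is a minimal sketch: the helper name read_header_and_count is hypothetical, and the two-column sample stands in for the real file so that the snippet runs on its own.

```python
import csv
import io


def read_header_and_count(stream):
    """Return (list of column headers, number of data rows) for a CSV stream."""
    reader = csv.reader(stream)
    header = next(reader)             # first line holds the column headers
    n_rows = sum(1 for _ in reader)   # every remaining line is one recording
    return header, n_rows


# Tiny in-memory stand-in for the real file (illustrative columns only).
sample = io.StringIO("col_a,col_b\n1,2\n3,4\n")
header, n_rows = read_header_and_count(sample)
print(header, n_rows)

# With the downloaded file in the working directory (path is an assumption):
# with open("EvolutionPopUSA_MainData.csv", newline="") as f:
#     header, n_rows = read_header_and_count(f)
```

Streaming the rows rather than loading them all at once keeps memory use low, which is convenient for a file of this size.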

Below is a list of the column headers, with annotations.

unique ID of the recording

name of the recording artist

artist name in all upper case, without spaces, and with secondary artists ("featuring") removed

name of the track, i.e. usually name of the song

first_entry
date of the first entry into the Billboard Hot 100

quarter, year, fiveyear, decade
transformations of first_entry to coarser time periods
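To illustrate what such transformations could look like, the sketch below derives the four coarser periods from a single date. The encodings are assumptions, not taken from the file: quarter as a number from 1 to 4, and fiveyear and decade as the first year of their period. The values stored in the CSV may be encoded differently. The function name coarsen is hypothetical.

```python
from datetime import date


def coarsen(first_entry: date) -> dict:
    """Map a chart-entry date to coarser periods (assumed encodings)."""
    year = first_entry.year
    return {
        "quarter": (first_entry.month - 1) // 3 + 1,  # 1..4 within the year
        "year": year,
        "fiveyear": year - year % 5,                  # start year of the 5-year bin
        "decade": year - year % 10,                   # start year of the decade
    }


periods = coarsen(date(1964, 8, 15))
print(periods)
```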

era the track belongs to (1,...,4), as determined by Foote segmentation on the PC data (see below)

cluster membership of the track, as derived by k-means clustering on the PC data (see below)

hTopic_01, ... , hTopic_08
harmonic Topic weights, see description in the paper

tTopic_01, ... , tTopic_08
timbral Topic weights, see description in the paper

PC1, ... , PC14
principal components of the harmonic and timbral Topics

193 columns of chord change counts; the chord change is indicated in the column label (e.g. harm_M.2.M means a major chord followed by another major chord 2 semitones up)
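The label convention can be unpacked programmatically. The sketch below assumes every chord change label has the form harm_<first chord type>.<semitones>.<second chord type>, generalised from the single example harm_M.2.M; labels in the file that do not fit this pattern are rejected with an error. The names ChordChange and parse_chord_change are hypothetical.

```python
from typing import NamedTuple


class ChordChange(NamedTuple):
    source: str     # chord type of the first chord, e.g. "M" for major
    semitones: int  # interval between the two chords, in semitones
    target: str     # chord type of the second chord


def parse_chord_change(label: str) -> ChordChange:
    """Split a label such as 'harm_M.2.M' into its three parts."""
    prefix = "harm_"
    if not label.startswith(prefix):
        raise ValueError(f"not a chord change column: {label!r}")
    parts = label[len(prefix):].split(".")
    if len(parts) != 3:
        raise ValueError(f"unexpected label layout: {label!r}")
    source, semitones, target = parts
    return ChordChange(source, int(semitones), target)


change = parse_chord_change("harm_M.2.M")
print(change)
```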

timb_01, ... , timb_35
35 columns of timbre class counts (see description in supplementary information)
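Because the feature columns share prefixes, they can be grouped by name before analysis. The following is a minimal sketch, assuming the prefixes hTopic_, tTopic_, PC, harm_ and timb_ as they appear in the list above; the header used here is synthetic (with a single chord change column) so that the snippet runs without the file, and group_columns is a hypothetical helper.

```python
PREFIXES = ("hTopic_", "tTopic_", "PC", "harm_", "timb_")


def group_columns(header):
    """Group column names by feature prefix; unmatched names go to 'other'."""
    groups = {prefix: [] for prefix in PREFIXES}
    groups["other"] = []
    for name in header:
        for prefix in PREFIXES:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
        else:
            groups["other"].append(name)
    return groups


# Synthetic header built from the names listed above (not the full real header).
header = (
    ["first_entry", "quarter", "year", "fiveyear", "decade"]
    + [f"hTopic_{i:02d}" for i in range(1, 9)]
    + [f"tTopic_{i:02d}" for i in range(1, 9)]
    + [f"PC{i}" for i in range(1, 15)]
    + ["harm_M.2.M"]
    + [f"timb_{i:02d}" for i in range(1, 36)]
)
groups = group_columns(header)
print({key: len(names) for key, names in groups.items()})
```

Applied to the real header, the harm_ group should contain 193 names if the file matches the description above.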