sac2020_klp_micro_influencer.zip (375.12 MB)

SAC 2020 KLP micro-influencer dataset

Posted on 2019-12-04, 14:59. Authored by Simone Leonardi, Diego Monti, Giuseppe Rizzo, Maurizio Morisio.
This dataset contains Five Factor Model and Basic Human Values scores computed from text, together with the Gold Standard defined for micro-influencers. Gold Standard labels are in the "y" subfolder inside each "scores" folder. For ease of use, a folder called "allinone" stores the data from all topics in a single CSV file. Each topic folder is named after a hashtag and contains the data obtained by searching for that hashtag on Twitter. Data were retrieved from September 2018 to March 2019.
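
As a starting point for working with the combined data, the sketch below reads the single CSV file from the "allinone" folder using only the Python standard library. The function name `load_allinone`, the location of "allinone" directly under the extracted archive root, and the UTF-8 encoding are assumptions; the description does not state the CSV file name or its columns, so the code discovers both at run time instead of hard-coding them.

```python
import csv
from pathlib import Path


def load_allinone(dataset_root):
    """Read the single CSV expected under <dataset_root>/allinone.

    Returns (column_names, rows), where rows is a list of dicts.
    The folder location and the UTF-8 encoding are assumptions.
    """
    folder = Path(dataset_root) / "allinone"
    csv_files = sorted(folder.glob("*.csv"))
    if len(csv_files) != 1:
        raise FileNotFoundError(
            f"expected exactly one CSV in {folder}, found {len(csv_files)}"
        )
    with csv_files[0].open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)  # every value is kept as a string
        return reader.fieldnames, rows
```

Calling `load_allinone("path/to/extracted/archive")` and printing the returned column names is a quick way to see which scores and labels are actually present before doing any analysis.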

The "candidates" folder contains the list of Twitter users considered in the analysis. They had between 1k and 100k followers at the time of the analysis and were posting tweets with the hashtag used as the folder name when we performed the experiment.

The "followers" folder contains as many files as there are users in the "candidates" folder; each file lists the followers of the associated candidate at the time we performed the experiment.

The "tweets" folder contains as many files as there are users in the "candidates" folder; each file contains the tweets (whole history) of the associated candidate at the time we performed the experiment. Due to privacy concerns, only tweet IDs are stored. You need to query Twitter to retrieve the associated text.
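
Because only tweet IDs are stored, retrieving the text means sending the IDs back to Twitter in groups. The sketch below shows just the grouping step and makes no network calls. The helper name `batch_tweet_ids` is hypothetical, and the default batch size of 100 is an assumption based on the per-request limit of Twitter's tweet lookup endpoints; check the current API documentation before relying on it. How the IDs are laid out inside each file is not described, so the helper accepts any iterable of IDs.

```python
def batch_tweet_ids(tweet_ids, batch_size=100):
    """Yield lists of at most batch_size tweet IDs, preserving order.

    IDs are normalised to stripped strings; blank entries are skipped.
    The default batch size is an assumption, not part of the dataset.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    batch = []
    for tweet_id in tweet_ids:
        cleaned = str(tweet_id).strip()
        if not cleaned:
            continue  # ignore empty lines or padding
        batch.append(cleaned)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch
```

Each yielded list can then be passed to whichever Twitter client you use. Some IDs may no longer resolve if the tweets were deleted or the accounts became private after the collection period.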

The "scores" folder contains both the scores computed by us using our algorithms and the micro-influencer labels from our gold standard (in the subfolder "y").
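
The folder layout described above can be sanity-checked after extraction. The following is a minimal sketch, assuming each topic folder directly contains the subfolders "candidates", "followers", "tweets", and "scores" (with "y" inside "scores"); the function names are hypothetical. It only counts files and compares the "followers" and "tweets" counts, which the description says should both equal the number of candidates.

```python
from pathlib import Path

# Subfolders named in the dataset description; the nesting is an assumption.
SUBFOLDERS = ("candidates", "followers", "tweets", "scores", "scores/y")


def count_files(topic_dir):
    """Return {subfolder: number of files}, or None where the folder is missing."""
    topic = Path(topic_dir)
    counts = {}
    for name in SUBFOLDERS:
        folder = topic / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[name] = None
    return counts


def followers_match_tweets(counts):
    """True when both folders exist and hold the same number of files."""
    followers = counts.get("followers")
    return followers is not None and followers == counts.get("tweets")
```

A mismatch does not necessarily mean the archive is corrupted; it may only mean that the layout assumed here differs from the actual one.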

For further details, see the paper "Mining Micro-Influencers from Social Media Posts" in the KLP SAC2020 proceedings, or open an issue in the GitHub repository.
