PHEME dataset of rumours and non-rumours

<div>This dataset contains a collection of Twitter rumours and non-rumours posted during breaking news events. The five events covered by the dataset are as follows:</div><div><br></div><div>* Charlie Hebdo: 458 rumours (22.0%) and 1,621 non-rumours (78.0%).</div><div>* Ferguson: 284 rumours (24.8%) and 859 non-rumours (75.2%).</div><div>* Germanwings Crash: 238 rumours (50.7%) and 231 non-rumours (49.3%).</div><div>* Ottawa Shooting: 470 rumours (52.8%) and 420 non-rumours (47.2%).</div><div>* Sydney Siege: 522 rumours (42.8%) and 699 non-rumours (57.2%).</div><div><br></div><div>The data is structured as follows. Each event has its own directory containing two subfolders, 'rumours' and 'non-rumours'. Each of these subfolders contains one folder per tweet, named with the tweet's ID. Inside that folder, the 'source-tweet' directory holds the tweet itself, and the 'reactions' directory holds the set of tweets responding to that source tweet.</div><div><br></div><div>This dataset was used for rumour detection in the paper 'Learning Reporting Dynamics during Breaking News for Rumour Detection in Social Media'. For more details, please refer to the paper.</div><div><br></div><div>License: The annotations are provided under a CC-BY license, while Twitter retains ownership of and rights to the content of the tweets.</div>
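<div>The directory layout described above can be traversed with a short script. The following is a minimal sketch, not an official loader. The folder names 'rumours', 'non-rumours', 'source-tweet' and 'reactions' come from the description. The helper names (iter_threads, load_thread, count_labels), the assumption that each tweet is stored as a UTF-8 '.json' file, and the skipping of hidden files are illustrative assumptions that should be checked against the extracted archive.</div>

```python
import json
from collections import Counter
from pathlib import Path

# Label subfolder names, as given in the dataset description.
LABELS = ("rumours", "non-rumours")


def iter_threads(root):
    """Yield (event_name, label, thread_dir) for every tweet-ID folder.

    `root` is the directory that holds one subdirectory per event.
    """
    root = Path(root)
    for event_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for label in LABELS:
            label_dir = event_dir / label
            if not label_dir.is_dir():
                continue
            for thread_dir in sorted(p for p in label_dir.iterdir() if p.is_dir()):
                yield event_dir.name, label, thread_dir


def _load_json_files(folder):
    """Load every visible *.json file in `folder` (assumed file format)."""
    if not folder.is_dir():
        return []
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(folder.glob("*.json"))
        if not path.name.startswith(".")  # skip hidden files, if any
    ]


def load_thread(thread_dir):
    """Return (source_tweet_or_None, list_of_reactions) for one thread folder."""
    thread_dir = Path(thread_dir)
    sources = _load_json_files(thread_dir / "source-tweet")
    reactions = _load_json_files(thread_dir / "reactions")
    return (sources[0] if sources else None), reactions


def count_labels(root):
    """Count tweet-ID folders per (event, label) pair."""
    counts = Counter()
    for event, label, _thread_dir in iter_threads(root):
        counts[(event, label)] += 1
    return counts
```

<div>Calling count_labels on the dataset root gives the number of tweet-ID folders per event and label, which can be compared with the per-event figures listed above as a quick check that the archive was extracted completely.</div>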