Twitter Death Hoaxes dataset

2018-03-11T21:14:53Z (GMT) by Arkaitz Zubiaga
This is a dataset of death reports collected from Twitter between 1 January 2012 and 31 December 2014. It was collected by tracking the keyword 'RIP' and keeping those tweets in which a name is mentioned next to 'RIP'. Matching names were identified using Wikidata as a database of names. For more details, please refer to the paper: https://arxiv.org/abs/1801.07311

The dataset contains 4,007 death reports, of which 2,301 are real deaths, 1,092 are commemorations and 614 are fake deaths.

Along with this dataset, the word embedding models used in the paper are also provided.

This dataset is released in accordance with Twitter's TOS, which allows sharing of tweet IDs, and is intended for non-commercial research.

Note: Twitter's developer policy doesn't allow sharing more than 1,500,000 tweet IDs (https://dev.twitter.com/overview/terms/policy#updated-policy), unless the author is affiliated with an academic institution (which is my case) and the tweet IDs are used solely for non-commercial purposes (https://twittercommunity.com/t/policy-update-clarification-research-use-cases/87566). Hence, by downloading these datasets you agree that you will not use them for commercial purposes.
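As a rough illustration of the collection step (tracking 'RIP' and keeping tweets where a name appears next to it), the following is a minimal sketch and not the procedure used in the paper. The name set `KNOWN_NAMES`, the function `names_next_to_rip`, the three-word window and the tokenisation rules are all assumptions made for this example; in practice the name list would be derived from Wikidata, and the paper should be consulted for the actual matching rules.

```python
import re

# Hypothetical stand-in for a list of person names derived from Wikidata.
KNOWN_NAMES = {"nelson mandela", "robin williams", "whitney houston"}

# 'RIP' as a standalone token, optionally written 'R.I.P.', case-insensitive.
RIP_TOKEN = re.compile(r"\br\.?i\.?p\.?(?!\w)", re.IGNORECASE)

# Simplified word pattern (ASCII letters only; an assumption of this sketch).
WORD = re.compile(r"[A-Za-z][A-Za-z'\-]*")


def names_next_to_rip(text, known_names=KNOWN_NAMES, max_words=3):
    """Return known names that appear directly before or after an 'RIP' token."""
    found = []
    for match in RIP_TOKEN.finditer(text):
        # Up to max_words words on each side of the RIP token.
        after = WORD.findall(text[match.end():])[:max_words]
        before = WORD.findall(text[:match.start()])[-max_words:]

        # Candidate word sequences touching RIP, longest first.
        candidates = [after[:n] for n in range(len(after), 0, -1)]
        candidates += [before[-n:] for n in range(len(before), 0, -1)]

        for words in candidates:
            name = " ".join(words).lower()
            if name in known_names:
                if name not in found:
                    found.append(name)
                break  # keep at most one name per RIP occurrence
    return found


if __name__ == "__main__":
    print(names_next_to_rip("RIP Nelson Mandela, a true hero"))
    print(names_next_to_rip("Robin Williams R.I.P. you will be missed"))
    print(names_next_to_rip("Road trip with friends"))
```

A tweet is kept only when the returned list is non-empty. This sketch does not handle non-ASCII names or name variants, and it says nothing about whether a matched report is a real death, a commemoration or a fake death.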