3 files

Zika tweets and topics (2015-03-01 to 2016-10-31)

Total download size: 399.46 MB
modified on 2019-05-08, 23:07
This collection contains the identifiers and metadata of all tweets mentioning Zika ("zika", "zica", "zikv") from March 1, 2015 through October 31, 2016. The tweets can be retrieved using the Twitter API (

The tweets are provided in two files. The first file contains tweets in English, Spanish, and Portuguese. A polylingual topic model with 50 topics was applied to these tweets, and the final 50 columns of this file hold each tweet's probability for each of those topics. The second file contains tweets in all other languages and has no topic probability columns.
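
As a minimal sketch of how the trailing topic columns might be separated from the tweet metadata, the Python snippet below splits one parsed row into its leading fields and its final 50 values. The delimiter and column layout of the tweet files are not stated in this description, so the tab delimiter, the helper names `split_topic_columns` and `dominant_topic`, and the synthetic sample row are all assumptions for illustration.

```python
import csv
import io

N_TOPICS = 50  # number of topic-probability columns stated in the description


def split_topic_columns(row, n_topics=N_TOPICS):
    """Split a parsed row into (metadata fields, topic probabilities).

    Assumes the last `n_topics` fields of the row are the topic probabilities.
    """
    if len(row) < n_topics:
        raise ValueError("row has fewer fields than topic columns")
    cut = len(row) - n_topics
    metadata = row[:cut]
    probabilities = [float(value) for value in row[cut:]]
    return metadata, probabilities


def dominant_topic(probabilities):
    """Return the zero-indexed position of the largest probability."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Synthetic one-row example; real rows would come from the first tweet file.
# The tab delimiter and the two metadata fields are assumptions.
sample_probs = [0.01] * N_TOPICS
sample_probs[7] = 0.51
sample_line = "\t".join(["123456789", "en"] + [str(p) for p in sample_probs])

reader = csv.reader(io.StringIO(sample_line), delimiter="\t")
metadata, probabilities = split_topic_columns(next(reader))
print(metadata, dominant_topic(probabilities))
```

Counting columns from the end of the row, rather than from a fixed offset, keeps the sketch independent of how many metadata fields precede the topic columns.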

In addition to the tweet files, topic_topwords.txt shows the words associated with each of the 50 topics, output by MALLET. The first line is the topic identifier (zero-indexed), which you can link to the topic columns in the tweet files.
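
The link between a zero-indexed topic identifier and its column can be sketched as simple index arithmetic. This assumes the 50 topic columns appear in ascending topic order at the end of each row, which the description suggests but does not state outright; `topic_column_index` is a hypothetical helper, not part of the dataset.

```python
N_TOPICS = 50  # number of topics stated in the description


def topic_column_index(n_columns, topic_id, n_topics=N_TOPICS):
    """Return the zero-based column index holding the given topic's probability.

    Assumes the final `n_topics` columns are ordered by topic identifier.
    """
    if n_columns < n_topics:
        raise ValueError("row has fewer columns than topics")
    if not 0 <= topic_id < n_topics:
        raise ValueError("topic_id must be between 0 and n_topics - 1")
    return n_columns - n_topics + topic_id


# Example with a hypothetical row of 60 columns (10 metadata + 50 topics).
first_topic_column = topic_column_index(60, 0)
last_topic_column = topic_column_index(60, 49)
print(first_topic_column, last_topic_column)
```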