Tweets dataset on Zika virus
Comma-separated values (CSV) text file, distributed as a compressed archive, containing tweets that mention the Zika virus.
Number of rows: 2782022
Environment:
=> Debian GNU/Linux 9.9 (stretch).
=> PostgreSQL 9.6.15 on x86_64-pc-linux-gnu.
The data was captured from October 2017 to March 2018 and stored in a PostgreSQL table with the following structure:
texto: text
data: timestamp without time zone
dados: jsonb /* The contents of the tweet */
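As a rough illustration of how one line of the CSV file could map back onto these three columns, here is a minimal Python sketch. It assumes the columns appear in the order listed above (texto, data, dados), that the file has no header line (the export used "with csv" without a HEADER option), and that timestamps are written in PostgreSQL's default ISO style. The helper names parse_timestamp and parse_row, and the sample line, are made up for this example and are not part of the dataset.

```python
import csv
import io
import json
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO-style timestamp, with or without fractional seconds."""
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized timestamp: %r" % value)


def parse_row(row):
    """Convert one CSV row (a list of three strings) into typed values.

    Assumption: column order is texto, data, dados. An empty field is
    treated as NULL, which is how PostgreSQL writes NULL in CSV by default.
    """
    texto, data, dados = row
    return {
        "texto": texto,
        "data": parse_timestamp(data) if data else None,
        "dados": json.loads(dados) if dados else None,
    }


# Made-up sample line, shaped like the export (not taken from the dataset).
sample = 'Zika alert,2017-10-05 14:03:21,"{""id"": 1, ""lang"": ""pt""}"\n'
record = parse_row(next(csv.reader(io.StringIO(sample))))
print(record["data"], record["dados"])
```

The csv module undoes the doubled quotes that PostgreSQL uses to escape the JSON text, so json.loads receives a valid JSON document.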
The CSV file was created from the PostgreSQL table with the commands:
zika@zika:~$ psql
psql (9.6.15)
Type "help" for help.
zika=> \copy (SELECT * FROM tweets) to '/home/zika/tweets.csv' with csv
The CSV file was compressed with the command:
zika@zika:~$ tar -cvzf tweets.tgz tweets.csv
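For readers who prefer not to unpack the archive first, the following Python sketch streams rows straight out of the compressed file. It assumes the archive is named tweets.tgz, that it contains a single member called tweets.csv (as the tar command above suggests), and that the text is UTF-8 encoded, which depends on the client encoding used during the export and is not stated here. The function name iter_rows is hypothetical.

```python
import csv
import io
import os
import tarfile

# Tweet JSON can be long; raise the per-field limit above the default.
csv.field_size_limit(10**7)


def iter_rows(archive_path, member="tweets.csv"):
    """Yield CSV rows from one member of a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        raw = archive.extractfile(member)
        if raw is None:
            raise FileNotFoundError(member)
        # newline="" lets the csv module handle line breaks inside quoted fields.
        with io.TextIOWrapper(raw, encoding="utf-8", newline="") as text:
            yield from csv.reader(text)


if __name__ == "__main__" and os.path.exists("tweets.tgz"):
    # Count parsed rows; this can be compared with the row count stated above.
    print(sum(1 for _ in iter_rows("tweets.tgz")))
```

Counting parsed rows rather than physical lines matters because a quoted field may contain line breaks, so a plain line count could differ from the number of rows.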