tweets.metadata.json (0.48 kB)
tweets.bson (1.92 GB)
This item contains files with download restrictions
2 files
Peacock Chinese Twitter Corpus (PCTC)
dataset
posted on 2020-12-26, 19:09
authored by Xiaowen Nie, Weiyang Mo
The Peacock Chinese Twitter Corpus (PCTC) contains 4,911,813 tweets (original tweets and replies, excluding retweets) written in Simplified Chinese between 2007 and 2020. The documents are stored in MongoDB in JSON format.
User Interface: www.peacockpus.com
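Since the corpus is distributed as a BSON file, a quick integrity check before importing it into MongoDB may be useful. The following is a minimal sketch, assuming tweets.bson is a standard dump in which BSON documents are simply concatenated, each starting with a little-endian 32-bit length that covers the whole document and ending with a null byte. The record does not confirm how the file was produced, and the function names iter_bson_documents and count_documents are hypothetical helpers, not part of any library. Only the Python standard library is used; the documents are split but not decoded.

```python
import io
import struct
from typing import BinaryIO, Iterator


def iter_bson_documents(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each raw BSON document from a stream of concatenated documents.

    Assumption: the stream follows the usual dump layout, where every
    document begins with an int32 (little-endian) giving its total size,
    including the 4 length bytes and the trailing 0x00 terminator.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<i", header)
        if size < 5:  # smallest valid document: 4 length bytes + terminator
            raise ValueError(f"invalid BSON document size: {size}")
        body = stream.read(size - 4)
        if len(body) != size - 4 or body[-1] != 0:
            raise ValueError("truncated or malformed BSON document")
        yield header + body


def count_documents(stream: BinaryIO) -> int:
    """Count documents without decoding them (constant memory per document)."""
    return sum(1 for _ in iter_bson_documents(stream))


# In-memory demo: two empty BSON documents (5 bytes each).
demo = io.BytesIO(b"\x05\x00\x00\x00\x00" * 2)
demo_count = count_documents(demo)
```

To apply this to the download, open the file in binary mode, for example `with open("tweets.bson", "rb") as f: print(count_documents(f))`, and compare the result with the tweet count given in the description. Decoding the individual fields would need a BSON library; this sketch only checks the framing.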
Categories
Linguistics not elsewhere classified
Keywords
Chinese language
corpus
linguistic research
Linguistics
Licence
CC BY 4.0