WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community

Published on (GMT) by Lucas Dixon
The WikiConv Corpus encompasses the full history of conversations on Wikipedia Talk Pages. The project webpage for this work is at: https://github.com/conversationai/wikidetox/tree/master/wikiconv. The dataset and its reconstruction process are described in the paper [WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community](https://arxiv.org/abs/1810.13181), presented at [EMNLP 2018](http://EMNLP2018.org). The work was also presented at [the June 2018 Wikimedia Research Showcase](https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#June_2018) (the first half describes our work, which used an earlier version of this dataset to predict [conversations going awry](https://arxiv.org/abs/1805.05345)). The metadata in this corpus is governed by the [CC0 license v1.0](http://creativecommons.org/publicdomain/zero/1.0/), and the content of the comments is governed by the [CC BY-SA license v3.0](https://creativecommons.org/licenses/by-sa/3.0/).