WikiConv - English

WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community



This directory contains the WikiConv Corpus, which encompasses the full history of conversations on Wikipedia Talk Pages.


The project webpage for this work is at: https://github.com/conversationai/wikidetox/tree/master/wikiconv


The dataset and reconstruction process for this corpus have been published in the paper [WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community](https://arxiv.org/abs/1810.13181), presented at [EMNLP 2018](http://EMNLP2018.org).


The work has also been presented at [the June 2018 Wikimedia Research Showcase](https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#June_2018) (the first half describes our work, using an earlier version of this dataset to predict [conversations going awry](https://arxiv.org/abs/1805.05345)).


The metadata in this corpus is governed by the [CC0 license v1.0](http://creativecommons.org/publicdomain/zero/1.0/), and the content of the comments is governed by the [CC BY-SA license v3.0](https://creativecommons.org/licenses/by-sa/3.0/).