Dataset posted on 20.11.2018 by Yiqing Hua
WikiConv is a corpus encompassing the full history of conversations on Wikipedia.
The dataset and the reconstruction process for this corpus have been published in the paper WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community, presented at EMNLP 2018.
The work has also been presented at the June 2018 Wikipedia research showcase (the first half describes our work, using an earlier version of this dataset to predict conversations going awry).