figshare
30 files

Wikipedia Talk Corpus

Version 3 2017-01-17, 21:50
Version 2 2016-12-13, 20:27
Version 1 2016-12-03, 17:58
dataset
posted on 2017-01-17, 21:50 authored by Ellery Wulczyn, Nithum Thain, Lucas Dixon
We provide a corpus of discussion comments from English Wikipedia talk pages. Comments are grouped into different files by year. Comments are generated by computing diffs over the full revision history and extracting the content added for each revision. See our wiki for documentation of the schema and our research paper for documentation on the data collection and processing methodology.
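The diff-based extraction described above can be illustrated with a small sketch. This is not the authors' pipeline (the research paper documents that); it is a minimal example that assumes each revision is available as a plain-text string in chronological order and uses Python's standard `difflib` for a line-level diff. The function names `added_content` and `comments_from_history` are hypothetical.

```python
import difflib


def added_content(old_text: str, new_text: str) -> str:
    """Return the lines that a line-level diff marks as inserted or replaced in new_text."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" opcodes both introduce new lines on the b side.
        if tag in ("insert", "replace"):
            added.extend(new_lines[j1:j2])
    return "\n".join(added)


def comments_from_history(revisions):
    """Treat the text added by each revision as one comment (illustrative assumption)."""
    previous = ""
    comments = []
    for text in revisions:
        added = added_content(previous, text)
        if added.strip():  # skip revisions that only deleted or reordered nothing new
            comments.append(added)
        previous = text
    return comments


# Hypothetical talk-page history: three revisions, each appending one line.
history = [
    "Hello.",
    "Hello.\nI disagree.",
    "Hello.\nI disagree.\nWhy?",
]
comments = comments_from_history(history)
```

A line-level diff is the simplest choice here; a real pipeline would also need to handle wiki markup, edits within a line, and reverts, none of which this sketch attempts.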
