figshare
3 files

Wikipedia Talk Labels: Aggression

Version 5 2017-02-22, 18:50
Version 4 2017-01-18, 23:31
Version 3 2017-01-17, 22:49
Version 2 2016-12-13, 20:25
Version 1 2016-12-03, 03:27
dataset
posted on 2017-02-22, 18:50 authored by Ellery Wulczyn, Nithum Thain, Lucas Dixon
This data set includes over 100k labeled discussion comments from English Wikipedia. Each comment was labeled by multiple annotators via Crowdflower on whether it has an aggressive tone. We also include some demographic data for each crowd-worker. See our wiki for documentation of the schema of each file, and our research paper for documentation of the data collection and modeling methodology. For a quick demo of how to use the data for model building and analysis, check out this IPython notebook.
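
Because each comment carries labels from several annotators, a common first step is to collapse them into one label per comment. The sketch below shows one possible way to do that, a simple majority vote, on a small made-up sample. The tuple layout, the identifiers, and the `majority_labels` helper are all hypothetical and do not reflect the actual file schema, which is documented in the wiki mentioned above. Majority voting is only one aggregation choice and is not necessarily the one used in the research paper.

```python
from collections import defaultdict

# Hypothetical toy annotations: (comment_id, worker_id, is_aggressive).
# These values are invented for illustration; the real column names and
# file layout are described in the dataset's wiki, not here.
annotations = [
    (1, "a", 1), (1, "b", 1), (1, "c", 0),
    (2, "a", 0), (2, "b", 0), (2, "c", 1),
    (3, "a", 0), (3, "b", 0),
]


def majority_labels(rows):
    """Map each comment_id to True when more than half of its annotators
    flagged it as aggressive. Ties resolve to False."""
    votes = defaultdict(list)
    for comment_id, _worker_id, flag in rows:
        votes[comment_id].append(flag)
    return {cid: sum(flags) / len(flags) > 0.5 for cid, flags in votes.items()}


labels = majority_labels(annotations)
print(labels)
```

Keeping the per-comment fraction of positive votes instead of thresholding it would preserve annotator disagreement, which may matter if the labels are later used as soft targets.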
