%0 Generic
%A Redi, Miriam
%A Morgan, Jonathan
%A Taraborelli, Dario
%D 2019
%T Citation Reason Dataset
%U https://figshare.com/articles/dataset/Citation_Reason_Dataset/7756226
%R 10.6084/m9.figshare.7756226.v1
%2 https://ndownloader.figshare.com/files/14441312
%K Wikipedia articles
%K citations
%K citation needed
%K dataset
%K crowdsourcing
%K mechanical turk
%K Computer-Human Interaction
%K Knowledge Representation and Machine Learning
%X
A dataset of ~4K statements from English Wikipedia annotated with the reason why they need a citation.
Each line of this tab-separated file contains:
* entity_id: the Wikidata ID corresponding to the page
* revision_id: the revision of the corresponding Wikipedia article
* timestamp: the timestamp of the revision
* entity_title: the title of the page (the label of the Wikidata item)
* section_id: the section ID where the statement is
* section: the section title
* prg_idx: the index of the paragraph in the page
* sentence_idx: the index of the statement in the paragraph
* statement: the statement text
* citations: the source cited in the statement
* vote1: first Mechanical Turk judgment
* vote2: second Mechanical Turk judgment
* vote3: third Mechanical Turk judgment
The numbers in the last three fields (vote1, vote2, vote3) correspond to the following citation reasons:
1='direct quotation'
2='statistics'
3='controversial'
4='opinion'
5='life'
6='scientific'
7='historical'
8='other'
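As an illustration of the layout described above, here is a minimal Python sketch for reading the file. It assumes the columns appear in the order listed and that the three vote fields hold the integer codes 1 to 8. The names FIELDS, REASONS, parse_rows and majority_reason are hypothetical, and the majority-vote step is one possible way to aggregate the three judgments, not a method documented for this dataset.

```python
import csv
import io
from collections import Counter

# Column order as listed in the description (assumed to match the file).
FIELDS = [
    "entity_id", "revision_id", "timestamp", "entity_title",
    "section_id", "section", "prg_idx", "sentence_idx",
    "statement", "citations", "vote1", "vote2", "vote3",
]

# Numeric codes for the citation reasons, copied from the description.
REASONS = {
    1: "direct quotation", 2: "statistics", 3: "controversial",
    4: "opinion", 5: "life", 6: "scientific", 7: "historical", 8: "other",
}


def parse_rows(handle):
    """Yield one dict per well-formed line of the tab-separated input."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # skip lines that do not have the 13 expected fields
        record = dict(zip(FIELDS, row))
        try:
            votes = [int(record[k]) for k in ("vote1", "vote2", "vote3")]
        except ValueError:
            continue  # skip lines with non-numeric votes, e.g. a header row
        record["reasons"] = [REASONS.get(v, "unknown") for v in votes]
        yield record


def majority_reason(record):
    """Return the reason chosen by at least two annotators, else None."""
    label, count = Counter(record["reasons"]).most_common(1)[0]
    return label if count >= 2 else None
```

To use it on a local copy of the file, open the file with encoding="utf-8" and newline="" and pass the handle to parse_rows; each yielded record then carries the decoded reasons next to the original fields.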
%I figshare