Redi, Miriam; Morgan, Jonathan; Taraborelli, Dario
Citation Reason Dataset
<div><b>A dataset of ~4K statements from English Wikipedia, each annotated with the reason why it needs a citation.</b></div>
<div><br></div>
<div>Each line of this tab-separated file contains the following fields:</div>
<div>* <i>entity_id</i>: the Wikidata ID corresponding to the page</div>
<div>* <i>revision_id</i>: the revision of the corresponding Wikipedia article</div>
<div>* <i>timestamp</i>: the timestamp of the revision</div>
<div>* <i>entity_title</i>: the title of the page / Wikidata entity</div>
<div>* <i>section_id</i>: the ID of the section containing the statement</div>
<div>* <i>section</i>: the section title</div>
<div>* <i>prg_idx</i>: the index of the paragraph in the page</div>
<div>* <i>sentence_idx</i>: the index of the statement in the paragraph</div>
<div>* <i>statement</i>: the statement text</div>
<div>* <i>citations</i>: the sources cited in the statement</div>
<div>* <i>vote1</i>: first Mechanical Turk judgment</div>
<div>* <i>vote2</i>: second Mechanical Turk judgment</div>
<div>* <i>vote3</i>: third Mechanical Turk judgment</div>
<div><br></div>
<div>The numbers in the last three fields (<i>vote1</i>, <i>vote2</i>, <i>vote3</i>) correspond to the following citation reasons:</div>
<div>
<div>1='direct quotation'</div>
<div>2='statistics'</div>
<div>3='controversial'</div>
<div>4='opinion'</div>
<div>5='life'</div>
<div>6='scientific'</div>
<div>7='historical'</div>
<div>8='other'</div>
</div>
Wikipedia articles;citations;citation needed;dataset;crowdsourcing;mechanical turk;Computer-Human Interaction;Knowledge Representation and Machine Learning
2019-02-22
    https://figshare.com/articles/dataset/Citation_Reason_Dataset/7756226
10.6084/m9.figshare.7756226.v1
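
The field layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset release. It assumes that the file has no header row, that fields are not quoted, and that the three vote fields hold the integer codes 1 to 8. The helper names `parse_rows` and `majority_reason` are hypothetical, and the majority-vote aggregation is only one possible way to combine the three judgments. The sample row is synthetic and is not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Column order as listed in the dataset description.
FIELDS = [
    "entity_id", "revision_id", "timestamp", "entity_title",
    "section_id", "section", "prg_idx", "sentence_idx",
    "statement", "citations", "vote1", "vote2", "vote3",
]

# Numeric codes used in vote1..vote3, as given in the description.
REASONS = {
    1: "direct quotation", 2: "statistics", 3: "controversial",
    4: "opinion", 5: "life", 6: "scientific", 7: "historical", 8: "other",
}


def parse_rows(handle):
    """Yield one dict per well-formed line of the tab-separated file.

    Assumption: no header row and no quoting. Lines with an unexpected
    number of fields or non-numeric votes are skipped.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for values in reader:
        if len(values) != len(FIELDS):
            continue
        row = dict(zip(FIELDS, values))
        try:
            codes = [int(row[key]) for key in ("vote1", "vote2", "vote3")]
        except ValueError:
            continue
        row["votes"] = [REASONS.get(code, "unknown") for code in codes]
        yield row


def majority_reason(votes):
    """Return the reason chosen by at least two annotators, else None."""
    reason, count = Counter(votes).most_common(1)[0]
    return reason if count >= 2 else None


# Synthetic example row (not real data), with an empty citations field.
sample = "Q1\t123\t2019-01-01\tExample\t0\tIntro\t0\t1\tSome statement.\t\t2\t2\t7\n"
for row in parse_rows(io.StringIO(sample)):
    print(row["statement"], row["votes"], majority_reason(row["votes"]))
```

To read the real file, pass an open file object instead of the in-memory sample, for example `open(path, encoding="utf-8", newline="")`, where `path` is wherever the downloaded file was saved.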