10.6084/m9.figshare.7751027.v1
Miriam Redi
Miriam
Redi
Jonathan Morgan
Jonathan
Morgan
Dario Taraborelli
Dario
Taraborelli
Summaries of Policies and Rules for Adding Citations to Wikipedia
figshare
2019
Wikipedia pages
citations
Machine Learning Techniques
Clustering techniques
Word Embeddings
Knowledge Representation and Machine Learning
Library and Information Studies
2019-02-22 07:58:32
Dataset
https://figshare.com/articles/dataset/Summaries_of_Policies_and_Rules_for_Adding_Citations_to_Wikipedia/7751027
<div>This fileset contains supplementary material for the paper "Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability".</div><div><br></div><div>We share here two files related to the qualitative analysis of the reasons why editors add citations to Wikipedia.</div><div><b><br></b></div><div><b>Citation Needed Policy Summary</b> (Citation_Needed_Policy_Summary.pdf)</div><div>We qualitatively analyzed the policies that editors of English, Italian, and French Wikipedia follow when adding (or not adding) inline citations, categorized them into macro-classes, and summarized them in this document.</div><div><br></div><div><br></div><div><b>Citation Needed Reason Clusters</b> (Citation_Needed_Reason_Clusters.pdf)</div><div>When adding the {citation needed} template, editors can also specify a reason via a free-form text field. We extracted the text of this field from more than 200,000 citation needed tags added by English Wikipedia editors, converted each text into a numerical feature vector using fastText [1], and clustered the resulting vectors. Each cluster groups consistent reasons why editors requested a citation.</div><div><br></div><div>[1] https://fasttext.cc/</div>
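The embed-then-cluster step described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the dataset description: a tiny hand-written table of 2-d word vectors (`TOY_VECTORS`) stands in for fastText embeddings, each reason text is embedded by averaging its word vectors, and k-means is used as the clustering algorithm, since the description does not say which one the authors applied. The example reasons are invented for illustration. Unlike fastText, the toy lookup has no subword handling and simply skips unknown words.

```python
import math

# Hypothetical 2-d word vectors standing in for fastText embeddings.
TOY_VECTORS = {
    "quote": (1.0, 0.1),
    "quotation": (0.9, 0.2),
    "statistics": (0.1, 1.0),
    "numbers": (0.2, 0.9),
}


def embed(text, vectors=TOY_VECTORS):
    """Average the vectors of known words; return None if no word is known."""
    hits = [vectors[w] for w in text.lower().split() if w in vectors]
    if not hits:
        return None
    dim = len(hits[0])
    return tuple(sum(v[i] for v in hits) / len(hits) for i in range(dim))


def kmeans(points, k, iterations=20):
    """Plain k-means using the first k points as initial centroids."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


# Invented free-text reasons, for illustration only.
reasons = [
    "direct quote",
    "needs statistics",
    "quotation unsourced",
    "numbers unsupported",
]
points = [embed(r) for r in reasons]
labels = kmeans(points, k=2)

for reason, label in zip(reasons, labels):
    print(label, reason)
```

With real data, the toy table would be replaced by vectors from a trained fastText model, and the number of clusters would need to be chosen empirically.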