10.6084/m9.figshare.7751027.v1 Miriam Redi Jonathan Morgan Dario Taraborelli Summaries of Policies and Rules for Adding Citations to Wikipedia figshare 2019 Wikipedia pages citations Machine Learning Techniques Clustering techniques Word Embeddings Knowledge Representation and Machine Learning Library and Information Studies 2019-02-22 07:58:32 Dataset https://figshare.com/articles/dataset/Summaries_of_Policies_and_Rules_for_Adding_Citations_to_Wikipedia/7751027 <div>This fileset contains supplementary material for the paper "Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability".</div><div><br></div><div>We share two files from the qualitative analysis of the reasons why editors add citations to Wikipedia.</div><div><br></div><div><b>Citation Needed Policy Summary</b> (Citation_Needed_Policy_Summary.pdf)</div><div>We qualitatively analyzed the policies that editors of English, Italian, and French Wikipedia follow when adding (or not adding) inline citations, categorized them into macro-classes, and summarized them in this document.</div><div><br></div><div><b>Citation Needed Reason Clusters</b> (Citation_Needed_Reason_Clusters.pdf)</div><div>When adding the {citation needed} template, editors can also specify a reason in a free-form text field. We extracted the text of this field from more than 200,000 citation needed tags added by English Wikipedia editors, converted each text into a numerical feature vector using fastText [1], and then clustered the vectors. Each cluster groups together consistent reasons why editors requested a citation.</div><div><br></div><div>[1] https://fasttext.cc/</div>
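The embed-then-cluster step described for the reason clusters can be illustrated with a minimal sketch. It is not the authors' pipeline. To stay self-contained it replaces fastText with a toy hashed character n-gram embedding (fastText also builds on character n-grams, but with learned vectors), and it assumes k-means as the clustering algorithm, which the description does not specify. The names `embed`, `kmeans`, and `DIM`, the example reason strings, and the choice of two clusters are all hypothetical.

```python
import hashlib
import math
import random

DIM = 64  # hypothetical embedding size, chosen only for this sketch


def char_ngrams(text, n_min=3, n_max=5):
    """Return character n-grams of the lowercased text, padded with < and >."""
    padded = f"<{text.lower().strip()}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def embed(text, dim=DIM):
    """Toy stand-in for a fastText sentence vector: signed feature hashing
    of character n-grams, followed by L2 normalization."""
    vec = [0.0] * dim
    for gram in char_ngrams(text):
        h = int.from_bytes(hashlib.md5(gram.encode("utf-8")).digest()[:8], "big")
        sign = 1.0 if (h >> 63) == 0 else -1.0  # top bit picks the sign
        vec[h % dim] += sign                    # low bits pick the bucket
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (assumed here; the source does not name the algorithm).
    Returns one cluster label per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up example reasons, not taken from the dataset.
reasons = [
    "needs a reliable source",
    "needs a reliable source",
    "statistic without a reference",
    "direct quotation lacks attribution",
    "who said this?",
]
vectors = [embed(r) for r in reasons]
labels = kmeans(vectors, k=2)
for label, reason in zip(labels, reasons):
    print(label, reason)
```

With real data, `embed` would be swapped for vectors from a trained fastText model, and the number of clusters would need to be chosen by inspection or a validity measure. Identical reason texts always map to the same vector and therefore to the same cluster.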