Citations with contexts in Wikipedia

Authors: Aaron Halfaker, Meen Chul Kim, Andrea Forte, Dario Taraborelli
DOI: 10.6084/m9.figshare.5588842.v1
URL: https://figshare.com/articles/dataset/Citations_with_contexts_in_Wikipedia/5588842

This dataset represents structured metadata and contextual information about references added to Wikipedia articles, in JSON format.

Each record represents an individual Wikipedia article revision, as stored in Wikipedia's XML dumps, with all of its reference tags parsed. A record includes information about:

1) the context(s) in which the reference occurs within the article, such as the surrounding text, parent section title, and section level

2) structured data and bibliographic metadata included within the reference itself, such as any citation template used, external links, and any known persistent identifiers

3) additional data and metadata about the reference itself: the reference name, its raw content, and, if applicable, the revision ID associated with the reference's addition, deletion, or change

The data is available as a set of compressed JSON files, extracted from the July 1, 2017 XML dump of English Wikipedia. Other languages may be added to this dataset in the future.

The JSON schema and Python parsing libraries used to generate the data are in the references.

Date: 2017-12-01 22:36:31
Keywords: Wikipedia Citations References bibliography data altmetrics text processing XML JSON
Categories: Library and Information Studies; Computer-Human Interaction; Digital Humanities; Educational Technology and Computing
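Since the data ships as compressed JSON files, a reader might stream them along the following lines. This is only a minimal sketch under stated assumptions, not the authors' parsing library: it assumes each file holds one JSON object per line and is compressed with gzip or bzip2 (the record does not say which), and the helper names `iter_records` and `count_values` are hypothetical. Field names should be taken from the published JSON schema rather than guessed.

```python
import bz2
import gzip
import json
from collections import Counter
from pathlib import Path
from typing import Any, Dict, Iterable, Iterator

# Assumption: the compression format can be inferred from the file extension.
# Anything else is opened as plain text.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def iter_records(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of the file.

    Assumes a JSON-lines layout (one record per line); a file containing a
    single large JSON array would need json.load instead.
    """
    opener = _OPENERS.get(Path(path).suffix, open)
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_values(records: Iterable[Dict[str, Any]], field: str) -> Counter:
    """Count how often each value of `field` occurs, skipping records without it.

    `field` is whatever key the real schema defines; no key name is assumed here.
    Values must be hashable (strings, numbers) to be counted.
    """
    counts: Counter = Counter()
    for record in records:
        if field in record:
            counts[record[field]] += 1
    return counts
```

Streaming line by line keeps memory use flat regardless of file size, which matters for a full English Wikipedia dump. A call such as `count_values(iter_records(path), field)` would then tally one field across a file, with `path` and `field` supplied from the actual download and schema.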