Citations with contexts in Wikipedia
Aaron Halfaker
Meen Chul Kim
Andrea Forte
Dario Taraborelli
10.6084/m9.figshare.5588842.v1
https://figshare.com/articles/dataset/Citations_with_contexts_in_Wikipedia/5588842
<div>This dataset provides <b>structured metadata and contextual information about references added to Wikipedia articles</b>, in JSON format.</div><div><br></div><div>Each record represents an individual Wikipedia article revision with all the &lt;ref&gt; tags parsed, as stored in Wikipedia's XML dumps, and includes information about:</div><div><br></div><div>1) the context(s) in which the reference occurs within the article, such as the surrounding text, parent section title, and section level;</div><div><br></div><div>2) structured data and bibliographic metadata included within the reference itself, such as any citation template used, external links, and any known persistent identifiers;</div><div><br></div><div>3) additional data and metadata about the reference itself: the reference name, its raw content, and, if applicable, the revision ID associated with the reference's addition, deletion, or change.</div><div><br></div><div>The data is available as a set of compressed JSON files, extracted from the July 1, 2017 XML dump of English Wikipedia. Other languages may be added to this dataset in the future.</div><div><br></div><div>The JSON schema and the Python parsing libraries used to generate the data are listed in the references.</div>
2017-12-01 22:36:31
Wikipedia
Citations
References
bibliography data
altmetrics
text processing
XML
JSON
Library and Information Studies
Computer-Human Interaction
Digital Humanities
Educational Technology and Computing