Wikipedia Article Feedback corpus
This dataset contains the entire corpus of feedback submitted on the English, French and German Wikipedias during the Article Feedback v.5 (AFT) pilot. The pilot ran for a year, from March 2013 to March 2014, and collected 1,549,842 feedback messages across the three languages.
All feedback messages and their metadata (as described in this schema) are available in this dataset, with the exception of messages oversighted and/or deleted by the end of the pilot.
The corpus is released under the following licenses:
- CC BY-SA 3.0 for feedback messages
- CC0 for the associated metadata
Results from the pilot are discussed in: Halfaker, A., Keyes, O. and Taraborelli, D. (2013). Making peripheral participation legitimate: Reader engagement experiments in Wikipedia. CSCW '13: Proceedings of the 2013 Conference on Computer Supported Cooperative Work. DOI: 10.1145/2441776.2441872