A Large Scale Study of Wikipedia Users' Perceived Quality of Experience

This dataset is the result of joint Web QoE research conducted by researchers at Telecom Paris and the Wikimedia Foundation (https://meta.wikimedia.org/wiki/Research:Study_of_performance_perception_on_Wikimedia_projects, https://webqoe.telecom-paristech.fr/).

Both datasets contain user answers to a survey question displayed on the French and Russian Wikipedias between 25/05/2018 and 15/10/2018. The first file ("wikiqoe_datetime.csv") contains three columns, which indicate, for every record, respectively:

- the wiki of the page for which the user's survey answer was collected (either French or Russian),

- the time rounded to the hour (in order to prevent user deanonymization) as a datetime object,

- the survey response provided by the user (positive = 1, negative = -1, neutral = 0).
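
As a minimal sketch of how the three columns described above could be read, the Python snippet below parses records by position and counts survey responses per wiki. This README does not give the header names or the timestamp format, so the header row, the "%Y-%m-%d %H:%M:%S" format, the function names, and the sample rows are all assumptions for illustration; the sample rows are synthetic and are not taken from the dataset.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Synthetic rows for illustration only (NOT real data). The header names
# and the timestamp format are assumptions; adjust them to the actual file.
SAMPLE = """wiki,time,response
frwiki,2018-05-25 10:00:00,1
ruwiki,2018-05-25 10:00:00,-1
frwiki,2018-05-25 11:00:00,0
frwiki,2018-06-01 09:00:00,1
"""


def read_datetime_records(lines, has_header=True,
                          time_format="%Y-%m-%d %H:%M:%S"):
    """Parse rows positionally as (wiki, hour-rounded datetime, response)."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the (assumed) header row
    records = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        wiki, when, response = row[0], row[1], row[2]
        records.append((wiki,
                        datetime.strptime(when, time_format),
                        int(float(response))))
    return records


def response_counts_by_wiki(records):
    """Count responses (1, 0, -1) separately for each wiki."""
    counts = {}
    for wiki, _, response in records:
        counts.setdefault(wiki, Counter())[response] += 1
    return counts


records = read_datetime_records(io.StringIO(SAMPLE))
print(response_counts_by_wiki(records))
```

To run this on the real file, pass an open file handle instead of the in-memory sample, for example open("wikiqoe_datetime.csv", newline=""), after checking the actual header and timestamp format.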


The second file ("wikiqoe_public_available_features.csv") contains 20 columns, which indicate, for every record, respectively:

- the wiki from which the request was issued (ruwiki or frwiki),

- 18 performance metrics (e.g., fetchStart, domInteractive, etc.),

- the survey response provided by the user (positive = 1, negative = -1, neutral = 0).
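
The column layout described above (wiki first, 18 performance metrics, survey response last) lends itself to simple per-response summaries. The sketch below computes the median of one metric for each response value. It assumes that the file has a header row naming the metrics and that the response is the last column; the function name and the sample rows are hypothetical, and the sample shows only two of the 18 metrics with synthetic values.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Synthetic rows for illustration only (NOT real data); only two of the
# 18 metrics are shown, and the header names are assumptions.
SAMPLE = """wiki,fetchStart,domInteractive,response
ruwiki,5,800,1
frwiki,7,1200,1
frwiki,9,3000,-1
ruwiki,4,1000,0
"""


def median_metric_by_response(lines, metric):
    """Return {response: median of `metric`}, skipping empty metric values."""
    reader = csv.reader(lines)
    header = next(reader)
    idx = header.index(metric)  # raises ValueError if the metric is absent
    groups = defaultdict(list)
    for row in reader:
        if not row or row[idx] == "":
            continue  # ignore blank lines and missing measurements
        response = int(float(row[-1]))  # response assumed to be last column
        groups[response].append(float(row[idx]))
    return {resp: median(vals) for resp, vals in groups.items()}


print(median_metric_by_response(io.StringIO(SAMPLE), "domInteractive"))
```

The median is used here rather than the mean because timing metrics tend to have long tails; this is a design choice for the sketch, not something prescribed by the dataset.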


Note that the two datasets cover the same set of users, but the second one omits the time information in order to avoid potential deanonymization through linkage with other (unknown) datasets.


The Wikimedia legal team has given clearance for the publication of these datasets, after having fully prevented user deanonymization and content-linkability.


More information and details regarding methodology and results can be found in the papers and in the technical report stored here: https://webqoe.telecom-paristech.fr/

If you use these datasets in your research, please cite the appropriate paper(s):
1)
@inproceedings{salutari19www,
  title = {A large-scale study of Wikipedia users’ quality of experience},
  author = {Salutari, Flavia and Hora, Diego Da and Dubuc, Gilles and Rossi, Dario},
  booktitle = {In proceedings of the 30th Web Conference (WWW'19)},
  address = {San Francisco, CA, USA},
  month = may,
  year = {2019},
  howpublished = {https://nonsns.github.io/paper/rossi19www.pdf}
}