Why the World Reads Wikipedia

Posted on 19.08.2019, 07:57 by Florian Lemmerich, Diego Sáez-Trumper, Robert West, Leila Zia
This project contains data for the paper:

Lemmerich, Florian, Diego Sáez-Trumper, Robert West, and Leila Zia. "Why the World Reads Wikipedia: Beyond English Speakers." Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (WSDM). ACM, 2019.

From the abstract:
As one of the Web's primary multilingual knowledge sources, Wikipedia is read by millions of people across the globe every day. Despite this global readership, little is known about why users read Wikipedia's various language editions. To bridge this gap, we conduct a comparative study by combining a large-scale survey of Wikipedia readers across 14 language editions with a log-based analysis of user activity. We proceed in three steps. First, we analyze the survey results to compare the prevalence of Wikipedia use cases across languages, discovering commonalities, but also substantial differences, among Wikipedia languages with respect to their usage. Second, we match survey responses to the respondents' traces in Wikipedia's server logs to characterize behavioral patterns associated with specific use cases, finding that distinctive patterns consistently mark certain use cases across language editions. Third, we show that certain Wikipedia use cases are more common in countries with certain socio-economic characteristics; e.g., in-depth reading of Wikipedia articles is substantially more common in countries with a low Human Development Index. These findings advance our understanding of reader motivations and behaviors across Wikipedia languages and have implications for Wikipedia editors and developers of Wikipedia and other Web technologies.

The data is described in the accompanying README and on this metadata page.

