Wikimedia Research Showcase July 2023.pdf (4.01 MB)

Multilingual approaches to support knowledge integrity in Wikipedia - Wikimedia Research Showcase - July 2023

presentation
posted on 2023-07-20, 09:50 authored by Pablo Aragón, Diego Saez-Trumper

Slides from the July 2023 Wikimedia Research Showcase on Improving knowledge integrity in Wikimedia projects.


Multilingual approaches to support knowledge integrity in Wikipedia
Knowledge integrity in Wikipedia is key to ensuring the quality and reliability of information. For that reason, editors devote a substantial amount of their time to patrolling tasks in order to detect low-quality or misleading content. In this talk, we will cover recent multilingual approaches to support knowledge integrity. First, we will present a novel design of a system aimed at assisting the Wikipedia communities in addressing vandalism. This system was built by collecting a massive dataset covering multiple languages and then applying advanced filtering and feature engineering techniques, including multilingual masked language modeling, to build the training dataset from human-generated data. Second, we will showcase the Wikipedia Knowledge Integrity Risk Observatory, a dashboard that relies on a language-agnostic version of the former system to monitor high-risk content in hundreds of Wikipedia language editions. We will conclude with a discussion of different challenges to be addressed in future work.
