known_paid_editors.201803.tsv (64.29 kB)
Known Undisclosed Paid Editors (English Wikipedia)
dataset
posted on 2018-04-24, 14:17 authored by TonyBallioni, James Heilman, Brian Henry, Aaron Halfaker
This dataset contains a manually curated set of known undisclosed paid editor (UPE) accounts from Wikipedia. The set is not complete: an editor's absence from it does not mean that the editor is not a paid editor.
See also https://en.wikipedia.org/wiki/Wikipedia:Paid-contribution_disclosure
The dataset contains four columns:
- user_name: The username of the UPE
- case_page_name: The page name (title) of a page describing the case through which paid editing was discovered.
- type: One of three types of UPEs (described below)
- notes: Any notes that a dataset curator chose to include with the example.
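As a rough illustration of how the four columns might be read, here is a minimal Python sketch. It assumes the file is tab-separated with a header row that uses the column names listed above, which the description does not state outright. The helper names `read_upe_rows` and `count_by_type`, and the sample rows, are made up for the example and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = ["user_name", "case_page_name", "type", "notes"]


def read_upe_rows(handle):
    """Read tab-separated rows from a text handle into a list of dicts.

    Assumption: the first line is a header row containing the four
    column names from the description.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return list(reader)


def count_by_type(rows):
    """Count rows per value of the `type` column (values kept as strings)."""
    return Counter(row["type"] for row in rows)


# Made-up sample rows, used only to exercise the helpers.
sample = io.StringIO(
    "user_name\tcase_page_name\ttype\tnotes\n"
    "ExampleUserA\tExample case page\t1\t\n"
    "ExampleUserB\tExample case page\t3\tillustrative note\n"
)
rows = read_upe_rows(sample)
print(count_by_type(rows))
```

To run this against the real file, the in-memory sample could be swapped for `open("known_paid_editors.201803.tsv", encoding="utf-8", newline="")`. The UTF-8 encoding is also an assumption.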
Type | Description | Priority
Type 1 | User makes just over 10 minor edits. Stays quiet for a few days while waiting for autoconfirm (user right) to kick in (takes 4 days). Then creates a promotional article in one big edit, after which the account goes silent. | This is the main priority. These are present in the largest numbers and show the clearest pattern. They also cause the most damage to our shared brand.
Type 2 | User is an obvious newbie. Makes lots of mistakes. Often turns out to be internal staff. | Not a key priority. We already manage these cases fairly well, as they are often so obvious.
Type 3 | Undisclosed paid editor, but one who only moves on to a new account once their current account gets detected. | A serious problem. These will be harder to detect because there will be fewer of these cases, and a long time will need to pass before a pattern emerges.
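The Type 1 pattern is specific enough to sketch as a rule over an account's edit history. The Python below is only an illustration of that description, not the method the curators used. The `Edit` record, the `looks_like_type1` function, the upper bound on warm-up edits and the byte threshold for "one big edit" are all hypothetical. Only the 4-day wait and the figure of roughly 10 minor edits come from the table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

AUTOCONFIRM_AGE = timedelta(days=4)  # "takes 4 days" in the Type 1 description
MIN_WARMUP_EDITS = 10                # "just over 10 minor edits"
MAX_WARMUP_EDITS = 15                # assumed upper bound for "just over"
BIG_EDIT_BYTES = 2000                # assumed size of "one big edit"


@dataclass
class Edit:
    """Hypothetical edit record; not a format used by the dataset."""
    timestamp: datetime
    is_minor: bool
    bytes_added: int
    creates_page: bool


def looks_like_type1(registered, edits):
    """Return True if the history matches the Type 1 description.

    Checks: a short run of minor edits, then a large page creation at
    least AUTOCONFIRM_AGE after registration, then no further edits.
    """
    ordered = sorted(edits, key=lambda e: e.timestamp)
    big_creations = [
        i for i, e in enumerate(ordered)
        if e.creates_page and e.bytes_added >= BIG_EDIT_BYTES
    ]
    if not big_creations:
        return False
    first = big_creations[0]
    warmup, creation, after = ordered[:first], ordered[first], ordered[first + 1:]
    warmup_ok = (
        MIN_WARMUP_EDITS <= len(warmup) <= MAX_WARMUP_EDITS
        and all(e.is_minor for e in warmup)
    )
    waited = creation.timestamp - registered >= AUTOCONFIRM_AGE
    went_silent = not after
    return warmup_ok and waited and went_silent


# Made-up history: 11 minor edits on day one, one large creation on day five.
registered = datetime(2018, 1, 1)
warmup_edits = [
    Edit(registered + timedelta(hours=i + 1), True, 20, False) for i in range(11)
]
creation_edit = Edit(registered + timedelta(days=5), False, 5000, True)
print(looks_like_type1(registered, warmup_edits + [creation_edit]))
```

A rule like this would probably miss Type 2 and Type 3 accounts entirely, which fits the table's note that Type 3 cases need more time before any pattern shows up.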