World Cup of Flags dataset
We introduce a new dataset chronicling the World Cup of Flags, a competitive vexillology tournament held on Twitter. The dataset combines challenges arising from three angles. Firstly, the data is multi-relational, and analysis techniques must respect that structure; for instance, conclusions on prior probabilities must be drawn across one-to-many or many-to-many relations spanning several tables. Secondly, the data stems from a tournament composed of a group phase followed by a knockout phase; assessing the performance of a specific competitor requires accounting for the relative strength of its opponents, which must be gleaned from incomplete data: most flags will not meet most other flags in the tournament. Finally, this competition was held on Twitter; as a consequence, it spiraled completely out of control. An auxiliary contribution of this paper is the downright bizarre story of precisely how the World Cup of Flags unfolded, including ideological differences between vexillological, maximalist, and nationalist voting blocs, a takeover by a substantial wave of Zimbabwean Twitter personalities, and the involvement of both the Prime Minister and the Leader of the Opposition of Trinidad & Tobago. Hence, the World Cup of Flags dataset is a publicly available benchmark of noisy data on matches in a tournament structure familiar from many sports, and it also poses multi-relational data mining challenges.