If you use the data in this collection for your research or analyses, cite the following papers:
- Pappalardo et al. (2019) A public data set of spatio-temporal match events in soccer competitions. Nature Scientific Data 6:236, https://www.nature.com/articles/s41597-019-0247-7
- Pappalardo et al. (2019) PlayeRank: Data-driven Performance Evaluation and Player Ranking in Soccer via a Machine Learning Approach. ACM Transactions on Intelligent Systems and Technology (TIST) 10, 5, Article 59 (September 2019), 27 pages. DOI: https://doi.org/10.1145/3343172
Soccer analytics is attracting increasing interest from academia and industry, thanks to the availability of sensing technologies that provide high-fidelity data streams extracted from every match. Unfortunately, these detailed data are owned by specialized companies and hence are rarely publicly available for scientific research. To fill this gap, we provide to the public the largest open collection of soccer-logs ever released, collected by Wyscout (https://wyscout.com/), containing all the spatio-temporal events (passes, shots, fouls, etc.) that occur during all matches of an entire season of seven competitions (La Liga, Serie A, Bundesliga, Premier League, Ligue 1, FIFA World Cup 2018, UEFA Euro Cup 2016). A match event contains information about its position, time, outcome, player and characteristics. This dataset has been used recently during the Soccer Data Challenge (https://sobigdata-soccerchallenge.it/) and, to the best of our knowledge, it is the largest public collection