Rev-rec Source - Code Reviews Dataset (SEAA2018)
Dataset posted on 08.06.2018 by Jakub Lipčák, Bruno Rossi
This dataset contains source code reviews of 51 projects mined from Gerrit (14 projects, ~133K pull requests) and GitHub (37 projects, ~159K pull requests). The dataset has been used in the upcoming article:
Lipcak, J., Rossi, B. (2018). A Large-Scale Study on Source Code Reviewer Recommendation, in 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA) 2018, IEEE.
Included files: 51 mined JSON files (zip), summary project list with descriptive statistics (pdf), readme file (md).
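As a first sanity check after downloading, it may help to list the JSON files in the archive and see how many top-level elements each one holds. The following is a minimal sketch only: the function name `summarize_archive` and the archive path are hypothetical, and the internal schema of the JSON files is not described here, so the code assumes nothing beyond each file being valid JSON.

```python
import json
import zipfile


def summarize_archive(zip_path):
    """Map each .json member of the archive to its number of top-level elements.

    Hypothetical helper: the layout of the real archive is an assumption.
    """
    summary = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip anything that is not a JSON file (e.g. a readme or a PDF).
            if not name.lower().endswith(".json"):
                continue
            with archive.open(name) as handle:
                data = json.load(handle)
            # A top-level list or object is counted by its length;
            # any other JSON value counts as a single element.
            summary[name] = len(data) if isinstance(data, (list, dict)) else 1
    return summary
```

Calling `summarize_archive` with the path to the downloaded zip file returns a dictionary that can be compared against the project list in the accompanying PDF, for example to confirm that 51 JSON files are present.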