What is this?
These are the Gerrit code review datasets published with the MSR 2016 data track paper "Mining the Modern Code Review Repositories: A Dataset of People, Process and Product". We extracted all of the data from Gerrit servers using our mining scripts, which are based on the official REST API. The data is stored in relational database (MySQL) format.
Target Projects
We exported the database objects to self-contained files for each project. All .sql files are available here:
- OpenStack
- LibreOffice
- rar, last updated 2016/11/17 (new DB)
- rar, 7z, last updated 2015/05/30
- AOSP
- Qt (WARNING: this dataset is incomplete because of an issue with the official Qt Gerrit API)
- Eclipse
- rar, last updated 2016/11/17 (new DB)
- rar, 7z, last updated 2015/06/01
- GerritHub
- rar, 7z, last updated 2015/04/23
- rar, last updated 2016/11/17 (new DB)
Notice: To protect the privacy of developers, we have anonymized all developer usernames and email addresses.
@InProceedings{Yang2016MSR,
Author = {Yang, Xin and Kula, Raula Gaikovina and Yoshida, Norihiro and Iida, Hajimu},
Title = {Mining the Modern Code Review Repositories: A Dataset of People, Process and Product},
BookTitle = {Proceedings of the 13th International Conference on Mining Software Repositories},
Pages = {460--463},
Year = {2016}
}
Documentation
Our wiki pages describe the database schema in detail, how to query it using SQL, and how to obtain the source code.
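As a rough illustration of the kind of SQL query one might run, here is a minimal Python sketch. It is an assumption-laden toy: the table `review_change`, its columns, and the sample rows are invented for this example and are not the dataset's real schema (see the wiki for that), and an in-memory SQLite database stands in for MySQL so the snippet runs without a server.

```python
import sqlite3

# Hypothetical stand-in schema; the real table and column names are in the wiki.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE review_change (
    id INTEGER PRIMARY KEY,
    project TEXT,
    status TEXT
);
INSERT INTO review_change (project, status) VALUES
    ('demo/project-a', 'MERGED'),
    ('demo/project-a', 'ABANDONED'),
    ('demo/project-b', 'MERGED');
""")


def count_by_status(connection, status):
    """Return (project, change count) pairs for changes with the given status."""
    cursor = connection.execute(
        "SELECT project, COUNT(*) FROM review_change "
        "WHERE status = ? GROUP BY project ORDER BY project",
        (status,),
    )
    return cursor.fetchall()


merged = count_by_status(conn, "MERGED")
print(merged)
```

Against the imported MySQL data, the same pattern should carry over with a MySQL driver in place of `sqlite3`; note that common MySQL drivers for Python use `%s` rather than `?` as the parameter placeholder.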
Mining Scripts
The mining scripts can be found here; you can run or modify them to build your own dataset.
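For readers who want a feel for what mining the Gerrit REST API involves, the sketch below shows one detail that often trips people up: Gerrit prefixes its JSON responses with the line `)]}'`, which must be stripped before parsing. This is not the authors' mining code, only a minimal illustration; the helper names and the `review.example.org` host are placeholders chosen for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Gerrit prepends this line to JSON responses to guard against XSSI attacks.
XSSI_PREFIX = ")]}'"


def build_changes_url(base_url, query, limit=25, start=0):
    """Build a /changes/ query URL (q = search query, n = limit, S = offset)."""
    params = urlencode({"q": query, "n": limit, "S": start})
    return f"{base_url.rstrip('/')}/changes/?{params}"


def parse_gerrit_json(body):
    """Strip the XSSI prefix, if present, and decode the remaining JSON."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def fetch_changes(base_url, query, limit=25, start=0):
    """Fetch one page of changes from a Gerrit server (requires network access)."""
    url = build_changes_url(base_url, query, limit, start)
    with urlopen(url) as response:
        return parse_gerrit_json(response.read().decode("utf-8"))


# Offline demonstration with a canned response body.
sample = ")]}'\n[{\"_number\": 1, \"status\": \"MERGED\"}]"
print(build_changes_url("https://review.example.org/", "status:merged", limit=2))
print(parse_gerrit_json(sample))
```

A full crawl would call `fetch_changes` repeatedly with an increasing `start` offset and store each page; rate limits and authentication vary by server and are left out here.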
Active Members
- Xin Yang, Research Fellow (Osaka University, Japan)
- Raula Gaikovina Kula, Assistant Professor (Osaka University, Japan)
- Norihiro Yoshida, Associate Professor (Nagoya University, Japan)
- Hajimu Iida, Professor (NAIST, Japan)
Current Work
- List of Publications
- Research Home (Outdated since 2014)
Contacts
If you have any questions, please contact us (xinyang [at] ist.osaka-u.ac.jp).