Replication package of the paper: "On the Rise of Modern Software Documentation"
This is an anonymized version of the DwarvenMail project repository and its datasets, provided as the replication package for the paper "On the Rise of Modern Software Documentation".
Modules
`annotator` module includes the source code of the manual annotator web application.
`dwarvenmail` module includes the source code of the scraper application.
Data
`annotator/data` includes tags found with manual annotation.
`binary_models` includes the latest version of the fully scraped model as a pickled object.
`charts` includes all the charts used in the paper, plus some additional ones.
`data` includes the two input datasets (preliminary and final) as taken from GHSearch.
`export` includes different files of exported stats in csv format.
`manual_annotation` includes the output files from the manual annotation process (also used by the annotator module).
`models` is a placeholder folder for the expanded version of the scraped model, which is produced when the multiprocess scraper is run with at least one valid GitHub API token.
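As a minimal sketch of how the pickled model and the CSV exports listed above might be inspected, the helpers below use only the Python standard library. The function names are hypothetical and not part of the repository, and no file names are assumed: pass whatever paths you find in `binary_models` and `export`. Unpickling needs the classes of the pickled object to be importable, so this would most likely have to be run from an environment where the `dwarvenmail` module is installed; also, only unpickle files you trust.

```python
import csv
import pickle
from pathlib import Path


def load_pickled_model(path):
    """Load a pickled object from `path` and return it.

    Hypothetical helper: the classes stored in the pickle must be
    importable in the current environment, or pickle raises an error.
    """
    with open(path, "rb") as fh:
        return pickle.load(fh)


def read_csv_rows(path):
    """Return the rows of a CSV file as a list of dicts keyed by header."""
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


def find_files(folder, pattern):
    """Return the sorted paths in `folder` that match the glob `pattern`."""
    return sorted(Path(folder).glob(pattern))
```

For example, `find_files("export", "*.csv")` would list the exported CSV files, and each returned path could then be passed to `read_csv_rows`. The column names in those files are not described here, so check the header of each file before relying on specific keys.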
Further Information
The top-level `README.md` includes additional information on the scraper and the annotator, as well as installation and usage instructions.