
Replication package of the paper: "On the Rise of Modern Software Documentation"

modified on 2023-06-26, 13:54

This is an anonymized version of the DwarvenMail project repository and datasets, provided as the replication package for the paper "On the Rise of Modern Software Documentation".


The `annotator` module includes the source code of the manual annotation web application.

The `dwarvenmail` module includes the source code of the scraper application.


`annotator/data` includes tags found with manual annotation.

`binary_models` includes the latest version of the fully scraped model as a pickled object.

`charts` includes all the charts used in the paper, plus some additional ones.

`data` includes the two input datasets (preliminary and final) as taken from GHSearch.

`export` includes files of exported statistics in CSV format.

`manual_annotation` includes output files from the manual annotation process (also used by the `annotator` module).

`models` is a placeholder folder for the expanded version of the scraped model, which is produced if the multiprocess scraper is run with at least one valid GitHub API token.
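As an illustration of how the pickled model in `binary_models` might be inspected, here is a minimal Python sketch. It rests on assumptions not stated in this description: the file extensions (`.pkl`, `.pickle`) are guesses, and the helper names `find_candidate_files` and `load_pickled_model` are hypothetical and not part of the package. Unpickling also requires the classes of the pickled object to be importable, so the `dwarvenmail` module would most likely need to be on the Python path. Refer to the usage instructions shipped with the package for the authoritative procedure.

```python
import pickle
from pathlib import Path


def find_candidate_files(folder, suffixes=(".pkl", ".pickle")):
    """List files in `folder` whose extension suggests a pickle.

    The suffixes are an assumption; the actual file names in
    `binary_models` may differ.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix in suffixes
    )


def load_pickled_model(path):
    """Unpickle a single file and return the stored object.

    Unpickling can execute arbitrary code, so only load files
    from a source you trust.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

A typical session would call `find_candidate_files("binary_models")` to see what is available and then pass one of the returned paths to `load_pickled_model`.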

Further Information

The top level of the repository includes additional information on the scraper and the annotator, as well as installation and usage instructions.


“INSTINCT” (SNF Project No. 190113)