10.1184/R1/5549281.v1
John Kitchin
John
Kitchin
Ana Van Gulick
Ana
Van Gulick
Lisa Zilinski
Lisa
Zilinski
Automating Data Sharing Through Authoring Tools
Carnegie Mellon University
2017
data sharing
embedding
org-mode
authoring
2017-10-29 16:27:32
Dataset
https://kilthub.cmu.edu/articles/dataset/Automating_Data_Sharing_Through_Authoring_Tools/5549281
In the current scientific publishing landscape, there is a need for an authoring workflow that easily integrates data and code into manuscripts and enables that data and code to be published in reusable form. Automated embedding of data and code into published output will improve both communication and data archiving. In this work, we demonstrate a proof of concept for such a workflow, based on org-mode, which provides this authoring capability and workflow integration. We illustrate the concept with a series of examples of potential uses. First, we use data on citation counts to compute the h-index of an author, and show two code examples for calculating the h-index. The source for each example is automatically embedded in the PDF during export of the document. Second, we demonstrate how data can be embedded in image files, which are themselves embedded in the document. Finally, we show that metadata about the embedded files can be automatically included in the exported PDF and accessed by computer programs. In our customized export, we embedded metadata about the attached files in an Info field of the PDF. A computer program could parse this metadata to get a list of the embedded files and carry out analyses on them. Authoring tools such as Emacs + org-mode can greatly facilitate the integration of data and code into technical writing. These tools can also automate the embedding of data into document formats intended for consumption.
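
As an illustration of the h-index calculation mentioned in the abstract, the following is a minimal Python sketch. It is not the code embedded in the dataset; the function name h_index and the sample citation counts are hypothetical and chosen only for illustration. It uses the standard definition: the h-index is the largest h such that h papers each have at least h citations.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Counts only decrease from here, so no larger h is possible.
            break
    return h


# Hypothetical citation counts, for illustration only (not from the dataset).
example_counts = [10, 8, 5, 4, 3, 0]
print(h_index(example_counts))
```

In an org-mode document, a block like this would typically live in a source block next to the citation data it reads, so that the data and the code are exported together with the text.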