“Magically” creating and updating research articles from experiments

Presentation posted on 2024-03-04, 09:41, authored by eRNZ Admin, Mark Gahegan, Ben Adams, Gus Ellerm

When all steps in the process of analysis (from data discovery and integration, to method development and revision, to application in a given domain, to final publication) are made reproducible and transparent, we close the gap between doing research and communicating research. However, it is estimated that up to 80% of published research cannot be reproduced or reused because of missing, incorrect or inaccessible information. This is shocking, bearing in mind that much research is now conducted almost entirely in silico.

In theory we should be able to capture and manage all the relevant details, because they all exist somewhere on our servers! Traditional research articles are fossilized objects that often contain errors and ambiguities and are completely disconnected from the originating scientific process. They are often out of date before they appear in print, or quickly become so when new datasets or better methods are developed. By leveraging integrations between maturing eScience technologies that were not previously possible, we are able to approach publication in more innovative ways. What if research articles could be authored differently? What if writing the article caused the experiments to be conducted? What if conducting the experiments caused the article to be written? Both of these approaches are fast becoming possible.

Our talk here considers the second of these questions: automatically creating and updating research articles from experiments. Using Globus and Gladier (https://github.com/globus-gladier/gladier) in collaboration with Argonne Labs, we have created a workflow system that supports reactive, dependency-based computations to facilitate truth maintenance when changes are made to the data or methods. We have containerised these workflows using the Research Object Crate specification (RO-Crate: https://www.researchobject.org/ro-crate/). A series of nested RO-Crates contains descriptions of the experiment at progressive levels of abstraction/detail. The highest RO-Crate container is not the entire workflow, but the research article itself. The article is thus fundamentally bound to the experiment and in large part created from it. The article then becomes the final abstraction of the experiment into the familiar form of a published paper.
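To make the idea of nested RO-Crates a little more concrete, here is a minimal sketch in Python of what two levels of nesting might look like as RO-Crate 1.1 metadata. It is not the authors' implementation: the helper make_crate, the file names (article.md, workflow/, analysis.py, cases.csv) and the choice to link the inner crate as a Dataset under hasPart are all assumptions made for illustration. The sketch also omits properties that the specification expects on a root dataset, such as datePublished and license.

```python
import json

RO_CRATE_CONTEXT = "https://w3id.org/ro/crate/1.1/context"
RO_CRATE_PROFILE = "https://w3id.org/ro/crate/1.1"


def make_crate(name, description, parts):
    """Return a minimal RO-Crate metadata document as a dict (hypothetical helper).

    parts is a list of (path, entity_type, label) tuples describing
    the files or sub-directories contained in the crate.
    """
    # Metadata descriptor: identifies this JSON-LD file and the root dataset.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": RO_CRATE_PROFILE},
        "about": {"@id": "./"},
    }
    # Root dataset: the crate itself, listing its parts by reference.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "hasPart": [{"@id": path} for path, _, _ in parts],
    }
    # One entity per part, giving its type and a human-readable name.
    entities = [
        {"@id": path, "@type": entity_type, "name": label}
        for path, entity_type, label in parts
    ]
    return {"@context": RO_CRATE_CONTEXT, "@graph": [descriptor, root, *entities]}


# Inner crate (assumed layout): would be saved as workflow/ro-crate-metadata.json.
workflow_crate = make_crate(
    "Analysis workflow",
    "Code and data for the experiment.",
    [("analysis.py", "File", "Analysis script"), ("cases.csv", "File", "Input data")],
)

# Outer crate: the article is the top-level container and points at the workflow.
article_crate = make_crate(
    "Live research article",
    "Article generated from the nested workflow crate.",
    [("article.md", "File", "Article text"), ("workflow/", "Dataset", "Analysis workflow")],
)

print(json.dumps(article_crate, indent=2))
```

In this layout the outer crate stays small and readable, while each nested directory carries its own metadata file describing the next level of detail.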

The talk will address two questions:

1. How can we link together all of the steps in an analysis workflow so that an experiment remains live, that is: reactive to changes in both data and methods? If we can achieve this, it becomes possible to write papers that can update themselves when better data and methods become available. Think for example of research that describes the state of a pandemic, or the impacts of climate change on coastal communities. Such papers could in theory stay up-to-date and relevant even after publication because they are dynamically created from these changing resources.

2. To what extent can a research article be written and updated from such a workflow? Figures, tables and code are relatively easy to update, but what about descriptions of code, of methods, of data? What about a literature review, results and conclusions: can they also be updated? We will provide details of our architecture and examples of its use, and show a case study of a live research article that is created and maintained using these ideas.
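The two questions above can be illustrated together with a toy sketch in Python. It shows a small dependency graph in which changing an input marks everything downstream as stale, so that a sentence of article text is regenerated the next time it is requested. This is only an illustration of reactive, dependency-based recomputation in general. It does not use Globus or Gladier, and the class ReactiveGraph, the node names and the case counts are all hypothetical.

```python
class ReactiveGraph:
    """Toy dependency graph (hypothetical): derived nodes are recomputed lazily when stale."""

    def __init__(self):
        self._values = {}    # node name -> current value
        self._rules = {}     # derived node name -> (function, dependency names)
        self._stale = set()  # derived nodes whose value must be recomputed

    def set_input(self, name, value):
        """Set or change an input and mark everything downstream as stale."""
        self._values[name] = value
        self._invalidate(name)

    def define(self, name, func, deps):
        """Declare a derived node computed by func from the named dependencies."""
        self._rules[name] = (func, tuple(deps))
        self._stale.add(name)
        self._invalidate(name)

    def _invalidate(self, changed):
        # Recursively mark every node that depends on the changed node.
        for name, (_, deps) in self._rules.items():
            if changed in deps and name not in self._stale:
                self._stale.add(name)
                self._invalidate(name)

    def get(self, name):
        """Return a node's value, recomputing it (and its dependencies) if stale."""
        if name in self._stale:
            func, deps = self._rules[name]
            self._values[name] = func(*(self.get(d) for d in deps))
            self._stale.discard(name)
        return self._values[name]


graph = ReactiveGraph()
graph.set_input("case_counts", [120, 135, 150])  # made-up illustrative data
graph.define("mean_cases", lambda xs: sum(xs) / len(xs), ["case_counts"])
graph.define(
    "results_sentence",
    lambda m: f"The mean daily case count was {m:.1f}.",
    ["mean_cases"],
)

first = graph.get("results_sentence")
graph.set_input("case_counts", [120, 135, 150, 195])  # new data arrives
second = graph.get("results_sentence")                # text is regenerated
print(first)
print(second)
```

Templated sentences like this cover only the easier end of the second question, alongside figures and tables. Descriptions of methods, literature reviews and conclusions are not addressed by this sketch.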

ABOUT THE AUTHORS
Mark Gahegan is Professor in Computer Science at the University of Auckland, where he also directs the Centre for eResearch. He is PI of ‘Beyond Prediction…’, a large, 7-year Data Science Programme Grant from MBIE. His research interests are in eScience, GIScience, Data Science and all points in between.


Gus Ellerm is a PhD student in Computer Science at the University of Canterbury, studying research workflows and their role in supporting live publications. Gus leads the implementation of the work reported here, is funded via the above MBIE grant and is supervised by Ben Adams and Mark Gahegan. Gus has recently presented his work at the IEEE eScience’23 conference.

Ben Adams is Associate Professor of Computer Science and Software Engineering at the University of Canterbury. His research interests revolve around new ways to use computing technology to help advance human understanding of our environment and world, drawing from data science, spatial science and cognitive science.

For more information about eResearch NZ / eRangahau Aotearoa, visit:
https://eresearchnz.co.nz/
