figshare and Monash University: combining cloud management and discoverability with institutional storage

Groenewegen, David; Hahnel, Mark

doi:10.6084/m9.figshare.1224755.v1

figshare and Monash University.pptx (32.73 MB)


presentation

posted on 2014-11-04, 02:50 authored by David Groenewegen, Mark Hahnel

Management, publication and collaboration around research data are key to the development of scholarly communication and are quickly becoming a requirement as more and more funders see the benefits of open research. However, effective data management is difficult due to the heterogeneity of use cases and is often restricted further by the limitations of research data management software. Technology for data management is continually evolving and is helping researchers to communicate both positive and negative research results more openly, share data that has previously been locked away on desktops, and develop research objects as a publishing output in their own right. figshare is at the forefront of developing software for research data management and collaboration and of the drive towards open data, as well as being one of a myriad of new technology companies operating in the research area. With many years' experience in research data management, Monash University is seeking to balance the efficiencies that come from a well-managed portfolio of standardised tools with researchers’ requirements for flexible and innovative solutions that support research and do not force a ‘one size fits all’ approach.

For several years Monash University has been employing the Data Curation Continuum [1,2] as a conceptual model to understand the nature of research data workflows, and to put in place infrastructure that will enhance and enable these workflows. The Data Curation Continuum separates the process of research data management into three distinct domains - Private, Shared and Public. These domains are separated by “curation boundaries”, which are virtual decision points at which the creators of data decide what they will share, with whom, with what metadata and under what conditions. Traditionally, these boundaries have been crossed using some form of manual intervention - data was moved from one repository to another that is more open, or metadata was created only at the point where it was needed to cross the boundary. This process was generally time-consuming and inefficient. Consequently, data only moved into the public domain when there was an exceptionally strong reason to do so, such as funder requirements or peer expectations [3]. Monash already offers some technical solutions that make this easier, most notably myTardis [4]. However, these solutions have primarily supported high-end users of complicated data-creating instruments, such as the Australian Synchrotron.
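The three domains and the decisions taken at each curation boundary can be pictured with a small data model. The following is only an illustrative sketch, not part of the Monash or figshare systems: the names `Domain`, `BoundaryDecision`, `DataObject` and `cross_boundary` are hypothetical, and the rule that some metadata must be supplied before a boundary is crossed is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Domain(IntEnum):
    """The three domains of the Data Curation Continuum, ordered by openness."""
    PRIVATE = 1
    SHARED = 2
    PUBLIC = 3


@dataclass
class BoundaryDecision:
    """Hypothetical record of the choices a data creator makes at a boundary."""
    audience: list[str]        # with whom the data will be shared
    metadata: dict[str, str]   # metadata supplied at this boundary
    conditions: str            # conditions of sharing, e.g. a licence


@dataclass
class DataObject:
    name: str
    domain: Domain = Domain.PRIVATE
    history: list[tuple[Domain, BoundaryDecision]] = field(default_factory=list)

    def cross_boundary(self, decision: BoundaryDecision) -> Domain:
        """Move one domain towards Public and record the decision taken."""
        if self.domain is Domain.PUBLIC:
            raise ValueError("already in the Public domain")
        # Assumption for this sketch: crossing requires some metadata.
        if not decision.metadata:
            raise ValueError("metadata is required to cross a curation boundary")
        self.domain = Domain(self.domain + 1)
        self.history.append((self.domain, decision))
        return self.domain
```

In this sketch a data object starts in the Private domain and each call to `cross_boundary` moves it one step towards Public, keeping a record of who it was shared with, under what conditions and with what metadata.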

Monash wanted a system supporting research data management that was more convenient and easier to use than existing practices, to enable data curation for a wider range of researchers. The new system needed to make it harder to do the “wrong” thing than the right, and to be an obviously better way of working than what researchers were already doing. The system needed to help create metadata at the points when it was relevant, make it easy for the researcher to control sharing in a way that made sense to them, and make publishing of data a one-click process. The system also needed to offer the flexibility of the cloud with the confidence that comes with ongoing institutional storage.

Here we describe how figshare and Monash have been working together to combine cloud-based management and discoverability with institutional storage. The combined expertise has led to innovative and user-friendly interfaces which help encourage the storage of data within a repository (the Private Domain), enable controlled collaboration within that repository (the Shared Domain) and make the data publishable and sharable (the Public Domain) in a streamlined and automated fashion, minimising the human intervention required to move across boundaries. These developments combine the previous experience of the Monash research data management team in curating data and incentivising researchers to do the same with the exploitation of new web-native technology that the figshare interface allows.

The paper will demonstrate how this system works and the researcher workflows that it enables. The paper will also discuss the challenges of combining a cloud-based discovery and management system with locally hosted storage, including such issues as metadata standards, authentication / authorisation and branding.

REFERENCES

[1] Treloar A, Harboe-Ree C, “Data management and the curation continuum: how the Monash experience is informing repository relationships” Proceedings of VALA 2008

[2] Treloar A, Groenewegen D, Harboe-Ree C, “The data curation continuum: Managing data objects in institutional repositories” D-Lib Magazine 13 (9), 4

[3] Australian National Data Service, Curation Continuum, http://ands.org.au/guides/curation.continuum.html, accessed 7 February 2014

[4] Androulakis S, Bertling P, Groenewegen D, Harrison A, “Breaking down the boundaries to storing, sharing and publishing research data” Open Repositories 2014