Specimens as Research Objects: Reconciliation across Distributed Repositories to Enable Metadata Propagation

<div>Slides from a presentation at IEEE eScience 2018. Links to the full paper are given in the references section.</div><div><br></div><div><b>Abstract</b></div>Botanical specimens are shared as long-term consultable research objects in a global network of specimen repositories. Multiple specimens are generated from a single field collection event; these specimens are then managed individually in separate repositories and independently augmented with research and management metadata, which could be propagated to their duplicate peers. Establishing a data-derived network for metadata propagation will enable the reconciliation of closely related specimens that are currently dispersed, unconnected and managed independently. In a data mining exercise applied to an aggregated dataset of 19,827,998 specimen records from 292 separate specimen repositories, 7,102,710 specimens (36%) were assessed to participate in duplication relationships. These relationships allow metadata to be propagated among their participants, totalling 93,044 type citations, 1,121,865 georeferences, 1,097,168 images and 2,191,179 scientific name determinations. The results enable the creation of networks to identify which repositories could work in collaboration. Some classes of annotation (particularly those regarding scientific name determinations) represent units of scientific work: appropriate management of this data would allow scholarly credit to accrue to individual researchers. Potential further work in this area is discussed.<div><br></div>
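The idea of linking duplicates and propagating metadata between them can be illustrated with a minimal sketch. It is not the method used in the paper: it assumes, for illustration only, that duplicates can be found by exact match on a normalised collector name, collector number and collection date. The field names, the functions `event_key`, `group_duplicates` and `propagate`, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata fields that might be shared between duplicates.
FIELDS = ("type_citation", "georeference", "image", "determination")


def event_key(record):
    """Build an assumed collection-event key: collector, number, date."""
    return (
        record["collector"].strip().lower(),
        record["number"].strip(),
        record["date"],
    )


def group_duplicates(records):
    """Group records that share an event key; keep groups of two or more."""
    groups = defaultdict(list)
    for record in records:
        groups[event_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


def propagate(group, fields=FIELDS):
    """For each record, list the missing fields a duplicate peer could supply."""
    suggestions = {}
    for record in group:
        missing = {}
        for field in fields:
            if record.get(field) is None:
                donors = [
                    peer[field]
                    for peer in group
                    if peer is not record and peer.get(field) is not None
                ]
                if donors:
                    # Take the first available value; a real system would
                    # need a policy for conflicting values.
                    missing[field] = donors[0]
        if missing:
            suggestions[record["id"]] = missing
    return suggestions


# Invented sample records held in three hypothetical repositories.
records = [
    {"id": "A:1", "repository": "A", "collector": "Smith", "number": "123",
     "date": "1950-03-01", "georeference": (-1.5, 36.8), "determination": None},
    {"id": "B:9", "repository": "B", "collector": "smith ", "number": "123",
     "date": "1950-03-01", "georeference": None, "determination": "Acacia sp."},
    {"id": "C:4", "repository": "C", "collector": "Jones", "number": "7",
     "date": "1961-07-12", "georeference": None, "determination": None},
]

groups = group_duplicates(records)
suggestions = propagate(groups[0])
```

Here the records from repositories A and B fall into one group, so A could receive B's determination and B could receive A's georeference, while the record in C has no duplicate peer. Exact-match keys are a simplification; real specimen data would likely require fuzzier matching of collector names and dates.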