Ingestion Workflow for CUHK Digital Repository on Islandora: Bringing to light the Chinese Rare Books and ETDs metadata in MARC

2017-06-29T07:37:40Z (GMT) by Louisa Lam, Jeff Liu
The CUHK Digital Repository, built on Islandora, houses a multitude of digital collections, including Chinese rare books and electronic theses and dissertations (ETDs), which pose similar challenges in metadata extraction. First, they are catalogued in standard MARC with romanized Chinese in tag 245, while the readable Chinese characters are in tag 880. The content of tag 880 must be swapped in to replace tag 245 to give researchers a friendlier display. Second, many of these titles are multi-volume sets catalogued as multiple bibliographic records linked to one item record. This many-to-one structure complicates the transformation from MARC to MODS. Third, the two collections comprise tens of thousands of volumes, so extracting metadata manually is time-consuming. Yet the complex nature of these collections makes it impossible to fully automate an ingestion process that handles both the metadata challenges and the image ingestion correctly on the first attempt. The Library Digital Services Team devised a new workflow, with newly developed simple tools embedded in it, to semi-automate the metadata transformation and image ingestion. This presentation covers how the workflow and the tools work to make ingestion more efficient, and the thought process behind them.
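The 880-for-245 swap described above can be illustrated with a minimal sketch. It is not the team's tool: it assumes a record is held as a plain list of `(tag, subfields)` pairs, and the function name `promote_880_title` is hypothetical. The one thing it relies on from MARC 21 is the linkage subfield `$6`, where a 245 carries a value such as `880-01` and its paired 880 carries `245-01` followed by a script code.

```python
from typing import Dict, List, Tuple

Subfields = List[Tuple[str, str]]   # [(subfield code, value), ...]
Field = Tuple[str, Subfields]       # (tag, subfields)


def _link(subfields: Subfields) -> str:
    """Return the value of the $6 linkage subfield, or '' if there is none."""
    return next((value for code, value in subfields if code == "6"), "")


def promote_880_title(fields: List[Field]) -> List[Field]:
    """Return a copy of the record whose 245 carries the text of its linked 880.

    Hypothetical helper: the 880 field itself is left in place, and a 245
    without a matching 880 is returned unchanged.
    """
    # Index the 880 fields that point back at a 245, keyed by occurrence number.
    vernacular: Dict[str, Subfields] = {}
    for tag, subfields in fields:
        link = _link(subfields)
        if tag == "880" and link.startswith("245-"):
            occurrence = link[4:6]
            vernacular[occurrence] = [(c, v) for c, v in subfields if c != "6"]

    result: List[Field] = []
    for tag, subfields in fields:
        link = _link(subfields)
        occurrence = link[4:6] if link.startswith("880-") else ""
        if tag == "245" and occurrence in vernacular:
            # Replace the romanized title with the original-script one.
            result.append(("245", vernacular[occurrence]))
        else:
            result.append((tag, subfields))
    return result


record = [
    ("245", [("6", "880-01"), ("a", "Shi ji /")]),
    ("880", [("6", "245-01/$1"), ("a", "史記 /")]),
]
converted = promote_880_title(record)
```

In a real pipeline the same pairing logic would run over records parsed by a MARC library before the MARC-to-MODS transformation. Whether to keep the romanized title elsewhere in the output (for example as an alternative title) is a design choice this sketch leaves open.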