Designing for Serendipity: Research Data Curation in Topic Spaces
Presentation posted on 12.07.2020, 19:14 by Sara Lafia
Researchers seeking relevant data across disciplines confront the challenge of navigating technical descriptions. How can curation support the serendipitous discovery of related research data? Everyday spaces like bookshelves are designed to support browsing and exploration by placing similar resources closer together. Space and time are foundational ordering relations for knowledge organization. I ask how this ordering, which is well-established in the geographic context, can be translated to locate and organize research data in abstract topic spaces. This dissertation develops methods for making the latent topics of research metadata explicit. These methods produce spatial configurations where related research topics are co-located in neighborhoods. Such configurations have the potential to support serendipitous discovery by letting researchers browse neighboring topics for related data. I test this notion in three studies that develop topic spaces for research data curation. The first part of this dissertation, Chapter 2, focuses on supporting research data discovery with a common terminology. I develop a crosscutting base vocabulary of geospatial topics to help users discover related government data in a ubiquitous open civic data platform. Semantic annotation expands search terms by mapping users’ vernacular onto the language of metadata. In the second part of this dissertation, I shift away from addressing terminological search to supporting spatial curation by developing topic spaces. In Chapter 3, I develop two kinds of topic spaces for curating research theses and dissertations: landscapes and networks. I use topic modeling to determine the latent semantic similarity of research metadata and then produce topic spaces from these similarities using spatialization techniques. In Chapter 4, I spatialize an institute’s multidisciplinary body of research, producing topic maps at two distinct levels of detail.
Emerging spatial patterns, like centrality and proximity, support high-level narratives about cross-disciplinary research activities that complement the quantitative metrics currently cited in reviews of institutional research. Together, these three studies demonstrate strategies for developing topic spaces in which diverse, yet related, multidisciplinary research data are curated. Future research will extend these methods by tracing the impact of specific curatorial actions contributing to research data discovery and reuse.