The Phylogeny of a Dataset
The field of evolutionary biology offers many approaches for studying the changes that occur between and within generations of species; cultural anthropologists, linguists and archaeologists have recently adopted these methods to study the evolution of physical artifacts. In this paper, we extend these approaches further by using phylogenetic methods to model and visualize the evolution of a long-standing, widely used digital dataset in climate science.
Our case study shows that clustering algorithms developed specifically for phylogenetic studies in evolutionary biology can be successfully adapted to the study of digital objects and their known offspring. Although we note a number of limitations of our initial effort, we argue that a quantitative approach to studying how digital objects evolve, are reused, and spawn new digital objects represents an important direction for the future of Information Science. [Presentation from ASIST 2014, Seattle]
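To give a concrete sense of what a distance-based phylogenetic clustering of dataset versions might look like, the following is a minimal sketch and not the method used in the study. It assumes each version of a dataset can be described by a vector of binary traits (for example, whether a given variable or processing step is present), and it uses UPGMA (average-linkage clustering), a simple classical tree-building technique chosen here only for illustration. The version names, the trait vectors, and the helper functions `hamming` and `upgma` are all hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Fraction of traits on which two versions differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(traits):
    """Average-linkage (UPGMA) clustering over trait vectors.

    `traits` maps a version name to a tuple of 0/1 traits.
    Returns the tree topology as a Newick-style string (no branch lengths).
    """
    # Each cluster is keyed by its Newick label and stores its leaf count.
    clusters = {name: 1 for name in traits}
    # Pairwise distances between current clusters.
    dist = {
        frozenset((a, b)): hamming(traits[a], traits[b])
        for a, b in combinations(traits, 2)
    }
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        pair = min(dist, key=dist.get)
        a, b = sorted(pair)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            dist[frozenset((merged, c))] = (
                size_a * d_ac + size_b * d_bc
            ) / (size_a + size_b)
        del dist[pair]
        clusters[merged] = size_a + size_b
    return next(iter(clusters)) + ";"


# Hypothetical trait vectors for four versions of a dataset (assumed data).
versions = {
    "v1": (0, 0, 0, 0, 0, 0),
    "v2": (1, 0, 0, 0, 0, 0),
    "v3": (1, 1, 1, 1, 0, 0),
    "v4": (1, 1, 1, 1, 1, 1),
}

tree = upgma(versions)
print(tree)
```

In this toy input, versions that share most traits are grouped first, so the resulting string nests closely related versions together. A real analysis would need a defensible choice of traits and distance measure, and would likely rely on established phylogenetics software instead of a hand-written routine.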