NSF-CSSI-2020-Poster -- C2Metadata Project

2020-01-30T16:58:18Z (GMT) by George Alter
The C2Metadata ("Continuous Capture of Metadata") Project automates the documentation of data transformations performed by statistical software. Researchers in many fields use statistical software (SPSS, Stata, SAS, R, Python) for data transformation and data management as well as for analysis. C2Metadata tools translate the scripts written for these packages into the Structured Data Transformation Language (SDTL), a software-independent intermediate language for describing data transformations. SDTL is incorporated into standard metadata formats, namely Data Documentation Initiative (DDI), Ecological Metadata Language (EML), and JSON-LD, which are used for data discovery, codebooks, and auditing of data management scripts. C2Metadata differs from most previous approaches to provenance by documenting transformations at the variable level.
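
To make the idea of variable-level documentation concrete, the following is a minimal sketch of how a script command might be turned into a software-independent record that names the variables it reads and writes. It is illustrative only: the function `describe_command`, the record fields (`command`, `produces`, `consumes`, `expression`, `source`), and the two Stata-style commands it recognizes are assumptions made for this example. They are not the actual SDTL schema and not the C2Metadata parsers.

```python
import json
import re

# Hypothetical, simplified record layout for illustration only.
# This is NOT the real SDTL schema and NOT the C2Metadata parser.


def describe_command(line):
    """Describe one Stata-style command as a dict of the variables it touches."""
    line = line.strip()

    # "gen newvar = expression" (or "generate ...") creates a new variable.
    match = re.fullmatch(r"(?:gen|generate)\s+(\w+)\s*=\s*(.+)", line)
    if match:
        target = match.group(1)
        expression = match.group(2).strip()
        # Naive source detection: every identifier in the expression is
        # treated as a variable (function names would be captured too).
        sources = sorted(set(re.findall(r"[A-Za-z_]\w*", expression)))
        return {
            "command": "compute",
            "produces": [target],
            "consumes": sources,
            "expression": expression,
            "source": line,
        }

    # "rename old new" changes a variable's name.
    match = re.fullmatch(r"rename\s+(\w+)\s+(\w+)", line)
    if match:
        return {
            "command": "rename",
            "produces": [match.group(2)],
            "consumes": [match.group(1)],
            "source": line,
        }

    # Anything else is recorded but not interpreted.
    return {"command": "unsupported", "produces": [], "consumes": [], "source": line}


script = ["gen income_k = income / 1000", "rename sex gender"]
records = [describe_command(command) for command in script]
print(json.dumps(records, indent=2))
```

Because each record lists the variables consumed and produced, a chain of such records could in principle be traced backward from any variable in the final dataset to the inputs it was derived from, which is the kind of variable-level provenance the project describes.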