Brown Dog: A Science Driven Data Transformation Service

As growing volumes of diverse digital data become part of modern scientific workflows, many research projects today begin with a process of data wrangling, i.e. finding, manipulating, indexing, cleaning, and bringing together the needed datasets. Brown Dog, a Science Driven Data Transformation service, aims to alleviate much of the overhead and heterogeneity involved in this step, both of which hinder scientific reproducibility, by providing data transformations such as format conversions and content-based extractions as a service. Through a REST API, Brown Dog supports diverse usage by various clients such as gateways, programming languages, and tools. As a gateway, it provides a venue to access and preserve data transformation tools, track provenance, track information loss, manage data movement, and process jobs in a scalable manner across a diverse set of computational resources. Overall, Brown Dog provides a low-level data infrastructure for interfacing with digital data contents and, through its capabilities, enables a new era of science and applications at large over otherwise difficult-to-access datasets. Further, Brown Dog aims to serve not just the scientific community but also the general public as a “DNS” for data, moving civilization towards an era where applications can be largely agnostic to the format and structure of the data and can instead focus on novel processes and applications built on the contents.