FAIR LINCS Metadata Powered by CEDAR Cloud-Based Templates and Services
Poster posted on 2017-05-12, 18:02, authored by Daniel Cooper, Amar Koleti, Martin O'Connor, Debra Willrett, Caty Chung, John Greybeal, Mark Musen, Stephan Schurer
The Library of Integrated Network-based Signatures (LINCS) program generates a wide variety of cell-based perturbation-response signatures using diverse assay technologies. For example, LINCS includes large-scale transcriptional profiling of genetic and small molecule perturbations, as well as various proteomics and imaging datasets. We currently obtain metadata through an online platform, the metadata submission tool (MST), which is based on spreadsheet data templates. Although the MST is functional, it is difficult to keep metadata FAIR, and in particular findable and reusable, without enforced controlled vocabularies and built-in links to ontologies and metadata standards. To maintain FAIR-centric metadata, we have worked with the Center for Expanded Data Annotation and Retrieval (CEDAR) to develop modular metadata templates linked to ontologies and standards present in the NCBO BioPortal. We have also developed a new LINCS Dataset Submission Tool (DST), which links new LINCS datasets to the form-fillable CEDAR templates. This metadata management framework supports authoring, curation, validation, management, and sharing of LINCS metadata, while building upon the existing LINCS metadata standards and data-release workflows. Additionally, the CEDAR technology facilitates metadata validation and testing, enabling users to ensure that their input metadata are LINCS compliant prior to submission for public release. CEDAR templates have been developed to capture reagent metadata, experimental metadata, assay descriptions, and global dataset attributes. By integrating the submission of all these components into one tool and workflow, we aim to significantly simplify and streamline LINCS dataset submission, processing, validation, registration, and publication.
As other projects apply the same approach, many more datasets will become cross-searchable and linkable, optimizing the metadata pathway from submission to discovery.
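As a rough illustration of the pre-submission check described in the abstract, the sketch below validates a metadata record against a template that enforces controlled vocabularies. It is not CEDAR's API or the LINCS DST implementation: the `TEMPLATE` structure, the field names, the permitted terms, and the `validate` function are all hypothetical, chosen only to show how enforced vocabularies catch non-compliant values before release.

```python
from typing import Dict, List, Optional, Set

# Hypothetical template (not a real CEDAR or LINCS template):
# field name -> set of permitted terms, or None for free text.
TEMPLATE: Dict[str, Optional[Set[str]]] = {
    "dataset_title": None,
    "assay_type": {"transcriptional profiling", "proteomics", "imaging"},
    "perturbation_type": {"genetic", "small molecule"},
}


def validate(record: Dict[str, str],
             template: Dict[str, Optional[Set[str]]] = TEMPLATE) -> List[str]:
    """Return a list of human-readable problems; an empty list means compliant."""
    errors: List[str] = []
    for field, allowed in template.items():
        value = record.get(field, "").strip()
        if not value:
            # Every template field is treated as required in this sketch.
            errors.append(f"missing required field: {field}")
        elif allowed is not None and value not in allowed:
            # Value is outside the controlled vocabulary for this field.
            errors.append(f"{field}: '{value}' is not a permitted term")
    for field in record:
        if field not in template:
            errors.append(f"unknown field: {field}")
    return errors


if __name__ == "__main__":
    submission = {"dataset_title": "Example dataset", "assay_type": "RNA-seq"}
    for problem in validate(submission):
        print(problem)
```

In a real deployment the permitted terms would come from ontologies rather than hard-coded sets; the sets here stand in for that lookup.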