Poster-rd_connect.pdf (2.82 MB)

Genome Annotation using Nanopublications: An Approach to Interoperability of Genetic Data

poster
posted on 2014-03-16, 10:10 authored by Rajaram Kaliyaperumal, Peter A.C. ’t Hoen, Zuotian Tatum, Mark Thompson, Erik Schultes, Andrew Gibson, Ivo F.A.C. Fokkema, Johan T. den Dunnen, Jeroen F.J. Laros, José Luis Oliveira, Pedro Lopes, Pedro Sernadela, Marco Roos

With the widespread use of next-generation sequencing (NGS) technologies, the primary bottleneck of genetic research has shifted from data production to data analysis. However, annotated datasets produced by different research groups are often in different formats, which makes genetic comparisons and integration with other datasets challenging and time-consuming. Here, we propose a new data interoperability approach that provides unambiguous, machine-readable descriptions of genomic annotations, based on a novel method of data publishing called nanopublication. A nanopublication is a schema built on top of existing semantic web technologies that consists of three components: an individual assertion (i.e., the genomic annotation); provenance (containing links to the experimental information and data processing steps); and publication info (information about data ownership and rights, allowing each genomic annotation to be cited and its scientific impact tracked). We use nanopublications to demonstrate automatic interoperability between individual genomic annotations from the FANTOM5 consortium (transcription start sites) and the Leiden Open Variation Database (genetic variants). The nanopublications can also be integrated with data from other semantic web frameworks such as COEUS. Exposing legacy information and new NGS data as nanopublications promises tremendous scaling advantages when integrating very large and heterogeneous genetic datasets.
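
The three-part structure described above can be illustrated with a minimal Python sketch. It is not the poster's implementation: it only shows how an assertion, its provenance, and its publication info might be kept as separate named graphs tied together by a head graph. The `np:` schema terms (`Nanopublication`, `hasAssertion`, `hasProvenance`, `hasPublicationInfo`) and the PROV and Dublin Core terms are existing vocabularies; everything under `http://example.org/` (the variant, the transcription start site, the `overlapsWith` predicate, the experiment, the curator) and the `Nanopublication` class itself are hypothetical placeholders. The serializer is deliberately simplified: any value starting with `http://` or `https://` is treated as an IRI, and everything else as a plain string literal.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

NP = "http://www.nanopub.org/nschema#"           # nanopublication schema
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PROV = "http://www.w3.org/ns/prov#"              # W3C PROV ontology
DCT = "http://purl.org/dc/terms/"                # Dublin Core terms
EX = "http://example.org/"                       # hypothetical namespace, illustration only


def term(value: str) -> str:
    """Render an IRI in angle brackets, anything else as a quoted literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    return json.dumps(value)


@dataclass
class Nanopublication:
    """Hypothetical container: one URI plus three lists of triples."""
    uri: str
    assertion: List[Triple] = field(default_factory=list)
    provenance: List[Triple] = field(default_factory=list)
    pubinfo: List[Triple] = field(default_factory=list)

    def graphs(self) -> Dict[str, List[Triple]]:
        """Return the head graph plus the three component graphs, keyed by graph IRI."""
        a, p, i = (self.uri + s for s in ("#assertion", "#provenance", "#pubinfo"))
        head = [
            (self.uri, RDF_TYPE, NP + "Nanopublication"),
            (self.uri, NP + "hasAssertion", a),
            (self.uri, NP + "hasProvenance", p),
            (self.uri, NP + "hasPublicationInfo", i),
        ]
        return {self.uri + "#head": head, a: self.assertion,
                p: self.provenance, i: self.pubinfo}

    def to_trig(self) -> str:
        """Serialize each named graph as a simplified TriG block."""
        blocks = []
        for name, triples in self.graphs().items():
            body = "\n".join(f"  {term(s)} {term(p)} {term(o)} ." for s, p, o in triples)
            blocks.append(f"<{name}> {{\n{body}\n}}")
        return "\n\n".join(blocks)


np_uri = EX + "np/1"
nanopub = Nanopublication(
    uri=np_uri,
    # Assertion: the genomic annotation itself (placeholder identifiers).
    assertion=[(EX + "variant/1", EX + "overlapsWith", EX + "tss/1")],
    # Provenance: where the assertion came from.
    provenance=[(np_uri + "#assertion", PROV + "wasDerivedFrom", EX + "experiment/1")],
    # Publication info: attribution and rights for the nanopublication as a whole.
    pubinfo=[
        (np_uri, PROV + "wasAttributedTo", EX + "curator/1"),
        (np_uri, DCT + "rights", "CC-BY (placeholder)"),
    ],
)
print(nanopub.to_trig())
```

Keeping the three components in separate named graphs lets the provenance refer to the assertion graph as a whole, while attribution and rights attach to the nanopublication itself, which is what makes each annotation individually citable.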
