Visualizing Linked Open Data

A query language called SPARQL has been developed to extract knowledge from Linked Data. Composing a SPARQL query, however, still requires technical expertise. To expose the potential of Linked Data to less technical domains, novel visualization techniques are essential.

We introduce Linked Data and explain, from a biological perspective, why and how it can be visualized graphically and integrated into different biological software platforms. We present three visualization platforms. These include PathVisio Loom and Cytoscape Semscape.

Cytoscape is a commonly used open source platform for analysing and visualizing complex networks, for which an extensive set of plugins is available. During Google Summer of Code 2012, a plugin called Semscape was developed to explore SPARQL endpoints. It allows Linked Data to be viewed as a Cytoscape network. In a second use case, the underlying schema of a given endpoint can be extracted and then represented as a Cytoscape network. For overlapping data sources, the overlap that allows the linking can also be represented.
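The two use cases above can be illustrated with a minimal Python sketch. It is not Semscape's implementation. It assumes the triples have already been retrieved from an endpoint, and the `ex:` identifiers and the function names `triples_to_network` and `extract_schema` are hypothetical. The instance view turns each triple into an edge. The schema view lifts each edge to the classes of its subject and object.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical triples, standing in for rows returned by a SPARQL endpoint.
TRIPLES = [
    ("ex:geneA", RDF_TYPE, "ex:Gene"),
    ("ex:geneB", RDF_TYPE, "ex:Gene"),
    ("ex:pathway1", RDF_TYPE, "ex:Pathway"),
    ("ex:geneA", "ex:partOf", "ex:pathway1"),
    ("ex:geneB", "ex:partOf", "ex:pathway1"),
]


def triples_to_network(triples):
    """Instance view: subjects and objects become nodes, triples become labelled edges."""
    nodes = set()
    edges = []
    for s, p, o in triples:
        nodes.update((s, o))
        edges.append((s, p, o))
    return nodes, edges


def extract_schema(triples):
    """Schema view: lift each non-type triple to the classes of its subject and object."""
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
    schema = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        # Resources without a declared type contribute no schema edge.
        for s_cls in types.get(s, ()):
            for o_cls in types.get(o, ()):
                schema.add((s_cls, p, o_cls))
    return schema


nodes, edges = triples_to_network(TRIPLES)
schema = extract_schema(TRIPLES)
```

In this toy data set the schema collapses to a single class-level edge from `ex:Gene` to `ex:Pathway`. A summary of that kind is what makes an unfamiliar endpoint easier to survey as a network.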

PathVisio is an extensive pathway editor and pathway analysis tool. As an analysis tool, it allows the projection of experimental data onto pathways. We have developed a plugin called PathVisio Loom, which provides a framework for knowledge aggregation through menu-guided additions. It allows the integration of biological knowledge available in different formats (including Linked Data). Starting from a single pathway object, the curator is given a set of gene products or metabolites that relate to that pathway object, together with the relations themselves. This information can be imported as semantic triples or, for instance, as nanopublications resulting from text mining.
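As a rough illustration of the step from a single pathway object to its candidate additions, the following Python sketch collects every triple that mentions a chosen focus object. It is not PathVisio Loom's code. The identifiers, the predicate names and the helper `related_to` are hypothetical, and the triples are assumed to have been imported beforehand.

```python
# Hypothetical imported triples (subject, predicate, object).
KNOWLEDGE = [
    ("ex:geneA", "ex:activates", "ex:geneB"),
    ("ex:geneB", "ex:produces", "ex:metaboliteX"),
    ("ex:geneC", "ex:inhibits", "ex:geneB"),
    ("ex:geneD", "ex:activates", "ex:geneE"),
]


def related_to(triples, focus):
    """Return relations leaving and entering the focus object."""
    outgoing = [(p, o) for s, p, o in triples if s == focus]
    incoming = [(s, p) for s, p, o in triples if o == focus]
    return outgoing, incoming


outgoing, incoming = related_to(KNOWLEDGE, "ex:geneB")

# Candidate additions a curator could be offered in a menu.
candidates = sorted({o for _, o in outgoing} | {s for s, _ in incoming})
```

Keeping the predicate next to each candidate means the relation itself, and not only the related gene product or metabolite, can be shown to the curator.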

Although SPARQL queries remain essential for querying Linked Data, these visualization plugins make it possible to explore Linked Data in a more intuitive environment.