
Extracted Schemas from the Life Sciences Linked Open Data Cloud

Version 2 2020-06-01, 04:09
Version 1 2020-06-01, 03:49
dataset
posted on 2020-06-01, 04:09 authored by Maulik Kamdar
This dataset accompanies the manuscript "An empirical meta-analysis of the life sciences linked open data on the web", published in Nature Scientific Data.

If you use the dataset, please cite the manuscript as follows:
Kamdar, M.R., Musen, M.A. An empirical meta-analysis of the life sciences linked open data on the web. Sci Data 8, 24 (2021). https://doi.org/10.1038/s41597-021-00797-y

We extracted schemas from more than 80 publicly available biomedical linked data graphs in the Life Sciences Linked Open Data (LSLOD) cloud into an LSLOD schema graph, and conducted an empirical meta-analysis to evaluate the extent of semantic heterogeneity across the LSLOD cloud.

The dataset published here contains the following files:
- The set of Linked Data Graphs from the LSLOD cloud from which schemas are extracted.
- Refined sets of extracted classes, object properties, data properties, and datatypes shared across the Linked Data Graphs in the LSLOD cloud. Where a schema element is reused from a Linked Open Vocabulary or an ontology, this is explicitly indicated.
- The LSLOD Schema Graph, which contains all of the above extracted schema elements, interlinked with each other based on the underlying content. Sample instances and sample assertions are also provided, along with broad-level characteristics of the modeled content.

The LSLOD Schema Graph is saved as a JSON Pickle File. To read the JSON object in this pickle file, use the following Python code:

    import pickle

    # Open in binary mode; the encoding argument is needed to decode the pickled strings
    with open('LSLOD-Schema-Graph.json.pickle', 'rb') as infile:
        x = pickle.load(infile, encoding='iso-8859-1')
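
The description does not say how the loaded object is laid out, so a first look can be taken without assuming any structure. The sketch below wraps the load call and reports only the top-level type, its length, and a few keys if it happens to be a dictionary. The helper names `load_schema_graph` and `summarize` are hypothetical and not part of the dataset or its code; the file name and encoding are the ones given above. As with any pickle file, it should only be loaded from a source you trust, since unpickling can execute arbitrary code.

```python
import pickle
from pathlib import Path


def load_schema_graph(path):
    """Load the pickled object using the encoding given in the dataset description."""
    with open(path, "rb") as infile:
        return pickle.load(infile, encoding="iso-8859-1")


def summarize(obj, max_items=5):
    """Describe the top-level object without assuming its internal layout."""
    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    if isinstance(obj, dict):
        # Only the first few keys, in insertion order
        summary["sample_keys"] = list(obj)[:max_items]
    return summary


if __name__ == "__main__":
    # Assumes the file has been downloaded into the working directory
    path = Path("LSLOD-Schema-Graph.json.pickle")
    if path.exists():
        print(summarize(load_schema_graph(path)))
```

If the top-level object turns out to be a dictionary, the sample keys are a reasonable starting point for finding the classes, properties, and datatypes listed above.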

Check the Referenced Link for more details on this research, raw data files, and code references.

Funding

GM086587

GM103316

HG004028

History