SDM-Genomic-Testbeds.zip (6.32 GB)

SDM-Genomic-Testbeds

Version 3 2022-12-21, 15:01
Version 2 2022-01-24, 13:06
Version 1 2021-06-24, 13:36
dataset
posted on 2022-12-21, 15:01 authored by Samaneh Jozashoori

These datasets are generated from the COSMIC mutation dataset in the COSMIC database (GRCh37, version 90) for the purpose of evaluating available ontology-based data integration engines. They vary in the number of records (10k, 100k, 1 million, and 10 million), the number of attributes (2-15), and the rate of duplicated values (25-75 percent of records are duplicated, with each duplicated value repeated 10 or 20 times).
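
To make the duplicate-rate parameters concrete, here is a minimal Python sketch of one possible reading of them: a given fraction of the records carry a value that occurs a fixed number of times, and all remaining records are unique. The function `make_column` and the `dup_`/`uniq_` value names are hypothetical and used only for illustration. This is not the procedure used to build the testbeds, which is described in the papers listed below.

```python
import random
from collections import Counter


def make_column(n_records, dup_fraction, repeats, seed=0):
    """Build a synthetic column of n_records values (illustrative only).

    About dup_fraction of the records carry a value that occurs exactly
    `repeats` times; every other record carries a unique value.
    """
    # Number of distinct values that will be repeated.
    n_dup_values = int(n_records * dup_fraction) // repeats
    n_dup_records = n_dup_values * repeats

    # Repeated values first, then unique values to fill the column.
    values = [f"dup_{i}" for i in range(n_dup_values) for _ in range(repeats)]
    values += [f"uniq_{i}" for i in range(n_records - n_dup_records)]

    # Shuffle with a fixed seed so the result is reproducible.
    random.Random(seed).shuffle(values)
    return values


# Example: 10k records, 25 percent duplicated, each duplicate repeated 10 times.
column = make_column(10_000, 0.25, 10)
counts = Counter(column)
dup_records = sum(c for c in counts.values() if c > 1)
print(len(column), dup_records / len(column), max(counts.values()))
# prints: 10000 0.25 10
```

Under this reading, the 10k configuration with 25 percent duplicates and 10 repetitions contains 250 distinct repeated values and 7,500 unique ones.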


The mappings vary in complexity.


Details of how these datasets were generated can be found in the papers that use them for empirical evaluation:

https://www.semantic-web-journal.net/system/files/swj3289.pdf


https://www.semantic-web-journal.net/system/files/swj3246.pdf


https://doi.org/10.1145/3340531.3412881


https://doi.org/10.1145/3477314.3507132

 

