Supplier Triplet Dataset

Download (1.62 MB)
journal contribution
posted on 2024-06-24, 14:50 authored by Yunqing Li, Hyunwoong Ko, Farhad Ameri

The supplier triplet dataset was constructed to train and validate the ability of Large Language Models (LLMs) to extract supplier triplets from unstructured text. The prompts comprise page text extracted from 1,000 supplier web links in North Carolina, United States. The completions contain all the triplets extracted from these prompts. In each completion, subjects and predicates are predefined by the SUDOKN ontology. Most objects are taken directly from the web pages to maintain accuracy; the remaining objects are double-checked and standardized by Subject Matter Experts (SMEs) according to the SUDOKN ontology. SMEs also harmonize terms to industry standards, for example standardizing “Automotive-ICE” to “Automotive” and “Metal-Aluminum” to “Aluminum”.
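
The harmonization step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, and the file format of the dataset is not stated here. The `Triplet` class, the `harmonize` function, the subject label, and the predicate names are all hypothetical placeholders rather than actual SUDOKN ontology terms. Only the two term mappings are taken from the description.

```python
from dataclasses import dataclass

# The two mappings quoted in the description; the full SME mapping is
# not published on this page, so this table is deliberately incomplete.
HARMONIZED_TERMS = {
    "Automotive-ICE": "Automotive",
    "Metal-Aluminum": "Aluminum",
}


@dataclass(frozen=True)
class Triplet:
    """A (subject, predicate, object) statement about a supplier."""
    subject: str
    predicate: str
    obj: str


def harmonize(triplet: Triplet) -> Triplet:
    """Return a copy whose object is replaced by its standardized term.

    Objects without a known mapping are passed through unchanged.
    """
    standardized = HARMONIZED_TERMS.get(triplet.obj, triplet.obj)
    return Triplet(triplet.subject, triplet.predicate, standardized)


# Hypothetical triplets: the subject and predicate names are illustrative
# assumptions, not terms confirmed to exist in the SUDOKN ontology.
raw_triplets = [
    Triplet("Supplier", "suppliesToIndustry", "Automotive-ICE"),
    Triplet("Supplier", "processesMaterial", "Metal-Aluminum"),
    Triplet("Supplier", "hasCertification", "ISO 9001"),
]

harmonized = [harmonize(t) for t in raw_triplets]
```

In this sketch, objects taken directly from a web page (such as the certification name) keep their original wording, while the two non-standard terms are replaced by their standardized forms.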

Funding

Proto-OKN Theme 1 - Supply and Demand Open Knowledge Network (SUDOKN)

Directorate for Technology, Innovation and Partnerships
