plosone_minimal_dataset.csv

dataset
posted on 2025-06-17, 06:05, authored by Prasara Jakkaew

This dataset contains the results of a novel, LLM-driven annotation process applied to expert coffee reviews, as presented in the manuscript "Automated Multi-Label Coffee Flavor Classification: A Comparative Study of BERT and TF-IDF using LLM-Driven Data Annotation."

This "Minimal Dataset" is provided to ensure the reproducibility of our findings. It includes:

  1. A unique review_id for each entry.
  2. The original blind_assessment text used as input for the LLM.
  3. The original quantitative sensory scores (Final Score, Aroma, Acidity/Structure, etc.) provided by human experts, which were used for the quantitative validation of the LLM's annotations.
  4. The final 17 columns of binary (0/1) flavor labels as generated by the LLM.
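
As a quick structural check of the layout described above, the following is a minimal sketch (not the authors' code) that detects the binary label columns in a CSV and counts how often each label is set. It runs on a few synthetic toy rows, not on the real file: the flavor column names `fruity` and `floral` and the helper names `binary_label_columns` and `label_counts` are hypothetical, while `review_id`, `blind_assessment`, and `Aroma` are taken from the description.

```python
import csv
import io


def binary_label_columns(rows):
    """Return the names of columns whose values are all '0' or '1'."""
    if not rows:
        return []
    return [
        col
        for col in rows[0]
        if all(row[col].strip() in {"0", "1"} for row in rows)
    ]


def label_counts(rows, label_cols):
    """Count how many reviews have each label set to 1."""
    return {col: sum(int(row[col]) for row in rows) for col in label_cols}


# Synthetic toy rows for illustration only, NOT taken from the dataset.
# The flavor column names below are hypothetical placeholders.
SAMPLE = """review_id,blind_assessment,Aroma,fruity,floral
1,"Bright, berry-toned.",9,1,0
2,"Jasmine and citrus zest.",8,1,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
label_cols = binary_label_columns(rows)
counts = label_counts(rows, label_cols)

print(label_cols)
print(counts)
```

To try this on the downloaded file, the `io.StringIO(SAMPLE)` argument could be replaced with `open("plosone_minimal_dataset.csv", newline="", encoding="utf-8")`; the encoding is an assumption. If the file matches the description, the detection step should find 17 label columns, although a score column that happened to contain only 0 and 1 would also be picked up by this heuristic.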
