figshare

Sentiment Analysis Test Dataset Created from Two COVID-19 Surveys: National Institutes of Health (NIH) and Stanford University

Version 2 2024-01-09, 07:25
Version 1 2023-11-22, 20:13
dataset
posted on 2024-01-09, 07:25 authored by Juan Antonio Lossio-Ventura, Rachel Weger, Angela Lee, Emily Guinee, Joyce Chung, Atlas, Lauren, Eleni Linos, Francisco Pereira

Two COVID-19 surveys, collected by teams from the National Institutes of Health (NIH) and Stanford University, were used to create the test dataset. The surveys were intended to assess the general topics that participants experienced during the pandemic lockdown. The test dataset comprises 1,000 randomly chosen sentences, 500 from each survey. Each set was annotated by three independent annotators, who were instructed to label the polarity of each sentence as -1 (negative), 0 (neutral), or 1 (positive). The final labels were then determined by a three-step procedure. First, if all three annotators agreed on a label (full agreement), that label was accepted. Second, if two of the three agreed on a label (partial agreement), that label was also accepted. Third, if all three annotators disagreed (no agreement), the label was set to neutral.
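
The three-step labeling procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name final_label and the constant NEUTRAL are hypothetical, and the input is assumed to be a list of exactly three annotations drawn from -1, 0, and 1.

```python
from collections import Counter

# Assumption: neutral polarity is encoded as 0, as in the description above.
NEUTRAL = 0


def final_label(annotations):
    """Resolve three polarity annotations (-1, 0, or 1) into one final label.

    Hypothetical helper for illustration; not taken from the dataset authors.
    """
    if len(annotations) != 3:
        raise ValueError("expected exactly three annotations")
    # Most frequent label and the number of annotators who chose it.
    label, count = Counter(annotations).most_common(1)[0]
    # count == 3 is full agreement; count == 2 is partial agreement.
    if count >= 2:
        return label
    # All three annotators chose different labels: fall back to neutral.
    return NEUTRAL


print(final_label([1, 1, 1]))    # full agreement -> 1
print(final_label([-1, -1, 0]))  # partial agreement -> -1
print(final_label([-1, 0, 1]))   # no agreement -> 0
```

With three annotators and three possible labels, the no-agreement case can only occur when each annotator chooses a different label, so a majority check covers the first two steps and the neutral fallback covers the third.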

Funding

Clinical Research in the NIMH Office of the Clinical Director

National Institute of Mental Health

Machine Learning Team

National Institute of Mental Health

Large-Scale Online stimulus Norming and Surveys about Perceptions in Healthcare

National Center for Complementary and Integrative Health

Patient Oriented Research in Vulnerable Populations with Skin Disease

National Institute of Arthritis and Musculoskeletal and Skin Diseases
