Reddit Health Taxonomy

Version 2 2021-06-09, 13:21
Version 1 2021-06-09, 13:11
dataset
posted on 2021-06-09, 13:21, authored by Sanja Scepanovic, Luca Maria Aiello
This dataset contains the Reddit Health Taxonomy produced in the paper [1] (full reference below).

Data format: each line describes one symptom community (Level-2) and has the following format:

depression:anxiety 1:1 519 6A73 Mixed depressive and anxiety disorder 314468192 anxieti,depress,panic attack,anxious,social anxieti,suicidal thought,...

depression:anxiety -- community name, i.e., its corresponding condition,
1:1 -- community id,
519 -- number of symptoms in the community,
6A73 Mixed depressive and anxiety disorder 314468192 -- ICD-11 code corresponding to the selected condition,
anxieti,depress,panic attack,anxious,social anxieti,suicidal thought,... -- symptoms sorted by their importance for this community (InfoMap flow value; see the paper for more info).

Empty lines separate the Level-1 communities, each of which consists of several Level-2 communities.
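
The format described above can be read with a short script. What follows is a minimal sketch in Python, not the authors' own tooling. It assumes that fields are separated by whitespace (spaces or tabs), that the community name contains no whitespace, that every line carries an ICD-11 entry, and that the ICD-11 entry ends with a purely numeric token, as in the example line. The names `Community`, `parse_line` and `parse_taxonomy` are hypothetical, and the sample line holds only the symptoms visible in the example (the full list is elided there).

```python
import re
from typing import Iterable, List, NamedTuple


class Community(NamedTuple):
    """One Level-2 symptom community (field names are illustrative)."""
    name: str            # e.g. "depression:anxiety"
    community_id: str    # e.g. "1:1"
    n_symptoms: int      # e.g. 519
    icd11_code: str      # e.g. "6A73"
    icd11_title: str     # e.g. "Mixed depressive and anxiety disorder"
    icd11_number: str    # trailing numeric token of the ICD-11 entry
    symptoms: List[str]  # sorted by importance, as stored in the file


# Assumption: whitespace-delimited fields; the ICD-11 title is everything
# between the code and the first purely numeric token that follows it.
LINE_RE = re.compile(
    r"^(\S+)\s+(\d+:\d+)\s+(\d+)\s+(\S+)\s+(.*?)\s+(\d+)\s+(.*)$"
)


def parse_line(line: str) -> Community:
    """Parse one non-empty line into a Community record."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"line does not match the assumed format: {line!r}")
    name, cid, count, code, title, number, symptoms = match.groups()
    symptom_list = [s.strip() for s in symptoms.split(",") if s.strip()]
    return Community(name, cid, int(count), code, title, number, symptom_list)


def parse_taxonomy(lines: Iterable[str]) -> List[List[Community]]:
    """Group Level-2 communities into Level-1 blocks split by empty lines."""
    groups: List[List[Community]] = []
    current: List[Community] = []
    for line in lines:
        if line.strip():
            current.append(parse_line(line))
        elif current:
            # An empty line closes the current Level-1 community.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


# Visible part of the example line; the remaining symptoms are elided.
SAMPLE = (
    "depression:anxiety 1:1 519 6A73 Mixed depressive and anxiety disorder "
    "314468192 anxieti,depress,panic attack,anxious,social anxieti,"
    "suicidal thought"
)

if __name__ == "__main__":
    community = parse_line(SAMPLE)
    print(community.name, community.icd11_code, community.symptoms[:3])
```

To read a whole file, pass an open file object to `parse_taxonomy`; each inner list then holds the Level-2 communities of one Level-1 community. If the actual file turns out to be tab-delimited with a different field layout, splitting each line on tabs would be simpler than the regular expression used here.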


[1] Šćepanović, S., Aiello, L. M., Zhou, K., Joglekar, S., & Quercia, D. (2021, May). The Healthy States of America: Creating a Health Taxonomy with Social Media. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 15, pp. 621-632).
