CSympData: Expert Annotated Patient Symptoms Data
Access to timely and accurate healthcare guidance remains a challenge, particularly in Low- and Middle-Income Countries (LMICs) with limited primary care resources. Many individuals delay seeking professional medical consultation due to a lack of awareness, financial constraints, or difficulty in assessing the urgency of their symptoms. AI-powered symptom checkers can play a crucial role in addressing these challenges by providing preliminary health assessments and guiding individuals toward appropriate medical actions.
To support the development of AI-driven clinical decision support tools, we introduce CSympData, a dataset of patient symptoms with their severity, duration, and demographic attributes (age, gender). Each case carries an expert-labeled recommendation indicating whether self-care, over-the-counter (OTC) medication, or a doctor’s consultation is required. The dataset is derived from publicly available health records and surveys in Bangladesh. It consists of 130,637 cases: 4,466 with an OTC drug recommendation and 126,171 with a doctor consultation recommendation. To ensure compatibility with international healthcare frameworks, the dataset follows the International Classification of Diseases, 11th Revision, for Mortality and Morbidity Statistics (ICD-11 MMS).
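The case counts above imply a strong class imbalance, which matters when training a classifier on the recommendation label. The following is a minimal sketch of that arithmetic, using only the counts stated in this description. The label strings and the "balanced" weighting heuristic (total divided by number of classes times class count) are assumptions for illustration, not part of the dataset.

```python
# Case counts as stated in the dataset description.
# The label strings are assumed; the released file may spell them differently.
counts = {
    "OTC drug recommendation": 4466,
    "Doctor consultation": 126171,
}

total = sum(counts.values())

# Share of each recommendation class among all cases.
shares = {label: n / total for label, n in counts.items()}

# A common "balanced" weighting heuristic: total / (num_classes * class_count).
# Rarer classes receive proportionally larger weights.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in counts:
    print(
        f"{label}: {counts[label]} cases, "
        f"{shares[label]:.2%} of total, weight {weights[label]:.3f}"
    )
```

Weights of this kind are one option for handling the imbalance; resampling or threshold tuning may suit a given model better.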
List of Dataset Attributes:
- Symptoms – List of symptoms reported for the case
- Gender – Patient's gender (e.g., Male, Female)
- Age – Patient's age in years
- Duration – Duration of symptoms (e.g., in days)
- Severity – Severity level of symptoms (e.g., Mild, Moderate, Severe)
- Final Recommendation – Suggested action (e.g., OTC drug recommendation or doctor consultation)
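The attributes listed above can be sketched as a typed record with a small CSV parser. This is an illustrative sketch under several assumptions: the column names mirror the attribute names, symptoms are stored as one semicolon-separated string, and the sample rows are made up for demonstration and are not taken from the dataset. The names `Case`, `parse_row`, and `load_cases` are hypothetical helpers.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass

# Assumed column names, mirroring the attribute list; the real file may differ.
COLUMNS = ["Symptoms", "Gender", "Age", "Duration", "Severity", "Final Recommendation"]


@dataclass
class Case:
    symptoms: list[str]
    gender: str
    age: int
    duration: str  # kept as text, since the unit (e.g. days) is not fixed here
    severity: str
    recommendation: str


def parse_row(row: dict) -> Case:
    """Convert one CSV row (a dict keyed by column name) into a Case."""
    missing = [c for c in COLUMNS if c not in row]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Assumption: symptoms are stored as one semicolon-separated string.
    symptoms = [s.strip() for s in row["Symptoms"].split(";") if s.strip()]
    return Case(
        symptoms=symptoms,
        gender=row["Gender"].strip(),
        age=int(row["Age"]),
        duration=row["Duration"].strip(),
        severity=row["Severity"].strip(),
        recommendation=row["Final Recommendation"].strip(),
    )


def load_cases(text: str) -> list:
    """Parse CSV text into a list of Case records."""
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


# Made-up rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """Symptoms,Gender,Age,Duration,Severity,Final Recommendation
fever; cough,Female,34,3,Moderate,Doctor consultation
headache,Male,27,1,Mild,OTC drug recommendation
"""

cases = load_cases(SAMPLE_CSV)
label_counts = Counter(c.recommendation for c in cases)
print(cases[0])
print(label_counts)
```

To read an actual file, the same `load_cases` function could be given the file's text content, after adjusting `COLUMNS` and the symptom separator to match the released format.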