RECOGNIZING HEALTH CONCEPTS IN TWITTER DATA USING LARGE LANGUAGE MODELS
This thesis presents a structured framework leveraging large language models (LLMs)—GPT-4-0613 via LangChain, GPT-4 Turbo, and Gemini 2.0 Flash—for extracting, normalizing, and categorizing COVID-19 symptoms from informal Twitter posts. Using a pre-annotated dataset of 635 tweets as ground truth, the study evaluates each model’s ability to identify symptoms and temporal references expressed through varied, often non-clinical language.
To address LLM non-determinism, the framework introduces a consensus mechanism across three inference runs per model. Outputs are semantically matched, normalized, and categorized using prompt-driven Gemini 2.0 Flash models to ensure consistency across all stages. The evaluation metrics include accuracy, precision, recall, and F1-score, with GPT-4-0613 demonstrating the highest overall performance.
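The consensus and scoring steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a simple majority vote (at least 2 of 3 runs) over lowercased symptom strings in place of the Gemini-based semantic matching, and the function names `consensus` and `precision_recall_f1`, the `min_votes` parameter, and the sample symptoms are all hypothetical.

```python
from collections import Counter


def consensus(runs, min_votes=2):
    """Keep symptoms reported by at least `min_votes` of the inference runs.

    Assumption: exact match on lowercased strings stands in for the
    semantic matching step described in the abstract.
    """
    votes = Counter()
    for run in runs:
        # A set ensures each run casts at most one vote per symptom.
        votes.update({symptom.strip().lower() for symptom in run})
    return sorted(s for s, n in votes.items() if n >= min_votes)


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for one tweet's symptoms."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical outputs from three runs of one model on one tweet.
runs = [
    ["Fever", "cough"],
    ["fever", "headache"],
    ["fever", "cough", "fatigue"],
]
agreed = consensus(runs)

# Hypothetical ground-truth annotation for the same tweet.
gold = ["fever", "cough", "fatigue"]
precision, recall, f1 = precision_recall_f1(agreed, gold)
```

In this sketch, "headache" and "fatigue" each appear in only one run and are dropped, which shows how voting trades some recall for stability across runs.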
The study further visualizes results through a 3D symptom-day-category data cube to support trend analysis. Findings highlight the potential of LLMs, when combined with prompt engineering and ensemble strategies, to enhance public health surveillance from social media data streams. This reproducible pipeline offers a scalable solution for timely health monitoring and can generalize to other diseases and platforms.
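One way to picture the symptom-day-category data cube is as a count indexed by three keys. The sketch below is an assumption-laden illustration, not the structure used in the thesis: it stores the cube as a `Counter` keyed by `(symptom, day, category)` tuples, treats `day` as an integer day index, and uses hypothetical category labels and a hypothetical `rollup` helper.

```python
from collections import Counter

# Hypothetical extracted records: (symptom, day, category).
records = [
    ("fever", 1, "systemic"),
    ("cough", 1, "respiratory"),
    ("fever", 2, "systemic"),
    ("fever", 2, "systemic"),
    ("cough", 3, "respiratory"),
]

# Each cell of the cube holds the number of mentions for that triple.
cube = Counter(records)


def rollup(cube, axis):
    """Aggregate the cube onto one axis (0=symptom, 1=day, 2=category)."""
    totals = Counter()
    for key, count in cube.items():
        totals[key[axis]] += count
    return totals


by_symptom = rollup(cube, axis=0)   # total mentions per symptom
by_day = rollup(cube, axis=1)       # total mentions per day
by_category = rollup(cube, axis=2)  # total mentions per category
```

Rolling up along the day axis gives the kind of per-day totals a trend plot would draw on, while individual cells keep the symptom and category detail.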
Degree Type
- Master of Science
Department
- Computer and Information Technology
Campus location
- Hammond