
Supplementary file 2_Deductively coding psychosocial autopsy interview data using a few-shot learning large language model.xlsx

dataset
posted on 2025-02-19, 05:12 authored by Elias Balt, Salim Salmi, Sandjai Bhulai, Stefan Vrinzen, Merijn Eikelenboom, Renske Gilissen, Daan Creemers, Arne Popma, Saskia Mérelle
Background

Psychosocial autopsy is a retrospective study of suicide that aims to identify emerging themes and psychosocial risk factors. It typically relies heavily on qualitative data from interviews or medical documentation. However, qualitative research is often criticized as prone to bias and is notoriously time- and cost-intensive. The current study therefore investigated whether a Large Language Model (LLM) can feasibly be integrated into qualitative research procedures, by evaluating the model's performance in deductively coding and coherently summarizing interview data obtained in a psychosocial autopsy.

Methods

Data from 38 semi-structured interviews conducted with individuals bereaved by the suicide of a loved one were deductively coded by qualitative researchers and by a server-installed LLAMA3 large language model. Model performance was evaluated on three tasks: (1) binary classification of coded segments, (2) independent classification using a sliding window approach, and (3) summarization of coded data. Intercoder agreement was calculated using Cohen's Kappa, and the LLM's summaries were qualitatively assessed using the Constant Comparative Method.
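A minimal sketch of the two evaluation ingredients named above: an overlapping sliding window over transcript segments, and Cohen's Kappa for intercoder agreement between a human coder and the model. The function names, window size, and step are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def sliding_windows(segments, size=3, step=1):
    """Return overlapping windows of consecutive transcript segments.

    Each window gives the model local context around the segment
    being classified (size and step here are illustrative).
    """
    return [segments[i:i + size]
            for i in range(0, len(segments) - size + 1, step)]


def cohens_kappa(coder_a, coder_b):
    """Cohen's Kappa between two raters who labelled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e the agreement expected if both raters
    labelled independently according to their marginal frequencies.
    """
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from the two raters' label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[lbl] * freq_b.get(lbl, 0) for lbl in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)
```

For example, two raters who assign identical codes get a Kappa of 1.0, while agreement no better than chance yields a Kappa near 0.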

Results

The LLM achieved substantial agreement with the researchers on the binary classification task (accuracy: 0.84) and the sliding window task (accuracy: 0.67), although performance varied considerably across codes. LLM summaries were typically rich enough for subsequent analysis by the researcher, with around 80% of the summaries independently rated by two researchers as 'adequate' or 'good.' Emerging themes in the qualitative assessment of the summaries included unsolicited elaboration and hallucination.

Conclusion

State-of-the-art LLMs show great potential to support researchers in deductively coding complex interview data, which could substantially reduce the time and resources required. Integrating such models into qualitative research procedures can also facilitate near real-time monitoring. Based on the findings, we recommend a collaborative model in which the LLM's deductive coding is complemented by review, inductive coding, and further interpretation by a researcher. Future research may aim to replicate these findings in different contexts and to evaluate models with a larger context size.

    Frontiers in Public Health
