An analysis of cannabinoid hyperemesis syndrome Reddit posts and themes

Abstract
Introduction: Reddit hosts a large active community of members dedicated to the discussion of cannabinoid hyperemesis syndrome. We sought to describe common themes discussed and the most frequently mentioned triggers and therapies for cannabinoid hyperemesis syndrome exacerbations in the Reddit online community.
Methods: Data collected from six subreddits were filtered using natural language processing to curate posts referencing cannabinoid hyperemesis syndrome. Based on a manual review of posts, common themes were identified. A machine learning model was trained using the manually categorized data to automatically classify the themes for the rest of the posts so that their distributions could be quantified.
Results: From August 2018 to November 2022, 2683 unique posts were collected. Thematic analysis resulted in five overall themes: cannabinoid hyperemesis syndrome-related science; symptom timing; cannabinoid hyperemesis syndrome treatment and prevention; cannabinoid hyperemesis syndrome diagnosis and education; and health impacts. Additionally, 447 trigger and 664 therapy-related posts were identified. The most commonly mentioned triggers for cannabinoid hyperemesis syndrome episodes included: food and drink (n = 62), cannabinoids (n = 45), mental health (e.g., stress, anxiety) (n = 27), and alcohol (n = 22). Most commonly mentioned cannabinoid hyperemesis syndrome therapies included: hot water/bathing (n = 62), hydration (n = 60), antiemetics (n = 42), food and drink (n = 38), gastrointestinal medications (n = 38), behavioral therapies (e.g., meditation, yoga) (n = 35), and capsaicin (n = 29).
Discussion: Reddit posts for cannabinoid hyperemesis syndrome provide a valuable source of community discussion and individual reports of people experiencing cannabinoid hyperemesis syndrome. Mental health and alcohol were frequently reported triggers within the posts but are not often identified in the literature. While many of the therapies mentioned are well documented, behavioral responses such as meditation and yoga have not been explored by the scientific literature.
Conclusions: Knowledge shared via online social media platforms contains detailed information on self-reported cannabinoid hyperemesis syndrome disease and management experiences, which could serve as valuable data for the development of treatment strategies. Further longitudinal studies in patients with cannabinoid hyperemesis syndrome are needed to corroborate these findings.


Introduction
Cannabinoid hyperemesis syndrome, first described in 2004 [1], is characterized by episodes of cyclic vomiting associated with abdominal pain in the context of chronic daily or near-daily cannabis use. Symptomatic episodes are separated by periods of baseline health, and cannabis cessation has been the only documented treatment to date that results in disease resolution [2][3][4]. Prior to and during the diagnostic process, patients suspected to have cannabinoid hyperemesis syndrome frequently have multiple emergency department (ED) visits and hospitalizations, and undergo non-diagnostic advanced imaging, such as computerized tomography and endoscopy [5]. Thus far, most medical literature on cannabinoid hyperemesis syndrome has focused on describing the clinical syndrome, attempting to determine diagnostic criteria to differentiate cannabinoid hyperemesis syndrome from other cyclic vomiting syndromes not associated with cannabis use, and evaluating acute treatment of nausea and vomiting [2][6][7][8][9]. Data on longer-term treatment options, patient experience, disease stages and timeline, and prevention strategies are limited [10]. There is no consensus on the root cause of cannabinoid hyperemesis syndrome, nor an understanding of why cyclic vomiting occurs in some individuals with chronic cannabis use but not others.
Knowledge shared via online social media platforms may contain a scope of self-reported management experiences different from that found in the scientific literature, which can be valuable for clinicians and researchers. Online forums (e.g., Reddit, Facebook) have been used to access information, seek advice, and form communities among a range of individuals [11,12]. Many individuals suffering from suspected cannabinoid hyperemesis syndrome have turned to online communities and message boards to share knowledge and their experiences [13,14]. In this study, we focus on Reddit, a popular publicly available user-driven discussion platform structured around communities called subreddits that offers user anonymity. Reddit hosts a large active community dedicated to the discussion of cannabinoid hyperemesis syndrome, with over 11,900 active members as of January 2023 [13]. To better understand patient needs and experiences with cannabinoid hyperemesis syndrome, we sought to describe themes discussed in the Reddit online community. Additionally, we conducted a focused evaluation of reported triggers and therapies for symptomatic cannabinoid hyperemesis syndrome episodes in posts to guide patient-centered care approaches including treatment strategies and future research directions.

Methods
Figure 1 is a flowchart of the data collection and analysis process. Our first step in collecting relevant data about cannabinoid hyperemesis syndrome from Reddit involved identifying the relevant subreddits on which cannabinoid hyperemesis syndrome-related information is discussed. Subreddits are forums dedicated to specific topics, which are often about substances or experiences with substances. We searched for the phrases "CHS" and "cannabinoid hyperemesis syndrome" on the web-based interface to identify posts related to the topic. Retrieved posts were manually reviewed by AS and SL (Abeed Sarker and Sahithi Lakamana), and the subreddits on which the posts occurred were noted.
After identifying relevant subreddits in this fashion, we collected all retrievable data from these subreddits using the Python Reddit application programming interface wrapper [15]. The Python Reddit application programming interface wrapper allows programs to connect to Reddit and retrieve publicly available posts. Subreddits that are not public, and posts that have been removed by moderators or by the original posters, are not retrievable via the application programming interface. We extracted data from the following subreddits: r/CHSinfo, r/CHSline, r/CHSexploration, r/Altcannabinoids, and r/cannabinoidhyperemesis. Posts that were repeated or reposted (e.g., the same post on multiple subreddits) were identified via natural language processing methods and excluded. Specifically, we preprocessed the text of each post by lowercasing it and removing punctuation, and included the post only if it did not exactly match the text of an already included post. Posts that contained no text other than the title (e.g., posts with images only) were excluded as well. This study was determined by the IRB at Emory to be exempt (category 4; publicly available data).
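The duplicate-removal step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual code: the dictionary layout of a post, the field names `id` and `body`, and the helper names `normalize` and `deduplicate` are all hypothetical.

```python
import string


def normalize(text):
    """Lowercase the text and strip ASCII punctuation, as described in the text."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


def deduplicate(posts):
    """Keep the first post for each distinct normalized body.

    Posts with an empty body (e.g., image-only posts) are dropped.
    The 'body' and 'id' keys are assumptions made for this sketch.
    """
    seen = set()
    kept = []
    for post in posts:
        body = post.get("body", "").strip()
        if not body:
            continue  # title-only or image-only post
        key = normalize(body)
        if key in seen:
            continue  # exact match with an already included post
        seen.add(key)
        kept.append(post)
    return kept


# Toy data: the second post is a cross-post of the first, the third has no text.
posts = [
    {"id": "a1", "body": "Hot showers help me."},
    {"id": "b2", "body": "hot showers help me"},
    {"id": "c3", "body": ""},
]
unique_posts = deduplicate(posts)
```

Because the comparison is an exact match on the normalized text, reposts that were edited even slightly would survive this filter; catching those would need a fuzzy similarity measure, which the text does not describe.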

Theme identification
Natural language processing was used to extract posts from the retrieved sample that were about cannabinoid hyperemesis syndrome. We used the abbreviated term "CHS" and also the expanded form "cannabinoid hyperemesis syndrome" to search for posts. Due to the occasional misspelling of the expanded form of cannabinoid hyperemesis syndrome by Reddit subscribers, we used an automatic lexical variant generator to generate common misspellings for searching (e.g., canabinoid, cannabinoide, hypermesis) [16]. Posts that mentioned at least one expression/variant were extracted. Reddit also allows subscribers to upvote or downvote posts indicating their usefulness or other positive or negative traits. As a result, posts that are deemed useful to many subscribers or provide important information accumulate more upvotes than downvotes. To select posts for manual analysis, we first ranked them by their "net score," which is the difference between their upvotes and downvotes. We set the net score threshold for inclusion at 30 after reviewing the distribution of net scores.
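As a rough illustration of the filtering and ranking just described, the sketch below matches posts against a small expression list and keeps those whose net score exceeds the threshold. The expression list is a hand-written stand-in for the automatically generated variants [16], the field names `text`, `ups`, and `downs` are hypothetical, and treating the threshold as strictly greater than 30 is an assumption.

```python
import re

# Hypothetical expression list; the study generated misspellings automatically.
EXPRESSIONS = [
    "chs",
    "cannabinoid hyperemesis syndrome",
    "canabinoid",
    "cannabinoide",
    "hypermesis",
]
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(e) for e in EXPRESSIONS) + r")\b",
    re.IGNORECASE,
)


def net_score(post):
    """Net score is the difference between upvotes and downvotes."""
    return post["ups"] - post["downs"]


def select_for_manual_review(posts, threshold=30):
    """Keep posts mentioning an expression, ranked by net score, above threshold."""
    matched = [p for p in posts if PATTERN.search(p["text"])]
    ranked = sorted(matched, key=net_score, reverse=True)
    return [p for p in ranked if net_score(p) > threshold]


posts = [
    {"id": "p1", "text": "Diagnosed with CHS last week", "ups": 120, "downs": 5},
    {"id": "p2", "text": "Could this be canabinoid hypermesis?", "ups": 40, "downs": 2},
    {"id": "p3", "text": "Best strains for sleep", "ups": 300, "downs": 10},
    {"id": "p4", "text": "chs again...", "ups": 12, "downs": 1},
]
selected = select_for_manual_review(posts)
```

In this toy data, `p3` is dropped because it mentions no expression and `p4` because its net score falls below the threshold.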
Following the score-based ranking of posts, two experts RSW and JP (Rachel S. Wightman and Jeanmarie Perrone) performed manual, thematic categorization of a sample of posts. Consistent with past works on similar thematic analyses, the experts first reviewed the posts and identified topics. These topics reflected the specific information represented by the posts, such as "Outcomes/Health Consequences" and "Doubts about this being a real disease." In similar text coding problems, annotation guidelines are often developed, and then agreements between multiple annotators are measured using metrics such as Cohen's kappa [17]. These guidelines document the rules that human coders use for the categorization of posts. In this analysis, however, it was not possible to prepare an annotation guideline a priori due to the novelty of the subject and the lack of pre-existing knowledge of likely topic categories. Due to these factors, it was decided that the two experts would resolve disagreements via weekly discussions rather than attempt to compare their levels of agreement on numerically unbounded topic categories.
The initial annotation guidelines were used to code a larger set of posts. The number of posts that could be manually coded was dictated by time constraints. Following the initial round of coding, the topics associated with the posts were grouped into coarse-grained themes. The full distributions of themes and subthemes are presented in the Results section.

Analysis of topics
Among the topics discussed by Reddit users, two were of high interest for this specific analysis: triggers of cannabinoid hyperemesis syndrome and therapeutic strategies. We applied natural language processing in a recursive manner to identify lexical expressions associated with these topics. For example, one common trigger for cannabinoid hyperemesis syndrome was "food." We discovered these common expressions during our initial thematic analysis. We then used these expressions (e.g., "food" for trigger) to automatically identify, via simple natural language processing-driven searches, posts from all collected ones that mentioned the expressions. Since it is common for subscribers to mention multiple triggers or therapeutic strategies together, we manually analyzed these posts to identify additional relevant expressions for triggers and therapies. An additional iteration of searching for the new expressions discovered in the previous iteration was then performed to identify new posts, and we similarly manually reviewed them to identify more relevant expressions. We conducted a total of three iterations, after which it was determined that there were either no new expressions or very few expressions being found, thus requiring no further iterations. We grouped expressions that had the same or similar meanings and computed their frequencies in the sample.
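The iterative search can be pictured with the following sketch. It is a simplified illustration, not the procedure's actual implementation: the manual review step is replaced by a hypothetical lookup table (`MANUAL_CODES`), and the post layout and function names are assumptions.

```python
import re
from collections import Counter


def mentions(text, expression):
    """True if the expression occurs in the text as a whole word or phrase."""
    pattern = r"\b" + re.escape(expression) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def expand_expressions(posts, seeds, review, max_iterations=3):
    """Search for known expressions, review the hits, and repeat with new ones."""
    known = set(seeds)
    frontier = set(seeds)  # expressions not yet searched for
    reviewed_ids = set()
    for _ in range(max_iterations):
        if not frontier:
            break  # no new expressions were found in the last round
        hits = [
            p for p in posts
            if p["id"] not in reviewed_ids
            and any(mentions(p["text"], e) for e in frontier)
        ]
        reviewed_ids.update(p["id"] for p in hits)
        new = review(hits) - known
        known |= new
        frontier = new
    return known, reviewed_ids


# Hypothetical stand-in for the manual review of each retrieved post.
MANUAL_CODES = {
    1: {"food", "coffee"},
    2: {"coffee", "stress"},
    3: {"stress", "alcohol"},
}


def review(hits):
    found = set()
    for post in hits:
        found |= MANUAL_CODES.get(post["id"], set())
    return found


posts = [
    {"id": 1, "text": "Food always sets it off, and coffee too"},
    {"id": 2, "text": "Coffee and stress are my triggers"},
    {"id": 3, "text": "Stress at work, then alcohol"},
    {"id": 4, "text": "No issues lately"},
]
expressions, reviewed = expand_expressions(posts, {"food"}, review)

# Number of posts mentioning each expression.
frequencies = Counter(
    e for p in posts for e in expressions if mentions(p["text"], e)
)
```

Starting from the single seed "food", each round surfaces posts that mention the expressions found in the previous round, which is why co-mentioned triggers such as "coffee" and "stress" are reached even though they were not in the seed list.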

Machine learning for identifying the distribution of themes
The manually assigned codes from the thematic analysis were used to train and evaluate several supervised machine learning methods. The objectives were (i) to identify the machine learning model that has the best agreement with the human experts for automatically classifying the themes of cannabinoid hyperemesis syndrome posts and (ii) to use the best-performing model to automatically classify all the posts that were collected but could not be reviewed because of the time-consuming nature of manual annotation.
First, all manually coded data were divided into training (60%), validation (20%), and test (20%) sets. The training and validation sets were used for optimizing the machine learning models, while the test set was held out for evaluation. We experimented with three machine learning models: naïve Bayes (baseline), Random Forest, and RoBERTa [18]. Naïve Bayes is typically used as a baseline system; Random Forest classifiers have been state-of-the-art for text classification for many years, and transformer-based classifiers such as RoBERTa have taken over as the leading method for text classification relatively recently. One disadvantage of RoBERTa, compared to Random Forest, is that it has a token length limitation for documents being classified, and many Reddit posts exceed this length limit. Further details about the classifiers and the text processing performed are provided in the supplementary material. Class-specific and overall (macro-averaged) F1-scores were computed over the test set for all the classifiers and compared. Following the identification of the best-performing classifier, it was used to classify all the unlabeled posts we collected. We used this to obtain an estimation of the distribution of themes across all posts and also their distributions over time.
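To make the evaluation setup concrete, the sketch below shows one way to produce a 60/20/20 split and to compute class-specific and macro-averaged F1-scores using only the Python standard library. It is an illustration under stated assumptions: the shuffling seed, the function names, and the toy labels are hypothetical, and the study's own implementation is described in its supplementary material.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle and split into training (60%), validation (20%), and test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(0.6 * len(items))
    n_val = int(0.2 * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder, roughly 20%
    return train, val, test


def f1_per_class(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for every label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the class-specific F1-scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


train, val, test = split_60_20_20(range(10))

# Toy labels only; these are not results from the study.
y_true = ["science", "science", "timing", "treatment"]
y_pred = ["science", "timing", "timing", "treatment"]
class_scores = f1_per_class(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

Macro-averaging gives every theme equal weight regardless of how many posts it contains, so a classifier cannot score well simply by favoring the most common theme.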

Results
A total of 9691 posts were collected from the chosen subreddits via the Application Programming Interface. The collected data ranged from August 2018 to November 2022. Four thousand seven hundred sixty-three posts matched the search expressions indicating cannabinoid hyperemesis syndrome. After removing posts that met the exclusion criteria (e.g., duplicate posts), a total of 2683 remained. Sixty-nine posts met the net score threshold (i.e., higher than 30). The manual thematic analysis resulted in a total of five themes and 18 subthemes. The distribution of the themes is shown in Table 1. Eighteen posts could not be classified into any of the themes. Table 2 presents the performances of the three classifiers on the annotated test set. The Random Forest classifier produced the best performance overall. Table 3 presents the distribution of themes over all the posts that were not manually coded. Note that the counts do not add up to 2683 because some of the posts were classified as N/A (not available).

Analysis of topics
A total of 447 posts were reviewed in all the iterations for identifying triggers. The most commonly mentioned triggers for cannabinoid hyperemesis syndrome episodes were: food and drink (n = 62), cannabinoids (n = 45), mental health (e.g., stress, anxiety) (n = 27), alcohol (n = 22), and consumption behaviors (e.g., vaping) (n = 12). Within the food and drink category, the general mention of "food" was the most frequently noted trigger for cannabinoid hyperemesis syndrome symptoms (n = 34); more specific food or drink items discussed as triggers included: coffee, chocolate, greasy food or fast food, meat, sauces (e.g., tomato sauce, cream sauce), black pepper, spicy food, kava, black tea, cinnamon, carrot, cheese, and omega-3.

Health impact theme
Most posts classified within the health impact theme described physical symptoms and medical outcomes (e.g., dehydration, hypokalemia, kidney injury), but others centered on the social and occupational impacts of cannabinoid hyperemesis syndrome. Two such topics discussed included friend and family-related consequences of cannabinoid hyperemesis syndrome and impacts on work and school. A series of sample excerpts from posts on these topics are provided in Table 4.

Discussion
Uncertainty about the root cause of cannabinoid hyperemesis syndrome was reflected in the high volume of posts around the theme "science," including many posts that offered explanations for drivers of cannabinoid hyperemesis syndrome or questioned the validity of cannabinoid hyperemesis syndrome as a disease. Given the lack of clarity about the disease mechanism or understanding of why cannabinoid hyperemesis syndrome occurs in some individuals with chronic cannabis use but not others, it makes sense that individuals turn to online communities for answers. Most patients with cannabis-related cyclic vomiting receive a preliminary diagnosis of cannabinoid hyperemesis syndrome from the ED, as this is the most common location where patients present for acute treatment to control active symptoms [2,5]. Due to limited outpatient treatment resources and long wait times to get specialty care (e.g., mental health care, gastroenterology), patients often have unanswered questions, including uncertainty about the diagnosis of cannabinoid hyperemesis syndrome and the disease process itself [19,20]. Knowledge of the time course from first cannabis use or daily use to cyclic vomiting symptom onset, of episode and disease progression, and of the time from cessation of use to the resolution of symptoms is extremely limited [6,21]. Longitudinal follow-up data on patients with suspected cannabinoid hyperemesis syndrome after the ED visit are urgently needed to better understand the episode and disease progression and to better provide guidance to patients with chronic cannabis use experiencing cyclic vomiting. Current published follow-up literature is limited to case reports and series, most of which only follow patients for a few days to weeks after a symptomatic episode [10]. This lack of data and understanding of the timing of cannabinoid hyperemesis syndrome symptoms and disease progression is reflected in the high frequency of Reddit posts discussing these themes and questions.
The timing information reported (e.g., time from onset of cannabis use to symptoms, duration of symptoms, lifetime years of cannabis use) took different forms and often lacked a clear time reference point. Further analysis of the timing theme data requires more advanced natural language processing methods for extracting temporal information, which was not feasible for this study but should be explored in future work.
Topic posts on cannabinoid hyperemesis syndrome triggers highlighted multiple factors that, to our knowledge, have not been previously explored in the scientific literature. In addition to mental health triggers (e.g., anxiety), food and beverage categories, including alcohol, were frequently mentioned in trigger-related posts. Many, but not all, of the food items listed in the trigger category were highly acidic foods (e.g., spicy food, greasy food, coffee, and black tea). However, the general "food" category was broad, making it difficult to derive strong associations of cyclic vomiting episodes with any specific food item. Frequent alcohol and cannabis co-use has been previously documented on a population level, but the impact of these factors on cannabinoid hyperemesis syndrome should be examined in future work.
Treatment data on cannabinoid hyperemesis syndrome is largely limited to acute symptom management in healthcare settings. Treatment content on Reddit reflected the medical literature, including mentions of hot water bathing, topical capsaicin cream, and dopamine antagonists as frequent treatment approaches for acute episodes [22][23][24]. However, more general treatment strategies, including hydration, gastrointestinal medications to relieve spasms and reduce acid reflux, probiotics, and mild food items (e.g., crackers, soup), were also frequently mentioned. Notably, behavioral therapies (i.e., meditation and yoga) were mentioned in multiple posts. These user-generated posts provide novel insights into individual behavior, including the patient experience outside of the ED and the potential utility of holistic treatment approaches. Health impact-related posts centered not only on physical symptoms and medical outcomes but also noted the social and occupational impacts of cannabinoid hyperemesis syndrome, which are infrequently mentioned in the scientific literature and need to be considered when offering support and treatment referrals for patients.
This study has several limitations. We were not able to access demographics or geography of individuals who posted on Reddit. The user base of social media platforms such as Reddit may not be representative of the population suffering from cannabinoid hyperemesis syndrome. For example, prior reports have documented that people who use Reddit are majority male, younger, and non-Hispanic white [25]. However, it is likely that the base of people using Reddit has expanded since that report. Our search strategy could have led to missed reports, and individuals whose cannabinoid hyperemesis syndrome experiences are most severe or who have negative health outcomes could be more likely to post about it on social media. Finally, only 69 posts were manually reviewed for the thematic analysis and training of machine learning models. This was due to the relatively long length of Reddit posts. While we applied a net vote-based filtering to ensure that the manually reviewed posts were highly informative, our findings and the performances of the trained machine learning models were limited by this number. Despite these limitations, this analysis was a first step to describing cannabinoid hyperemesis syndrome in Reddit community posts and documenting general themes. Future work will be needed to provide more detailed content evaluations to further understand cannabinoid hyperemesis syndrome.

Conclusions
Cannabinoid hyperemesis syndrome subreddit posts provide a valuable source of community discussion and experience reports of individuals with cannabinoid hyperemesis syndrome and should be taken into consideration in the development of future patient-centered treatment approaches and investigations. Reddit subscribers frequently discuss well-known triggers and therapies, but also mention themes that are rarely captured in the medical literature. Further longitudinal studies in patients with cannabinoid hyperemesis syndrome are needed to corroborate these findings.

Table 4. Health impacts theme sample excerpts from CHS Reddit posts.
Friends and Family Impacts
… i have missed moments with family and friends, i have stopped resting properly for months and i feel that this accumulated fatigue is beginning to affect my day to day. i really like how i feel when i'm high but it's really not worth it. i want to start a new stage in my life, be more responsible, dedicate myself to work, family, friends, good times.
… i can have people stay over or stay at friends/family house without constant fear of getting sick or actually getting sick and having to drive home throwing up the whole way and the list goes on !!
… it hasn't been easy and it sucks knowing the pain and worry i've put my family through. chs is very real. don't be a dumbass like me
Work and School Impacts
… and i'm finally, finally quiting for good because i can't keep a job or take care of my son when i have these episodes. also, it's the *only* times i've ever wanted to … die.
… i have a full time job. this most recent bout of chs has put me out of work for over a month and it has absolutely ruined my life. i miss my job so much
… it was a monkey, rather a gorilla, in my gut, that kept me from a normal life, it cost me jobs, it cost me time, it cost me water and power bill money.
… had to take a month brake from school sadly to get everything sorted out after being hospitalized for chs.