Opinion Mining of Erowid’s Experience Reports on LSD and Psilocybin-Containing Mushrooms

Riccardo Lora; Erica Marletta; Michele Vezzaro
2025-01-01

Abstract

Background: Psychedelics are gaining attention for their therapeutic potential in modern and personalized medicine. Online forums such as Erowid provide valuable user insights, but analyses of these experiences using natural language processing (NLP) remain scarce.

Objective: This study uses NLP, including sentiment and lexicon analysis, to examine user-generated experience reports on psilocybin-containing mushrooms and LSD from the Erowid forum.

Methods: Data from 2188 Erowid users (1161 psilocybin mushrooms and 1027 LSD) were collected via automated web scraping with XPath, CSS selectors, and Selenium WebDriver. The dataset included report titles, substances, and demographics. Sentiment analysis used the BERT, RoBERTa, and VADER models. Preprocessing involved tokenization, lemmatization, part-of-speech tagging, and stop-word filtering. Lexicon analysis identified themes through recurring n-grams, visualized using Python.

Results: User demographics revealed comparable ages for psilocybin mushroom users (23.8 ± 0.9 years) and LSD users (20.0 ± 0.6 years), with a predominance of male users. The BERT model predominantly labeled experiences as negative (unfavorable), particularly for mushroom users (p = 0.001). VADER indicated more positive experiences for mushroom users (p < 0.001), while RoBERTa mainly classified experiences as negative or neutral. Significant gender differences were found only with VADER, where more male users expressed positive opinions about psilocybin mushrooms (74.09% versus 65.52%, p < 0.021). The VADER model yielded more polarized results, whereas RoBERTa's cautious classifications indicate its suitability for analyzing lengthy and complex psychedelic reports. Further, RoBERTa outperformed the other transformer-based models, achieving the highest accuracy. Lexicon analysis revealed emotional, sensory, and temporal themes, with psilocybin reports emphasizing introspection and time dilation, while LSD reports highlighted memory issues and cognitive disorientation.

Conclusions: Sentiment analysis showed that VADER produced more polarized results, while RoBERTa offered cautious classifications with the highest accuracy. Lexicon analysis revealed shared themes, with mushroom reports focusing on introspection and perceived time dilation, while LSD reports emphasized cognitive disturbances. This study highlights the value of these analyses in understanding psychedelic experiences, informing harm reduction, and guiding policy-making.
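The n-gram step of the lexicon analysis described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the stop-word list, the function names (`tokenize`, `ngrams`, `top_ngrams`), and the sample sentences are all assumptions made for illustration, and lemmatization and part-of-speech tagging are omitted to keep the example dependency-free.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the study's actual list is not given).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "i", "was", "it", "in", "my", "that"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the list of consecutive n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(reports, n=2, k=5):
    """Count n-grams across all reports after stop-word filtering."""
    counts = Counter()
    for report in reports:
        tokens = [t for t in tokenize(report) if t not in STOP_WORDS]
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)


# Made-up sentences for demonstration only, not Erowid data.
reports = [
    "Time slowed down and the colors became vivid.",
    "I felt time slowed down, then time slowed down again.",
]
print(top_ngrams(reports, n=2, k=3))
```

Recurring n-grams ranked this way would then be grouped into themes (for example, temporal or sensory) by inspection; that grouping is a manual, interpretive step this sketch does not attempt.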
Keywords: pharmacoepidemiology; psychedelics; NLP

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1179469