Benchmarking large language models for target-based financial sentiment and stock return

Iftikhar Muhammad; Marco Rospocher
2026-01-01

Abstract

Sentiment analysis is vital for understanding market dynamics and formulating informed investment strategies, especially in volatile financial conditions. This study advances target-based financial sentiment analysis (TBFSA) by rigorously evaluating the efficacy of large language models (LLMs) in zero-shot and few-shot learning settings. We compare cutting-edge generative LLMs, including ChatGPT-4o, ChatGPT-4, ChatGPT-o1, DeepSeek-R1, Llama-3-8B, Gemma-2-9B, and Gemma-2-27B, with conventional lexicon-based tools (VADER and TextBlob) and discriminative transformer-based models (FinBERT, FinBERT-Tone, DistilFinRoBERTa, and Deberta-v3-base-absa-v1.1). Our analysis uses a newly curated dataset of 1,162 manually annotated Bloomberg news articles designed explicitly for TBFSA. Due to copyright constraints, only the article URLs are publicly released; the full news content is accessible through a Bloomberg Terminal. The findings indicate that LLMs, particularly DeepSeek-R1 and the ChatGPT variants (especially ChatGPT-o1), outperform lexicon-based approaches and discriminative transformer-based models across all evaluation metrics, without requiring additional training or task-specific fine-tuning. In addition, these models achieve the highest directional accuracy and statistically significant correlations with contemporaneous short-term market returns within the studied sample, demonstrating their ability to capture sentiment signals that align with observed market movements. The study establishes generative LLMs as a scalable and cost-effective method for target-level sentiment analysis, reducing the need for expensive, rigorous fine-tuning. The research provides insights that enable institutions to use unstructured textual data effectively for sentiment monitoring, market analysis, and risk assessment.
Keywords: Target-based financial sentiment analysis, Generative large language models, Discriminative transformer-based models, Lexicon-based methods, Stock returns
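
The abstract reports directional accuracy and correlation between predicted sentiment and contemporaneous short-term returns, but this record does not give the exact definitions used in the study. The following is a minimal sketch under stated assumptions: sentiment labels are mapped to -1, 0, and +1; directional accuracy is taken as the share of non-neutral predictions whose sign matches the sign of the return; and the correlation is a plain Pearson coefficient. The names `LABEL_SCORE`, `directional_accuracy`, and `pearson` are hypothetical, and the data are made up for illustration.

```python
from math import sqrt

# Hypothetical mapping from sentiment labels to numeric scores (an assumption,
# not taken from the study).
LABEL_SCORE = {"negative": -1, "neutral": 0, "positive": 1}


def directional_accuracy(labels, returns):
    """Share of non-neutral predictions whose sign matches the return's sign."""
    hits = 0
    total = 0
    for label, ret in zip(labels, returns):
        score = LABEL_SCORE[label]
        if score == 0 or ret == 0:
            continue  # no direction to compare, so the pair is skipped
        total += 1
        hits += (score > 0) == (ret > 0)
    return hits / total if total else float("nan")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data for illustration only; these values do not come from the study.
labels = ["positive", "negative", "neutral", "positive", "negative"]
returns = [0.012, -0.008, 0.001, -0.004, -0.015]

scores = [LABEL_SCORE[label] for label in labels]
da = directional_accuracy(labels, returns)
corr = pearson(scores, returns)
print(f"directional accuracy: {da:.2f}, Pearson r: {corr:.2f}")
```

Skipping neutral predictions and zero returns is one possible design choice; a study could equally count them as misses, which would lower the reported figure. Testing whether a correlation is statistically significant would need an additional step that this sketch leaves out.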

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1197307