Comparing ChatGPT to Human Raters and Sentiment Analysis Tools for German Children’s Literature

Simone Rebora; Gerhard Lauer
2023-01-01

Abstract

In this paper, we apply the ChatGPT Large Language Model (gpt-3.5-turbo) to the 4books dataset, a German-language collection of children's and young adult novels comprising a total of 22,860 sentences annotated for valence by 80 human raters. We test whether ChatGPT can (a) match the behaviour of human raters and/or (b) outperform state-of-the-art sentiment analysis tools. Results show that, while inter-rater agreement with human readers is low (regardless of the inclusion or exclusion of context), efficiency scores are comparable to those of the most advanced sentiment analysis tools.
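The abstract's central measure is inter-rater agreement between ChatGPT and human annotators on sentence-level valence. As a minimal sketch of how such agreement can be quantified, the snippet below computes Cohen's kappa (chance-corrected agreement between two raters) on hypothetical valence labels; the rater names and label values are illustrative only, not the paper's data or its specific agreement metric.

```python
# Minimal sketch: Cohen's kappa for chance-corrected agreement between two
# raters on sentence-level valence labels. All rating data below is
# hypothetical; the paper's own agreement figures come from the 4books
# annotations and may use a different coefficient.
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(labels_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a | freq_b)
    return (observed - expected) / (1 - expected)

# Hypothetical valence judgments for six sentences (neg/neu/pos).
rater_a = ["pos", "pos", "neg", "neu", "pos", "neg"]
rater_b = ["pos", "neg", "neg", "neu", "pos", "pos"]
kappa = cohen_kappa(rater_a, rater_b)
```

On this toy data, observed agreement is 4/6 while chance agreement is about 0.39, giving a kappa of roughly 0.45; low agreement, as reported in the abstract, would correspond to kappa values well below this.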
Large Language Models, ChatGPT, 4books dataset, sentiment analysis, inter-rater agreement

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1115888