RESEARCH PRODUCTS CATALOG

Dataset for explicit lyrics detection

Marco Rospocher

2021-01-01

Abstract

This annotated dataset consists of 807,707 English-language song lyrics, each labeled according to whether it contains explicit content, i.e., content unsuitable for children. The dataset, built from Spotify and LyricWiki content, was developed to support the training and evaluation of automatic tools for detecting explicit song lyrics. The construction of the dataset is described in the associated publication (see Section 4.1): Marco Rospocher. Explicit song lyrics detection with subword-enriched word embeddings. Expert Systems with Applications, Volume 163, January 2021, 113749. DOI: 10.1016/j.eswa.2020.113749

Year: 2021

Keywords: machine learning, explicit content, text classification

Appears in types: 07.10 Database

Files in this product:

There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1059799