Conversationally-inspired stylometric features for authorship attribution in instant messaging
Cristani, Marco; Roffo, Giorgio; Segalin, Cristina; Bazzani, Loris; Murino, Vittorio
2012-01-01
Abstract
Authorship attribution (AA) aims at recognizing the author of a piece of text. The recent spread of social media has confronted AA with a novel challenge: the analysis of Instant Messaging (IM) conversations. IM dialogs are written texts that share many aspects with spoken communication, but this parallelism has not been taken into account so far. In this paper, we exploit this cross-media similarity, presenting novel stylometric features that encode turn-taking conversational aspects and improve the overall AA rate. In addition, we modify the classical stylometric cues by calculating statistics over turns instead of over whole conversations. Experiments on a dyadic corpus of 77 different users report 89.52% accuracy, in terms of the normalized area under the curve of cumulative match characteristic curves.
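The evaluation measure named in the abstract can be illustrated with a minimal sketch. It assumes that each probe text yields the 1-based rank at which its true author appears in the ranked candidate list, and it takes the normalized area under the curve to be the mean of the cumulative match characteristic (CMC) values over all ranks. The abstract does not state the exact normalization, so this is an assumption; the function names `cmc_curve` and `normalized_auc` are hypothetical.

```python
import numpy as np


def cmc_curve(true_ranks, n_candidates):
    """Return the CMC curve: entry k-1 is the fraction of probes whose
    true author is found within the top k candidates.

    true_ranks: 1-based rank of the correct author for each probe.
    n_candidates: number of candidate authors (length of the curve).
    """
    ranks = np.asarray(true_ranks)
    return np.array([(ranks <= k).mean() for k in range(1, n_candidates + 1)])


def normalized_auc(cmc):
    """Assumed normalization: area under the step-wise CMC curve divided
    by the number of ranks, i.e. the mean of the CMC values."""
    return float(np.mean(cmc))


# Toy example with four probes and four candidate authors (made-up ranks).
toy_cmc = cmc_curve([1, 1, 2, 4], n_candidates=4)
toy_nauc = normalized_auc(toy_cmc)
```

Under this definition a system that always ranks the true author first scores 1.0, and the score drops as correct matches move to lower positions in the ranking.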