The significance of user-generated content as a source for business intelligence and analytics has been on the rise since the inception of electronic commerce platforms and has been solidified in the wake of the pandemic due to the prominence of electronic commerce as a sales channel. The prevailing approach to harnessing unstructured data involves the utilization of Artificial Intelligence; however, there exist simpler alternatives capable of yielding valuable information. This article introduces a methodology grounded in information theory to quantify the semantic disparity between the consumer community and product descriptions. This disparity can result in potential misunderstandings in the dialogue among consumers, and incidental costs in the dialogue between consumers and vendors. One plausible explanation for this disparity is that the terminology employed by consumers may possess different meanings compared to that utilized by product description writers. Our methodology employs large corpora of consumer reviews and product descriptions to quantify this semantic disparity across multiple electronic commerce domains through the implementation of random word exchanges and compression. Furthermore, we utilize neural word embeddings to identify specific words exhibiting the greatest semantic drift between reviews and descriptions, thereby providing lexical examples of these gaps. Our findings indicate that lower levels of lexical-semantic gap are associated with better consumer satisfaction.

Measuring semantic gap between user-generated content and product descriptions through compression comparison in e-commerce.

Daniel Eduardo Bejarano Bejarano;
2023-01-01

Abstract

The significance of user-generated content as a source for business intelligence and analytics has been on the rise since the inception of electronic commerce platforms and has been solidified in the wake of the pandemic due to the prominence of electronic commerce as a sales channel. The prevailing approach to harnessing unstructured data involves the utilization of Artificial Intelligence; however, there exist simpler alternatives capable of yielding valuable information. This article introduces a methodology grounded in information theory to quantify the semantic disparity between the consumer community and product descriptions. This disparity can result in potential misunderstandings in the dialogue among consumers, and incidental costs in the dialogue between consumers and vendors. One plausible explanation for this disparity is that the terminology employed by consumers may possess different meanings compared to that utilized by product description writers. Our methodology employs large corpora of consumer reviews and product descriptions to quantify this semantic disparity across multiple electronic commerce domains through the implementation of random word exchanges and compression. Furthermore, we utilize neural word embeddings to identify specific words exhibiting the greatest semantic drift between reviews and descriptions, thereby providing lexical examples of these gaps. Our findings indicate that lower levels of lexical-semantic gap are associated with better consumer satisfaction.
2023
Lexical-semantic gap, Customer-vendor communication, Semantic drift, Social commerce language, Product reviews ambiguity, Product description ambiguity
File in questo prodotto:
File Dimensione Formato  
1-s2.0-S0020025523005224-main.pdf

accesso aperto

Licenza: Creative commons
Dimensione 846.58 kB
Formato Adobe PDF
846.58 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/1123246
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact