Cnosso, a novel method for business document automation based on open information extraction

Claudio Tomazzoli
2024-01-01

Abstract

The state of the art in automated processing of unstructured business documents has evolved from manual labor to advanced AI systems in a matter of decades. Such systems rely on learning techniques, rule or clause sets, and neural models, used alone or in combination, to perform the extraction. For example, rule-based processes operate on the perceived layout or position of the information, whereas model-based frameworks adopt a semantic, and often uninspectable, approach. Verb-Based Semantic Role Labeling (VBSRL) is a novel system, presented in a previous paper, that uses a hybrid foundation to inform the extraction phase through a set of rules modeling natural language. We propose a new VBSRL-based document processing method, supported by innovative architectural choices, which has been implemented for the Italian language and evaluated experimentally with promising results. Even in its infancy, the first implementation of this system outperforms comparable IE solutions, obtaining an aggregate average F-measure of nearly 79%.
Keywords: Verb-Based Semantic Role Labeling, NLP, Information Extraction, NER, Natural language analysis, Document automation
Files in this item:
File: doi_j_eswa_2023_123038(ExpertSystemsWithApplications).pdf
Access: authorized users only
Description: Article in PDF format
Type: Publisher's version
License: Publisher's copyright
Size: 1.15 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1117129