Supervised Document Classification based upon domain-specific Term Taxonomies

BELLOMI, Francesco; CRISTANI, Matteo
2006-01-01

Abstract

Document classification is an active topic in recent terminological research, particularly in its technological strands. Sophisticated techniques have been developed that classify documents by recognising specific linguistic features, such as particular terms or occurrences of phrases. Few real document classification applications, however, use natural language processing techniques that combine statistical analysis with human supervision, such that the system fully automates the classification process while the definition of the taxonomy remains an entirely human-centred activity. In this paper we focus on an application with these features and then introduce a methodology that makes use of it. The fundamental argument in favour of a specific methodology is that the analysis leading to the deployment of the term taxonomy can be seen as ontology construction; we also discuss this aspect as a general motivation.
document classification
taxonomy
ontology
statistical natural language processing

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/21006