The EU-ADR corpus: Annotated drugs, diseases, targets, and their relationships

Trifiro G.;
2012-01-01

Abstract

Corpora annotated with specific entities and relationships are essential for training and evaluating text-mining systems that extract structured information from a large corpus. In this paper we describe an approach in which a named-entity recognition system produces an initial annotation and annotators revise it using a web-based interface. The agreement figures show that inter-annotator agreement is much better than agreement with the system-provided annotations. The corpus has been annotated for drugs, disorders, genes, and their inter-relationships. For each of the drug-disorder, drug-target, and target-disorder relations, three experts annotated a set of 100 abstracts. These annotated relationships will be used to train and evaluate text-mining software that captures such relationships in text. © 2012 Elsevier Inc. All rights reserved.
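The comparison between inter-annotator agreement and agreement with the system can be illustrated with a small sketch. The abstract does not state which agreement measure was used, so the Dice overlap below (equivalent to an F-score when one annotation set is treated as the reference) is an assumption, and the annotation tuples, expert labels, and function names are hypothetical toy values rather than data from the corpus.

```python
from itertools import combinations


def pairwise_agreement(a, b):
    """Dice overlap between two sets of relation annotations (assumed measure)."""
    if not a and not b:
        return 1.0  # two empty annotation sets agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def mean_agreement(pairs):
    """Average the pairwise agreement over an iterable of (set, set) pairs."""
    scores = [pairwise_agreement(a, b) for a, b in pairs]
    return sum(scores) / len(scores)


# Hypothetical drug-disorder relations: (abstract_id, drug, disorder).
system = {
    (1, "aspirin", "bleeding"),
    (1, "aspirin", "headache"),
    (2, "statin", "myopathy"),
    (2, "statin", "rash"),
}
# Hypothetical revisions of the system output by three experts.
experts = {
    "A": {(1, "aspirin", "bleeding"), (2, "statin", "myopathy")},
    "B": {(1, "aspirin", "bleeding"), (2, "statin", "myopathy")},
    "C": {(1, "aspirin", "bleeding"), (2, "statin", "myopathy"), (2, "statin", "rash")},
}

# Agreement among the experts, averaged over all expert pairs.
inter_annotator = mean_agreement(combinations(experts.values(), 2))
# Agreement of each expert with the initial system annotation.
with_system = mean_agreement((ann, system) for ann in experts.values())

print(f"inter-annotator: {inter_annotator:.3f}")
print(f"with system:     {with_system:.3f}")
```

In this toy setup the experts agree with each other more closely than with the system output, which mirrors the pattern the abstract reports. A chance-corrected measure such as Cohen's kappa could be substituted in `pairwise_agreement` without changing the rest of the sketch.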
Text mining, Corpus development, Machine learning, Adverse drug reactions
Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1039463
Citations
  • PMC 30
  • Scopus 98
  • ISI 77