Biomedical event extraction has been a long-researched topic exhibiting pretty hard-to-tackle characteristics preventing existing solutions to exit the labs. The language understanding method described here represents the first option as a viable industrial solution to enhance the traditional pair-wise relation identification. Events denote multiple, higher-order, associations among two or more interacting bio-entities describing, for example, changes on the state or location of the involved entities. The complexity of event extraction typically calls for multiple classifiers for recognizing event triggers and arguments. As opposed to previous work, we followed a systems thinking approach to model all the sub-tasks in an end-to-end fashion, leading to a faster, joint model which also mitigates error propagation of locally-optimized classifier pipelines. We recast the task as a sequence labeling problem, proposing a novel multi-task deep neural network model with a BERT encoder pre-trained on biomedical texts, and soft-max and a novel multi-label classifier as decoder.
BeeSL: Towards industry-level biomedical event extraction
R. Lombardo
;
2020-01-01
Abstract
Biomedical event extraction has been a long-researched topic exhibiting pretty hard-to-tackle characteristics preventing existing solutions to exit the labs. The language understanding method described here represents the first option as a viable industrial solution to enhance the traditional pair-wise relation identification. Events denote multiple, higher-order, associations among two or more interacting bio-entities describing, for example, changes on the state or location of the involved entities. The complexity of event extraction typically calls for multiple classifiers for recognizing event triggers and arguments. As opposed to previous work, we followed a systems thinking approach to model all the sub-tasks in an end-to-end fashion, leading to a faster, joint model which also mitigates error propagation of locally-optimized classifier pipelines. We recast the task as a sequence labeling problem, proposing a novel multi-task deep neural network model with a BERT encoder pre-trained on biomedical texts, and soft-max and a novel multi-label classifier as decoder.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.