The task of Named Entity Recognition (NER) is aimed at identifying named entities in a given text and classifying them into pre-defined domain entity types such as persons, organizations, locations. Most of the existing NER systems make use of generic entity type classification schemas, however, the comparison and integration of (more or less) different entity types among different NER systems is a complex problem even for human experts. In this paper, we propose a supervised approach called L2AWE (Learning To Adapt with Word Embeddings) which aims at adapting a NER system trained on a source classification schema to a given target one. In particular, we validate the hypothesis that the embedding representation of named entities can improve the semantic meaning of the feature space used to perform the adaptation from a source to a target domain. The results obtained on benchmark datasets of informal text show that L2AWE not only outperforms several state of the art models, but it is also able to tackle errors and uncertainties given by NER systems.

LearningToAdapt with word embeddings: domain adaptation of named entity recognition systems

Nozza, Debora
;
2021-01-01

Abstract

The task of Named Entity Recognition (NER) is aimed at identifying named entities in a given text and classifying them into pre-defined domain entity types such as persons, organizations, locations. Most of the existing NER systems make use of generic entity type classification schemas, however, the comparison and integration of (more or less) different entity types among different NER systems is a complex problem even for human experts. In this paper, we propose a supervised approach called L2AWE (Learning To Adapt with Word Embeddings) which aims at adapting a NER system trained on a source classification schema to a given target one. In particular, we validate the hypothesis that the embedding representation of named entities can improve the semantic meaning of the feature space used to perform the adaptation from a source to a target domain. The results obtained on benchmark datasets of informal text show that L2AWE not only outperforms several state of the art models, but it is also able to tackle errors and uncertainties given by NER systems.
2021
Nozza, Debora; Manchanda, Pikakshi; Fersini, Elisabetta; Palmonari, Matteo; Messina, Enza
File in questo prodotto:
File Dimensione Formato  
1-s2.0-S0306457321000455-main.pdf

non disponibili

Descrizione: article
Tipologia: Pdf editoriale (Publisher's layout)
Licenza: Copyright dell'editore
Dimensione 1.17 MB
Formato Adobe PDF
1.17 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11565/4051966
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 23
  • ???jsp.display-item.citation.isi??? 14
social impact