Could KeyWord Masking Strategy Improve Language Model? - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Conference paper. Year: 2023

Could KeyWord Masking Strategy Improve Language Model?

Abstract

This paper presents an enhanced approach for adapting a Language Model (LM) to a specific domain, with a focus on Named Entity Recognition (NER) and Named Entity Linking (NEL) tasks. Traditional NER/NEL methods require large amounts of labeled data, which is time- and resource-intensive to produce. Unsupervised and semi-supervised approaches overcome this limitation but suffer from lower quality. Our approach, called KeyWord Masking (KWM), fine-tunes an LM for the Masked Language Modeling (MLM) task in a special way. Our experiments demonstrate that KWM outperforms traditional methods in restoring domain-specific entities. This work is a preliminary step towards developing a more sophisticated NER/NEL system for domain-specific data.
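The abstract does not spell out how KWM selects tokens, so the following is only a minimal sketch under a stated assumption: that "keyword masking" means masking tokens that belong to a domain keyword list, as opposed to the random token selection used in standard MLM. The function names, the `[MASK]` string, the keyword set, and the example sentence are all hypothetical illustrations, not the authors' implementation. The usual MLM refinements (such as replacing some selected tokens with random tokens) are left out for brevity.

```python
import random

MASK = "[MASK]"  # placeholder mask token; the real token depends on the tokenizer


def random_masking(tokens, rate=0.15, seed=0):
    """Baseline sketch: mask each token independently with probability `rate`."""
    rng = random.Random(seed)
    return [MASK if rng.random() < rate else tok for tok in tokens]


def keyword_masking(tokens, keywords):
    """Assumed KWM-style sketch: mask only tokens found in a domain keyword list."""
    vocab = {kw.lower() for kw in keywords}
    return [MASK if tok.lower() in vocab else tok for tok in tokens]


# Hypothetical domain sentence and keyword list, for illustration only.
tokens = "Bacillus subtilis grows in soil".split()
masked = keyword_masking(tokens, {"Bacillus", "subtilis"})
print(masked)
```

Under this assumption, the model is always asked to restore the domain terms, whereas random masking would mostly hide general-vocabulary tokens in text where keywords are rare.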
Main file
Could KeyWord Masking Strategy Improve Language Model.pdf (1.26 MB)
Origin: Files produced by the author(s)
License: CC BY-SA (Attribution, ShareAlike)

Dates and versions

hal-04173002, version 1 (15-03-2024)

Cite

Mariya Borovikova, Arnaud Ferré, Robert Bossy, Mathieu Roche, Claire Nédellec. Could KeyWord Masking Strategy Improve Language Model? The 28th International Conference on Natural Language & Information Systems (NLDB23), Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S., Jun 2023, Derby, United Kingdom. pp. 271-284, ⟨10.1007/978-3-031-35320-8_19⟩. ⟨hal-04173002⟩
