Is Prompting What Term Extraction Needs? - LI Laboratory, BDTLN team
Conference paper, Year: 2024

Is Prompting What Term Extraction Needs?

Abstract

Automatic term extraction (ATE) is a natural language processing (NLP) task that reduces the effort of manually identifying terms in domain-specific corpora by providing a list of candidate terms. This paper summarizes our research on the applicability of open- and closed-source large language models (LLMs) to the ATE task, compared to two benchmarks in which ATE is treated as a sequence-labeling task (iobATE) and a seq2seq ranking task (templATE), respectively. We propose three prompt designs: (1) a sequence-labeling response; (2) a text-extractive response; and (3) a text-generative response that fills the gap between the two. We conduct experiments on the ACTER corpora in three languages and four domains with two different gold standards: one includes only terms (ANN) and the other covers both terms and entities (NES). Our experiments show that, among the prompting formats, text-extractive and text-generative responses perform best in few-shot setups, where training data is scarce, and surpass the templATE classifier in all scenarios. The performance of LLMs is close to that of fully supervised sequence-labeling models, and they offer a valuable trade-off by reducing the need for extensive data annotation. This demonstrates the potential of LLMs in practical, real-world applications where labeled examples are in limited supply.
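
To make the difference between response formats (1) and (2) concrete, the following minimal sketch converts between a list of extracted terms and per-token B/I/O labels. It is not the authors' implementation: the function names `terms_to_iob` and `iob_to_terms`, the whitespace tokenization, the case-insensitive exact matching, and the example sentence are all assumptions made for illustration.

```python
from typing import List


def terms_to_iob(tokens: List[str], terms: List[str]) -> List[str]:
    """Project a list of extracted terms onto tokens as B/I/O labels.

    Assumptions (not from the paper): whitespace tokens, case-insensitive
    exact matching, and longer terms matched first so that a shorter term
    cannot overwrite part of a longer one.
    """
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for term in sorted(terms, key=lambda t: len(t.split()), reverse=True):
        parts = term.lower().split()
        n = len(parts)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            window_free = all(label == "O" for label in labels[i:i + n])
            if lowered[i:i + n] == parts and window_free:
                labels[i] = "B"
                for j in range(i + 1, i + n):
                    labels[j] = "I"
    return labels


def iob_to_terms(tokens: List[str], labels: List[str]) -> List[str]:
    """Collect the token spans marked B (begin) and I (inside) as terms."""
    terms: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            if current:
                terms.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            current.append(token)
        else:
            # An "O" label (or a stray "I" with no open span) closes the span.
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


# Hypothetical example sentence and term list, invented for illustration.
tokens = "Heart failure is treated with ACE inhibitors".split()
labels = terms_to_iob(tokens, ["heart failure", "ACE inhibitors"])
print(list(zip(tokens, labels)))
print(iob_to_terms(tokens, labels))
```

Under these assumptions, a sequence-labeling response corresponds to the `labels` list, while a text-extractive response corresponds to the list returned by `iob_to_terms`.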

Main file
TSD___Camera_Ready___1192.pdf (828.72 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04742439, version 1 (17-10-2024)

Cite

Hanh Thi Hong Tran, Carlos-Emiliano González-Gallardo, Julien Delaunay, Antoine Doucet, Senja Pollak. Is Prompting What Term Extraction Needs? 27th International Conference, TSD 2024, Sep 2024, Brno, Czech Republic. pp.17-29, ⟨10.1007/978-3-031-70563-2_2⟩. ⟨hal-04742439⟩