Conference paper, year: 2014

Yet Another Ranking Function for Automatic Multiword Term Extraction

Abstract

Term extraction is an essential task in domain knowledge acquisition. We propose two new measures to extract multiword terms from domain-specific text. The first measure combines linguistic and statistical information. The second measure is graph-based and assesses the importance of a multiword term within a domain. Existing measures usually address only some of the problems related to term extraction, e.g., noise, silence, low frequency, large corpora, and the complexity of the multiword term extraction process. Instead, we aim to handle the entire set of problems, including detecting rare terms and overcoming the low-frequency issue. By comparing the two proposed measures with state-of-the-art reference measures, we show that they outperform previously reported precision results for automatic multiword term extraction.
Main file
PolTAL2014.pdf (330.4 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-01068556, version 1 (25-09-2014)

Cite

Juan Antonio Lossio-Ventura, Clement Jonquet, Mathieu Roche, Maguelonne Teisseire. Yet Another Ranking Function for Automatic Multiword Term Extraction. 9th International Conference on Natural Language Processing (PolTAL), Sep 2014, Warsaw, Poland. pp.52-64, ⟨10.1007/978-3-319-10888-9_6⟩. ⟨lirmm-01068556⟩