Extracting new spatial entities and relations from short messages - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Conference paper, Year: 2016

Extracting new spatial entities and relations from short messages

French title: Extraction de nouvelles entités spatiales et relations à partir de messages courts

Abstract

In the past few years, texts have become an important source of spatial data, alongside maps, satellite images and GPS. Electronically mediated written communication, especially short messages (SMS, tweets, etc.), has given rise to new ways of writing. These messages are a rich source of information and opinion, which makes extracting information from them valuable, but their brief, unstructured and informal nature makes them difficult to analyze. The work presented in this paper aims to extract spatial information from two authentic corpora of French SMS and tweets, in order to take advantage of the vast amount of geographical knowledge expressed in diverse natural language texts. We propose a three-step process. First, we extract new spatial entities (e.g. Monpelier and Montpel are associated with the place name Montpellier). Second, we identify new spatial relations that precede these spatial entities (e.g. sur, par, etc.). Finally, we propose a general pattern for discovering spatial relations (e.g. SR + Preposition). The task is challenging because the language of short messages relies on weakly standardized modes of writing (lexical creation, massive use of abbreviations, textual variants, etc.). Experiments carried out on the two corpora, 88milSMS and Tweets, highlight the efficiency of the proposed strategy for identifying new kinds of spatial entities and relations. © 2016 ACM.
No file deposited
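The first two steps of the process can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not detail: it assumes a tiny hypothetical gazetteer (`GAZETTEER`), uses plain string similarity from Python's `difflib` as a stand-in for variant detection, and treats the single token preceding a matched entity as the candidate spatial relation. The function names and the 0.75 threshold are illustrative assumptions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical gazetteer of standard place names (assumption, not from the paper).
GAZETTEER = ["Montpellier", "Biarritz"]


def match_place(token, gazetteer=GAZETTEER, threshold=0.75):
    """Return the gazetteer entry most similar to `token`, or None.

    Similarity is difflib's ratio on lowercased strings; the threshold
    is an illustrative choice, not a value taken from the paper.
    """
    best, best_score = None, 0.0
    for place in gazetteer:
        score = SequenceMatcher(None, token.lower(), place.lower()).ratio()
        if score > best_score:
            best, best_score = place, score
    return best if best_score >= threshold else None


def extract_spatial(message):
    """Return (preceding_token, variant, place_name) triples for a message.

    The preceding token is only a *candidate* spatial relation
    (e.g. "sur", "par"); no filtering is applied in this sketch.
    """
    tokens = re.findall(r"\w+", message)
    triples = []
    for i, token in enumerate(tokens):
        place = match_place(token)
        if place is not None:
            previous = tokens[i - 1].lower() if i > 0 else None
            triples.append((previous, token, place))
    return triples


if __name__ == "__main__":
    for msg in ["je suis sur Monpelier ce soir", "on passe par Montpel"]:
        print(msg, "->", extract_spatial(msg))
```

A real system would need a far larger gazetteer and a similarity measure tuned to SMS spelling, since a plain ratio threshold can both miss heavy abbreviations and match unrelated words.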

Dates and versions

hal-02606013, version 1 (16-05-2020)

Identifiers

HAL Id: hal-02606013
DOI: 10.1145/3012071.3012079

Cite

S. Zenasni, E. Kergosien, M. Roche, M. Teisseire. Extracting new spatial entities and relations from short messages. MEDES 2016, Nov 2016, Biarritz, France. pp.189-196, ⟨10.1145/3012071.3012079⟩. ⟨hal-02606013⟩