GeospatRE: extraction and geocoding of spatial relation entities in textual documents - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement Accéder directement au contenu
Article Dans Une Revue Cartography and Geographic Information Science Année : 2023

GeospatRE: extraction and geocoding of spatial relation entities in textual documents

Résumé

Spatial information extraction from textual documents and its accurate geo-referencing are important steps in epidemiology, with many applications such as outbreak detection and disease surveillance and control. However, inaccuracy in extraction of such geospatial information will result into inaccurate location identification, which in consequence may produce erroneous information for outbreak investigation and disease surveillance. One of the problems is the extraction of geospatial relations associated with spatial entities in the text documents. In order to identify such geospatial relations, we categorized them into three major relations: 1) Level-1, e.g. center, north, south; 2) Level-2, e.g. nearby, border; 3) Level-3, e.g. distance from spatial entities e.g. 30 km, 20 miles, 100 m, etc., respectively. This work introduces a novel approach for extracting and georeferencing spatial information from textual documents for accurate identification of geospatial relations associated with spatial entities to enhance outbreak monitoring and disease surveillance. We propose a two-step methodology: (i) Extraction of geospatial relations associated with spatial entities, using a clause-based approach, and (ii) Geo-referencing of geospatial relations associated with spatial entities in order to identify the polygon regions, using a custom algorithm to slice or derive the geospatial relation regions from the place name and their geospatial relations. The first step is evaluated with a disease news article dataset consisting of event information and obtaining a precision of 0.9, recall of 0.88 and F-Score of 0.88 respectively. The second step entails using a qualitative evaluation of shapes by end-users. Promising results are obtained for the experiments in second step.
Fichier principal
Vignette du fichier
Alam Syed_2023.pdf (13.07 Mo) Télécharger le fichier
Origine : Fichiers éditeurs autorisés sur une archive ouverte
Licence : CC BY - Paternité

Dates et versions

hal-04559453 , version 1 (25-04-2024)

Licence

Paternité

Identifiants

Citer

Mehtab Alam Syed, Elena Arsevska, Mathieu Roche, Maguelonne Teisseire. GeospatRE: extraction and geocoding of spatial relation entities in textual documents. Cartography and Geographic Information Science, 2023, pp.1-16. ⟨10.1080/15230406.2023.2264753⟩. ⟨hal-04559453⟩
0 Consultations
0 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More