GeospatRE: extraction and geocoding of spatial relation entities in textual documents - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Cartography and Geographic Information Science, Year: 2023

GeospatRE: extraction and geocoding of spatial relation entities in textual documents

Abstract

Extracting spatial information from textual documents and georeferencing it accurately are important steps in epidemiology, with applications such as outbreak detection and disease surveillance and control. However, inaccurate extraction of geospatial information results in inaccurate location identification, which in turn may produce erroneous information for outbreak investigation and disease surveillance. One of the problems is the extraction of geospatial relations associated with spatial entities in text documents. To identify such geospatial relations, we categorized them into three major levels: 1) Level-1, e.g. center, north, south; 2) Level-2, e.g. nearby, border; 3) Level-3, i.e. distance from spatial entities, e.g. 30 km, 20 miles, 100 m. This work introduces a novel approach for extracting and georeferencing spatial information from textual documents, so that geospatial relations associated with spatial entities are identified accurately, to enhance outbreak monitoring and disease surveillance. We propose a two-step methodology: (i) extraction of geospatial relations associated with spatial entities, using a clause-based approach, and (ii) georeferencing of these geospatial relations in order to identify the corresponding polygon regions, using a custom algorithm that slices or derives the geospatial relation regions from the place names and their geospatial relations. The first step is evaluated on a dataset of disease news articles containing event information, obtaining a precision of 0.9, a recall of 0.88 and an F-score of 0.88. The second step is evaluated qualitatively, with end-users assessing the derived shapes. The results of the second-step experiments are promising.
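The three relation levels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' clause-based extraction method: it substitutes a simple keyword and regular-expression lookup, and the word lists contain only the examples quoted in the abstract. The names `LEVEL_1`, `LEVEL_2`, `DISTANCE` and `relation_level` are hypothetical and introduced here for illustration only.

```python
import re
from typing import Optional

# Assumed keyword lists: only the examples quoted in the abstract.
LEVEL_1 = {"center", "north", "south"}  # Level-1 relation examples
LEVEL_2 = {"nearby", "border"}          # Level-2 relation examples
# Level-3: a number followed by a distance unit, e.g. "30 km", "20 miles", "100 m".
DISTANCE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(km|miles?|m)\b", re.IGNORECASE)


def relation_level(phrase: str) -> Optional[int]:
    """Return 1, 2 or 3 for the relation level found in the phrase, else None."""
    # A distance expression takes priority over directional or proximity keywords.
    if DISTANCE.search(phrase):
        return 3
    tokens = set(re.findall(r"[a-z]+", phrase.lower()))
    if tokens & LEVEL_2:
        return 2
    if tokens & LEVEL_1:
        return 1
    return None


for text in ["30 km from Paris", "north of Lyon", "nearby Milan", "in Paris"]:
    print(text, "->", relation_level(text))
```

A keyword lookup of this kind only labels the relation level. Linking each relation to its place name and deriving the polygon region are the parts handled by the clause-based extraction and the custom georeferencing algorithm that the abstract describes.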
Main file
Alam Syed_2023.pdf (13.07 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-04559453, version 1 (25-04-2024)

Cite

Mehtab Alam Syed, Elena Arsevska, Mathieu Roche, Maguelonne Teisseire. GeospatRE: extraction and geocoding of spatial relation entities in textual documents. Cartography and Geographic Information Science, 2023, pp.1-16. ⟨10.1080/15230406.2023.2264753⟩. ⟨hal-04559453⟩