Matching heterogeneous textual data using spatial features

J. Fize; M. Roche; Maguelonne Teisseire

doi:10.1109/ICDMW.2018.00197

Conference paper. Year: 2019

J. Fize

Role: Author

Territoires, Environnement, Télédétection et Information Spatiale

M. Roche

Role: Author
PersonId: 4967
IdHAL: mathieu-roche
ORCID: 0000-0003-3272-8568
IdRef: 09042087X

Territoires, Environnement, Télédétection et Information Spatiale

Maguelonne Teisseire

Role: Author
PersonId: 8645
IdHAL: maguelonne-teisseire
ORCID: 0000-0001-9313-6414
IdRef: 117436593

Territoires, Environnement, Télédétection et Information Spatiale

Abstract

An increasing amount of textual data is made available through different media (e.g., social networks, companies, data catalogs). These new resources are highly heterogeneous, so new methods are needed to extract information from them. Here, we propose a text matching process that is based on spatial features and compatible with heterogeneous textual data. Beyond this compatibility, we make two contributions. First, so that texts can be compared, spatial information is extracted and then stored in a dedicated representation: the Spatial Textual Representation (STR). Second, to improve the approximation of spatial similarity, we propose two transformations to apply to the STR. To support these contributions, we evaluate the different aspects of the process on two corpora, one of which is highly heterogeneous. Results on both corpora show that relevant spatial matches can be obtained between the most similar STRs, with an improvement due to the STR transformations.

Keywords

Text matching; Spatial similarity; Heterogeneous data

Domains

Environmental sciences

Contributor: Migration Irstea Publications

https://hal.inrae.fr/hal-02609203

Submitted on: Saturday, May 16, 2020, 17:23:36

Last modified on: Tuesday, March 26, 2024, 11:51:23

Dates and versions

hal-02609203, version 1 (16-05-2020)

Identifiers

HAL Id: hal-02609203, version 1
DOI: 10.1109/ICDMW.2018.00197
IRSTEA: PUB00061142

Cite

J. Fize, M. Roche, Maguelonne Teisseire. Matching heterogeneous textual data using spatial features. ICDMW 2018: 18th IEEE International Conference on Data Mining Workshops, Nov 2018, Singapore, Singapore. pp.1389-1396, ⟨10.1109/ICDMW.2018.00197⟩. ⟨hal-02609203⟩

Collections

CIRAD AGROPARISTECH CNRS IRSTEA AGROPOLIS TETIS INRAE INRAEOCCITANIEMONTPELLIER MATHNUM
