A data-mining approach for assessing consistency between multiple representations in spatial databases - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, International Journal of Geographical Information Science, Year: 2009

A data-mining approach for assessing consistency between multiple representations in spatial databases

Abstract

When different spatial databases are combined, an important issue is the identification of inconsistencies between data. Quite often, representations of the same geographical entities in databases are different and reflect different points of view. In order to fully take advantage of these differences when object instances are associated, a key issue is to determine whether the differences are normal, i.e. explained by the database specifications, or whether they are due to erroneous or outdated data in one database. In this paper, we propose a knowledge-based approach to partially automate the consistency assessment between multiple representations of data. The inconsistency detection is viewed as a knowledge-acquisition problem, the source of knowledge being the data. The consistency assessment is carried out by applying a proposed method called MECO. This method is itself parameterized by some domain knowledge obtained from a second method called MACO. MACO supports two approaches (direct or indirect) to perform the knowledge acquisition using data-mining techniques. In particular, a supervised learning approach is defined to automate the knowledge acquisition so as to drastically reduce the human domain expert's work. Thanks to this approach, the knowledge-acquisition process is sped up and less expert-dependent. Training examples are obtained automatically upon completion of the spatial data matching. Knowledge extraction from data following this bottom-up approach is particularly useful, since the database specifications are generally complex, difficult to analyse, and manually encoded. Such a data-driven process also sheds some light on the gap between textual specifications and those actually used to produce the data. The methodology is illustrated and experimentally validated by comparing geometrical representations and attribute values of different vector spatial databases. The advantages and limits of such partially automatic approaches are discussed, and some future work is suggested.
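The data-driven idea summarized in the abstract (learn from matched pairs what a "normal" difference looks like, then flag pairs that deviate) can be illustrated with a toy sketch. This is not MECO or MACO: the single-threshold rule learner, the `MatchedPair` type, the `diameter_m` and `representation` attributes, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchedPair:
    """One entity matched across two databases (hypothetical schema)."""
    pair_id: str
    diameter_m: float      # assumed attribute measured in database A
    representation: str    # assumed label observed in database B: "point" or "polygon"


def predict(diameter_m, threshold):
    """Rule form assumed here: small objects are points, large ones polygons."""
    return "point" if diameter_m < threshold else "polygon"


def learn_threshold(pairs):
    """Pick the threshold with the fewest disagreements on the matched pairs.

    Candidates are midpoints between consecutive distinct diameters,
    so at least two distinct diameters are required.
    """
    values = sorted({p.diameter_m for p in pairs})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum(predict(p.diameter_m, t) != p.representation for p in pairs)

    return min(candidates, key=errors)


def flag_inconsistencies(pairs, threshold):
    """Return ids of pairs whose observed representation contradicts the rule."""
    return [p.pair_id for p in pairs
            if predict(p.diameter_m, threshold) != p.representation]


# Invented training examples, standing in for the output of data matching.
PAIRS = [
    MatchedPair("r1", 12.0, "point"),
    MatchedPair("r2", 18.0, "point"),
    MatchedPair("r3", 22.0, "point"),
    MatchedPair("r4", 35.0, "polygon"),
    MatchedPair("r5", 40.0, "polygon"),
    MatchedPair("r6", 55.0, "polygon"),
    MatchedPair("r7", 60.0, "point"),   # deviates from the majority pattern
]

threshold = learn_threshold(PAIRS)
flagged = flag_inconsistencies(PAIRS, threshold)
print(threshold, flagged)
```

A flagged pair is only a candidate: as the abstract notes, a difference may still be explained by the database specifications, so an expert would review the flagged cases rather than treat them as confirmed errors.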
Main file
Sheeren_10019.pdf (17.93 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02666370, version 1 (10-12-2021)

Licence

Attribution

Cite

David Sheeren, S. Mustière, J.D. Zucker. A data-mining approach for assessing consistency between multiple representations in spatial databases. International Journal of Geographical Information Science, 2009, 23 (8), pp.961-992. ⟨10.1080/13658810701791949⟩. ⟨hal-02666370⟩