A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data
Journal Articles IEEE Transactions on Neural Networks and Learning Systems Year : 2017

Abstract

In this paper, we introduce a new approach to semi-supervised anomaly detection for categorical data. Given a training set of instances, all belonging to the normal class, we analyze the relationships among features to extract a discriminative characterization of the anomalous instances. Our key idea is to build a model that characterizes the features of the normal instances and then to use a set of distance-based techniques to discriminate between normal and anomalous instances. We compare our approach with state-of-the-art methods for semi-supervised anomaly detection. We show empirically that a technique designed specifically for categorical data outperforms general-purpose approaches. We also show that, in contrast with other approaches that are opaque because their decisions cannot be easily understood, our proposal produces a discriminative model that can be easily interpreted and used to explore the data.
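The semi-supervised setting described in the abstract (train on normal instances only, then score unseen instances by a distance to the normal model) can be illustrated with a minimal sketch. This is not the method of the paper: it swaps in a simple frequency-based distance that treats each feature independently, whereas the paper analyzes relationships among features. The function names `fit_normal_model`, `anomaly_score`, and `explain`, and the toy data, are assumptions made for illustration.

```python
from collections import Counter

def fit_normal_model(train_rows):
    """Count, per feature, how often each value occurs in the normal-only training set."""
    n_features = len(train_rows[0])
    counts = [Counter(row[j] for row in train_rows) for j in range(n_features)]
    return counts, len(train_rows)

def feature_distances(model, row):
    """Per-feature distance: 1 minus the relative frequency of the value among normal instances."""
    counts, n = model
    # Counter returns 0 for values never seen in training, giving the maximum distance 1.0.
    return [1.0 - counts[j][value] / n for j, value in enumerate(row)]

def anomaly_score(model, row):
    """Average per-feature distance; higher means less like the normal class."""
    distances = feature_distances(model, row)
    return sum(distances) / len(distances)

def explain(model, row):
    """Rank features by their contribution to the score (a simple, readable characterization)."""
    distances = feature_distances(model, row)
    ranked = sorted(enumerate(distances), key=lambda pair: pair[1], reverse=True)
    return [(j, row[j], d) for j, d in ranked]

# Hypothetical toy data: every training row is assumed to belong to the normal class.
train = [("red", "small"), ("red", "small"), ("red", "large"), ("blue", "small")]
model = fit_normal_model(train)

typical = ("red", "small")
unusual = ("green", "large")
print(anomaly_score(model, typical))
print(anomaly_score(model, unusual))
print(explain(model, unusual))
```

In this sketch, an instance whose values are common among the normal training rows gets a low score, while an instance with rare or unseen values gets a high one. The ranked output of `explain` shows which feature values drive the score, which is the kind of interpretability the abstract contrasts with opaque detectors.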
Main file
tnnls.pdf (537.04 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-01275509, version 1 (17-02-2016)

Cite

Dino Ienco, Ruggero Pensa, Rosa Meo. A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28 (5), pp.1017-1029. ⟨10.1109/TNNLS.2016.2526063⟩. ⟨lirmm-01275509⟩