Deep semi-supervised clustering for multi-variate time-series - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Neurocomputing, Year: 2022

Deep semi-supervised clustering for multi-variate time-series

Abstract

Huge amounts of data are nowadays produced by a large and disparate family of sensors, which typically measure multiple variables over time. Such rich information can be profitably organized as multivariate time series. Collecting enough labelled samples to set up supervised analysis for this kind of data is challenging, whereas it is reasonable to assume that a limited amount of background knowledge is available and can be injected into the analysis process. In this context, semi-supervised clustering methods are a well-suited tool for getting the most out of such a reduced amount of knowledge. To deal with multivariate time-series analysis under a limited background knowledge setting, we propose a semi-supervised (constrained) deep embedding time-series clustering framework that exploits knowledge supervision modeled as Must-link and Cannot-link constraints. In more detail, our proposal, named conDetSEC (constrained Deep embedding time SEries Clustering), is based on Gated Recurrent Units (GRUs) in order to explicitly manage the temporal dimension of multivariate time-series data. conDetSEC implements a procedure in which an embedding generation step is combined with a clustering refinement step. Both steps exploit the small amount of available knowledge provided by the Must-link and Cannot-link constraints. More specifically, during embedding generation the constraints are used by jointly optimizing the network parameters via both unsupervised and semi-supervised tasks, while at the refinement step they are used together with the objective of stretching the embedding manifold towards the cluster centroids, in order to recover a clearer cluster structure. Experimental evaluation on real-world benchmarks from diverse domains highlights the effectiveness of our proposal in comparison with state-of-the-art unsupervised and semi-supervised time-series clustering methods.
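As an illustration of the two kinds of signal the abstract describes, the following minimal NumPy sketch computes a pairwise penalty for Must-link and Cannot-link constraints and a term that pulls embeddings towards their nearest centroid. The abstract does not give the loss formulas, so the contrastive-style form, the `margin` parameter, and the function names `constraint_loss` and `centroid_pull_loss` are assumptions made for this example and are not the authors' implementation; the embeddings are toy vectors rather than the output of a GRU encoder.

```python
import numpy as np


def constraint_loss(emb, must_link, cannot_link, margin=1.0):
    """Contrastive-style penalty over constrained pairs (assumed form)."""
    loss = 0.0
    for i, j in must_link:
        # Must-link: penalize the squared distance between the two embeddings.
        loss += np.sum((emb[i] - emb[j]) ** 2)
    for i, j in cannot_link:
        # Cannot-link: penalize only pairs that are closer than the margin.
        d = np.linalg.norm(emb[i] - emb[j])
        loss += max(0.0, margin - d) ** 2
    n_pairs = len(must_link) + len(cannot_link)
    return loss / max(n_pairs, 1)


def centroid_pull_loss(emb, centroids):
    """Mean squared distance of each embedding to its nearest centroid."""
    # Pairwise squared distances, shape (n_samples, n_clusters).
    d2 = ((emb[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()


# Toy embeddings: two well-separated groups in 2-D (hypothetical data).
emb = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.0, 3.1]])
centroids = np.array([[0.0, 0.0], [3.0, 3.0]])
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 2)]

# Constraints that agree with the geometry give a small penalty.
good = constraint_loss(emb, must_link, cannot_link)
# Swapping the roles of the pairs contradicts the geometry and costs more.
bad = constraint_loss(emb, [(0, 2)], [(0, 1), (2, 3)])
pull = centroid_pull_loss(emb, centroids)
```

In a trained model, terms of this kind would typically be added to an unsupervised objective and minimized with respect to the encoder parameters; here they are only evaluated on fixed vectors to show how satisfied and violated constraints differ in cost.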
Main file
Ienco-Dino.pdf (2.59 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03836592, version 1 (02-11-2022)

License

Attribution

Cite

Dino Ienco, Roberto Interdonato. Deep semi-supervised clustering for multi-variate time-series. Neurocomputing, 2022, 516, pp. 36-47. ⟨10.1016/j.neucom.2022.10.033⟩. ⟨hal-03836592⟩
