KdMutual: A novel clustering algorithm combining mutual neighboring and hierarchical approaches using a new selection criterion
Abstract
New clustering algorithms are expected to handle complex data, that is, clusters of various shapes and densities, while remaining user friendly. This work addresses that challenge by proposing KdMutual, a new clustering algorithm driven by the number of clusters. The algorithm rests on the assumption that working with cluster cores, rather than with cluster frontiers, makes the clustering process easier. KdMutual proceeds in three steps. The first step identifies the potential core clusters; it relies on mutual neighborhood and includes specific mechanisms to identify and preserve them. The second step is a constrained hierarchical process that deals with noise. In the last step, the potential clusters are selected using a specific ranking criterion and the final partition is built. KdMutual combines the best characteristics of density-peak and connectivity-based approaches, and it is capable of detecting the absence of natural clusters. The proposal was compared with 14 other clustering algorithms. On 2-dimensional benchmark datasets of various shapes and densities, KdMutual proved highly effective at matching the ground-truth partition. It also proved efficient in high dimensions when clusters are well separated. Moreover, in spaces of moderate dimension, it is able to identify clusters that have various densities, partially overlap, and include a large amount of noise.
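The mutual neighborhood relation that the first step relies on can be illustrated with a minimal sketch. This is not the KdMutual implementation: it only shows the generic mutual k-nearest-neighbor test, in which two points are linked when each appears among the other's k nearest neighbors. The function name `mutual_knn_pairs`, the parameter `k`, and the sample points are assumptions made for illustration.

```python
import math

def mutual_knn_pairs(points, k):
    """Return index pairs (i, j), i < j, that are mutual k-nearest neighbors.

    Hypothetical helper for illustration; not taken from the KdMutual paper.
    """
    n = len(points)
    knn = []
    for i in range(n):
        # Sort the other points by Euclidean distance to point i
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        knn.append(set(others[:k]))
    # Keep a pair only when each point lists the other as a neighbor
    return {(i, j) for i in range(n) for j in knn[i] if i < j and i in knn[j]}

# Two tight pairs of points plus one isolated point in between (assumed data)
points = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 5)]
pairs = mutual_knn_pairs(points, k=1)
print(sorted(pairs))
```

With `k=1`, the isolated point at (5, 5) has a nearest neighbor, but that neighbor does not return the relation, so the point stays unlinked. This reciprocity requirement is one plausible reason a mutual-neighbor criterion favors dense cores over sparse frontier or noise points.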
Domains
Computer Science [cs]
Origin: Files produced by the author(s)