N-CovSel, a new strategy for feature selection in N-way data - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Analytica Chimica Acta, Year: 2022

N-CovSel, a new strategy for feature selection in N-way data

Abstract

In data analysis, the selection of meaningful variables is a widely debated topic, and several variable selection (or feature reduction) approaches have been proposed in the literature. Although feature selection methods are numerous, most of them are suited to data matrices and not to higher-order structures. This is mainly because the assessment of variable relevance in a multi-way context has not been extensively discussed. To the best of our knowledge, among the variable selection approaches developed for standard 2-way data arrays, only VIP analysis and the selectivity ratio have been extended to higher-order structures. This gap does not reflect a lack of relevance of the topic; on the contrary, the possibility of selecting information in a complex data set such as a multi-way structure is crucial. In light of these considerations, the present paper discusses a feature selection strategy for N-way data based on the Covariance Selection (CovSel) approach, hence called N-CovSel. This method allows the selection of features of different dimensionality (from 1-way up to (N-1)-way), depending on the nature of the original data array. The novel method has been applied to a simulated data set, in order to inspect its ability to select features compatible with the ground truth of the system, and to a real data set. In both cases, N-CovSel proved able to select meaningful features. Finally, different strategies for the further analysis of the selected features are proposed: some, based on sequential multi-block methods, provide a further data reduction, while others, based on N-PLS, respect the multi-way nature of the data.
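
As an illustration of the idea summarised in the abstract, the following is a minimal sketch of covariance-based selection of 2-way features (slabs) from a 3-way array. It is not the authors' implementation: the function name `n_covsel_slabs`, the scoring rule (sum of squared covariances over a slab) and the deflation step (projection onto the orthogonal complement of the selected slab's column space) are assumptions, drawn from the general CovSel scheme of greedy maximum-covariance selection followed by deflation.

```python
import numpy as np


def n_covsel_slabs(X, y, n_select):
    """Greedily pick mode-2 slabs X[:, j, :] of a 3-way array X (I x J x K).

    Hypothetical sketch: the scoring and deflation rules are assumptions,
    not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(y, dtype=float).reshape(X.shape[0], -1)
    # Centre along the sample mode so cross-products are covariances
    # (up to a constant factor).
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    selected = []
    for _ in range(n_select):
        # C[j, k, m]: covariance between column (j, k) of X and response m.
        C = np.einsum("ijk,im->jkm", X, Y)
        # Score each slab j by its total squared covariance with Y.
        scores = (C ** 2).sum(axis=(1, 2))
        if selected:
            scores[selected] = -np.inf  # never pick the same slab twice
        j = int(np.argmax(scores))
        selected.append(j)

        # Orthonormal basis of the column space of the selected slab.
        U, s, _ = np.linalg.svd(X[:, j, :], full_matrices=False)
        U = U[:, s > 1e-10 * s.max()]
        # Deflate X and Y: remove everything explained by that slab.
        X = X - np.einsum("ia,ajk->ijk", U, np.einsum("ia,ijk->ajk", U, X))
        Y = Y - U @ (U.T @ Y)
    return selected


# Toy example: only slab j = 2 carries information about y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6, 4))
y = X[:, 2, :] @ np.array([1.0, -0.5, 0.8, 0.3]) + 0.05 * rng.normal(size=200)
selected = n_covsel_slabs(X, y, n_select=2)
print(selected)
```

In this toy setting the informative slab should be picked first. Scoring individual columns `(j, k)` instead of whole slabs would give 1-way features, in line with the abstract's remark that features of different dimensionality can be selected. Note that the deflation only makes sense when the number of samples clearly exceeds the number of columns in a slab.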

Dates and versions

hal-03836791 , version 1 (02-11-2022)

Identifiers

Cite

Alessandra Biancolillo, Jean-Michel Roger, Federico Marini. N-CovSel, a new strategy for feature selection in N-way data. Analytica Chimica Acta, 2022, 1231, pp.340433. ⟨10.1016/j.aca.2022.340433⟩. ⟨hal-03836791⟩