Domain invariant covariate selection (Di-CovSel) for selecting generalized features across domains - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Chemometrics and Intelligent Laboratory Systems, Year: 2022

Domain invariant covariate selection (Di-CovSel) for selecting generalized features across domains

Abstract

Multivariate spectral signals are highly correlated. Variable selection techniques are therefore often deployed to optimize models, to identify key variables for exploring the underlying physicochemical system, or to develop a cheap multi-spectral system based on those key variables. However, the selected variables often fail to give a good estimate of the properties of interest when tested in a new setting, such as measurements performed on a different spectrometer, a different physical or chemical state of the samples, or different environmental factors around the experiment. In other words, a model based on variables selected in the first domain (specific conditions/instrument) often does not generalize to a new domain (specific conditions/instrument). To address this, the present work proposes a new method for variable selection called domain invariant covariate selection (di-CovSel). The method selects the most informative variables that are invariant to differences in the instruments, in the physical or chemical state of the samples, and in the environmental factors around the experiment. It is inspired by domain invariant partial least squares (di-PLS) and covariate selection (CovSel). The potential of the method is demonstrated on four real cases related to the calibration of near-infrared (NIR) spectroscopy on agri-food materials. In all four cases, models built on the domain invariant features selected by di-CovSel have lower prediction error than models built with standard CovSel variable selection when tested on a new data domain. In summary, domain invariant features selected across domains support the development of calibration models that generalize well and provide a better understanding of the system by bypassing external factors originating from differences in the instruments, in the physical or chemical states of the samples, and in the environmental factors around the experiment.
Note that a key feature of the proposed method is that the most important variables that generalize well across domains can be identified without requiring reference measurements in the target domain.
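
The idea of selecting variables that are both informative and stable across domains can be illustrated with a small sketch. The abstract does not give the di-CovSel criterion, so the code below is not the published algorithm. It assumes a greedy covariance-based selection with deflation, in the general spirit of CovSel, and adds a hypothetical penalty on the mismatch between source and target variances of each variable. The function name `covsel_sketch`, the helper `_project_out`, the penalty form, the weight `lam`, and the synthetic data are all assumptions made for illustration. Only unlabeled target spectra are used, in line with the statement that no target reference measurements are required.

```python
import numpy as np


def _project_out(v, M):
    """Remove from each column of M its component along vector v."""
    denom = v @ v
    if denom == 0:
        return M
    return M - np.outer(v, v @ M) / denom


def covsel_sketch(X, y, n_select, X_target=None, lam=0.0):
    """Greedy covariance-based variable selection with deflation.

    Illustrative only. With X_target and lam > 0, each variable's score is
    reduced by lam * |var_source - var_target| (a hypothetical penalty,
    not the criterion of the paper). No target reference values are used.
    """
    n = X.shape[0]
    Xs = X - X.mean(axis=0)                      # center source spectra
    Y = np.asarray(y, dtype=float).reshape(n, -1)
    Y = Y - Y.mean(axis=0)                       # center responses
    Xt = None if X_target is None else X_target - X_target.mean(axis=0)

    selected = []
    for _ in range(n_select):
        # Squared covariance of each variable with the response(s).
        score = ((Xs.T @ Y / n) ** 2).sum(axis=1)
        if Xt is not None and lam > 0:
            # Hypothetical invariance penalty: variance mismatch per variable.
            score = score - lam * np.abs(Xs.var(axis=0) - Xt.var(axis=0))
        if selected:
            score[selected] = -np.inf            # never pick a variable twice
        j = int(np.argmax(score))
        selected.append(j)

        # Deflate: remove what the chosen variable already explains.
        v = Xs[:, j].copy()
        Y = _project_out(v, Y)
        Xs = _project_out(v, Xs)
        if Xt is not None:
            # Assumption: the target is deflated by its own selected column.
            Xt = _project_out(Xt[:, j].copy(), Xt)
    return selected


# Synthetic example (hypothetical data): variable 1 is the most informative
# in the source domain, but its variance changes strongly in the target.
rng = np.random.default_rng(0)
X_src = rng.normal(size=(500, 5))
y_src = X_src[:, 0] + 2.0 * X_src[:, 1]
X_tgt = rng.normal(size=(500, 5))
X_tgt[:, 1] *= 3.0

first_plain = covsel_sketch(X_src, y_src, 1)
first_di = covsel_sketch(X_src, y_src, 1, X_target=X_tgt, lam=1.0)
```

In this toy setting the unpenalized selection prefers the variable with the largest covariance with the response, while the penalized selection prefers a less informative variable whose variance is similar in both domains. The choice of `lam` controls that trade-off and would need tuning on real data.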
Main file
1-s2.0-S0169743922000107-main.pdf (2.97 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03689237, version 1 (07-06-2022)

License

Attribution

Cite

Valeria Fonseca Diaz, Puneet Mishra, Jean-Michel Roger, Wouter Saeys. Domain invariant covariate selection (Di-CovSel) for selecting generalized features across domains. Chemometrics and Intelligent Laboratory Systems, 2022, 222, pp.104499. ⟨10.1016/j.chemolab.2022.104499⟩. ⟨hal-03689237⟩