Domain invariant covariate selection (Di-CovSel) for selecting generalized features across domains - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article. Chemometrics and Intelligent Laboratory Systems. Year: 2022

Domain invariant covariate selection (Di-CovSel) for selecting generalized features across domains

Abstract

Multivariate spectral signals are highly correlated. Variable selection techniques are therefore often deployed to optimize models, to identify key variables for exploring the underlying physicochemical system, or to develop a low-cost multispectral system based on those key variables. However, the selected variables often fail to give a good estimate of the properties of interest when tested in a new setting, such as measurements made on a different spectrometer, a different physical or chemical state of the samples, or different environmental factors around the experiment. In other words, a model based on variables selected in the first domain (specific conditions/instrument) often does not generalize to a new domain (specific conditions/instrument). To address this, the present work proposes a new method for variable selection called domain invariant covariate selection (di-CovSel). The method selects the most informative variables that are invariant to differences in the instruments, in the physical or chemical state of the samples, and in the environmental factors around the experiment. The method is inspired by domain invariant partial least squares (di-PLS) and covariate selection (CovSel). The potential of the method is demonstrated on four real cases related to the calibration of near-infrared (NIR) spectroscopy on agri-food materials. The results show that, in all cases, the domain invariant features selected by di-CovSel give a lower prediction error than standard variable selection with the CovSel approach when the models are tested on a new data domain. In summary, domain invariant features selected across domains support the development of calibration models that generalize well, and they supply a better understanding of the system by bypassing the external factors that originate from differences in the instruments, in the physical or chemical states of the samples, and in the environmental factors around the experiment.
Note that one key feature of the proposed method is that the most important variables which generalize well across domains can be identified without requiring reference measurements in the target domain.
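For readers unfamiliar with the baseline that di-CovSel is compared against, the following is a minimal sketch of standard CovSel as it is commonly described: at each step, pick the variable with the largest squared covariance with the response, then deflate both the spectra and the response by projecting out the picked variable. The function name `covsel`, its parameters, and the tolerance `tol` are illustrative assumptions and not the authors' code. The domain-invariance term that distinguishes di-CovSel is not given in the abstract and is therefore not reproduced here.

```python
import numpy as np


def covsel(X, Y, n_select, tol=1e-12):
    """Greedy CovSel sketch: return indices of the selected columns of X.

    X: (n_samples, n_variables) spectra; Y: (n_samples,) or (n_samples, n_responses).
    All names here are hypothetical; this is not the published implementation.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]

    # Column-centre both blocks so that X.T @ Y is proportional to the covariance.
    Xd = X - X.mean(axis=0)
    Yd = Y - Y.mean(axis=0)

    selected = []
    for _ in range(min(n_select, X.shape[1])):
        # Squared covariance of each variable with the response(s), summed over responses.
        score = np.sum((Xd.T @ Yd) ** 2, axis=1)
        j = int(np.argmax(score))
        if score[j] <= tol:
            break  # nothing left that co-varies with Y
        selected.append(j)

        # Deflate X and Y: remove the part explained by the selected variable.
        x = Xd[:, j].copy()
        xx = x @ x
        Xd = Xd - np.outer(x, x @ Xd) / xx
        Yd = Yd - np.outer(x, x @ Yd) / xx
    return selected
```

Calling `covsel(X, y, 10)` on a calibration set returns up to ten column indices in the order they were picked. Because of the deflation step, each later pick adds information not already carried by the earlier ones, which is what makes the approach suited to highly correlated spectral variables.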
Main file
1-s2.0-S0169743922000107-main.pdf (2.97 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03689237, version 1 (07-06-2022)

Licence

Attribution

Identifiers

Cite

Valeria Fonseca Diaz, Puneet Mishra, Jean-Michel Roger, Wouter Saeys. Domain invariant covariate selection (Di-CovSel) for selecting generalized features across domains. Chemometrics and Intelligent Laboratory Systems, 2022, 222, pp.104499. ⟨10.1016/j.chemolab.2022.104499⟩. ⟨hal-03689237⟩