Comparison of common components analysis with principal components analysis and independent components analysis: Application to SPME-GC-MS volatolomic signatures
Résumé
The aim of this work is to compare a novel exploratory chemometrics method, Common Components Analysis (CCA), with Principal Components Analysis (PCA) and Independent Components Analysis (ICA). CCA consists in adapting the multi-block statistical method known as Common Components and Specific Weights Analysis (CCSWA or ComDim) by applying it to a single data matrix, with one variable per block. As an application, the three methods were applied to SPME-GC-MS volatolomic signatures of livers in an attempt to reveal volatile organic compounds (VOCs) markers of chicken exposure to different types of micropollutants. An application of CCA to the initial SPME-GC-MS data revealed a drift in the sample Scores along CC2, as a function of injection order, probably resulting from time-related evolution in the instrument. This drift was eliminated by orthogonalization of the data set with respect to CC2, and the resulting data are used as the orthogonalized data input into each of the three methods. Since the first step in CCA is to norm-scale all the variables, preliminary data scaling has no effect on the results, so that CCA was applied only to orthogonalized SPME-GC-MS data, while, PCA and ICA were applied to the "orthogonalized", "orthogonalized and Pareto-scaled", and "orthogonalized and autoscaled" data. The comparison showed that PCA results were highly dependent on the scaling of variables, contrary to ICA where the data scaling did not have a strong influence. Nevertheless, for both PCA and ICA the clearest separations of exposed groups were obtained after autoscaling of variables. The main part of this work was to compare the CCA results using the orthogonalized data with those obtained with PCA and ICA applied to orthogonalized and autoscaled variables. The clearest separations of exposed chicken groups were obtained by CCA. CCA Loadings also clearly identified the variables contributing most to the Common Components giving separations. The PCA Loadings did not highlight the most influencing variables for each separation, whereas the ICA Loadings highlighted the same variables as did CCA. This study shows the potential of CCA for the extraction of pertinent information from a data matrix, using a procedure based on an original optimisation criterion, to produce results that are complementary, and in some cases may be superior, to those of PCA and ICA.