Two new extensions of principal component transform to compute a PLS2 model between two wide matrices: PCT-PLS2 and segmented PCT-PLS2
Abstract
Progress in analytical techniques has made it possible to acquire a large amount of data for each analysed sample. Moreover, the application of pre-treatment methods such as Contrast greatly increases the number of variables, yielding (very) wide data matrices. The computation of a PLS2 model between two such matrices may be slowed down, or even made impossible, by computer memory limitations. This article proposes an algorithm that solves this problem and enables the computation of a PLS2 model between two matrices containing a large number of variables. To do this, the PLS2 model is computed between the score matrices obtained by applying a PCA to each original matrix separately. After PLS2, a back-transformation to the original space is possible, and leads to results identical to those that would have been obtained in the original space. The method can be further extended by segmenting the matrices and computing the PC transform on each segment, then concatenating all the resulting score matrices and computing the PLS2 model on the concatenated matrices.
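The procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`pct`, `pls2`, `pct_pls2`), the SVD-based PLS2 variant, the tolerance, and the random test data are all assumptions made for illustration. The sketch assumes that both matrices are column-centred and that every non-null principal component is retained, so that each matrix is exactly reproduced by its scores and loadings.

```python
import numpy as np

def pct(X, n_segments=1, rel_tol=1e-10):
    """PC transform of a centred matrix, optionally per column segment.

    Returns scores T (n x r) and loadings P (p x r) such that X == T @ P.T.
    """
    n, p = X.shape
    scores, loadings = [], []
    for cols in np.array_split(np.arange(p), n_segments):
        U, s, Vt = np.linalg.svd(X[:, cols], full_matrices=False)
        keep = s > rel_tol * s[0]            # drop numerically null components
        scores.append(U[:, keep] * s[keep])  # scores of this segment
        P_seg = np.zeros((p, keep.sum()))    # dense block, for clarity only
        P_seg[cols, :] = Vt[keep].T
        loadings.append(P_seg)
    return np.hstack(scores), np.hstack(loadings)

def pls2(X, Y, n_components):
    """PLS2 regression coefficients (assumed variant: weights from the SVD
    of X'Y, then deflation of X and Y by the X-scores)."""
    X, Y = X.copy(), Y.copy()
    W, P, Q = [], [], []
    for a in range(n_components):
        U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
        w = U[:, 0]                          # X-weights
        t = X @ w                            # X-scores
        tt = t @ t
        p = X.T @ t / tt                     # X-loadings
        q = Y.T @ t / tt                     # Y-loadings
        X -= np.outer(t, p)
        Y -= np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    return W @ np.linalg.solve(P.T @ W, Q.T)

def pct_pls2(X, Y, n_components, n_segments=1):
    """PLS2 on the score matrices, back-transformed to the original space."""
    Tx, Px = pct(X, n_segments)
    Ty, Py = pct(Y, n_segments)
    B_scores = pls2(Tx, Ty, n_components)    # model between small matrices
    return Px @ B_scores @ Py.T              # back-transformation

# Hypothetical wide matrices: 20 samples, 300 and 200 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 300)); X -= X.mean(axis=0)
Y = rng.standard_normal((20, 200)); Y -= Y.mean(axis=0)

B_direct = pls2(X, Y, 3)                     # PLS2 in the original space
B_pct = pct_pls2(X, Y, 3)                    # PCT-PLS2
B_seg = pct_pls2(X, Y, 3, n_segments=4)      # segmented PCT-PLS2
print(np.allclose(B_direct, B_pct), np.allclose(B_direct, B_seg))
```

Under these assumptions, the three coefficient matrices should agree to within floating-point precision, while the PLS2 step itself only handles score matrices with at most a few columns per sample and per segment. In a real application with very wide matrices, the dense block loadings and the full back-transformed coefficient matrix built here would themselves be large, so they would more likely be stored and applied segment by segment.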