Assessment of repeatability and reproducibility in untargeted LC/MS metabolomics: Beyond the limits of the relative standard deviation
Abstract
INTRODUCTION: Repeatability and reproducibility in untargeted metabolomics are commonly assessed with parametric dispersion indicators, such as the relative standard deviation (RSD), calculated for each detected metabolite from pooled QC sample data (Kirwan et al., 2022). However, the reliability of these indicators depends strongly on the assumption that the data are normally distributed. Because analytical variability arises from many sources, indicators built on such parametric central values are not well suited to this task. The aim was therefore to develop robust indicators for assessing the repeatability and reproducibility of LC/MS data in untargeted metabolomics that are independent of central values and of any parametric assumption.
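To make the limitation of the RSD concrete, the following minimal sketch compares it with the median absolute deviation on a handful of pooled QC peak areas. The peak-area values and the function names `rsd` and `mad` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import statistics


def rsd(values):
    """Relative standard deviation in percent: sample SD divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def mad(values):
    """Unscaled median absolute deviation around the median."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)


# Hypothetical peak areas of one metabolite in pooled QC injections.
qc_clean = [100, 102, 98, 101, 99]
# Same injections plus a single aberrant one (assumed, for illustration).
qc_outlier = qc_clean + [150]

for label, data in (("clean", qc_clean), ("with outlier", qc_outlier)):
    print(f"{label}: RSD = {rsd(data):.1f} %, MAD = {mad(data):.2f}")
```

In this toy case a single aberrant injection inflates the RSD by roughly an order of magnitude, while the MAD barely moves. Both statistics are still defined around a central value, which is the dependence the indicators proposed here aim to avoid.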
METHODS: Three specific indicators were developed: first, the intra-batch dispersion, based on the mean area of the convex hull of the pooled QC samples within each batch; second, the inter-batch dispersion, defined as the gradient of the deviation between batches; and finally, the intra/inter-batch dispersion ratio. The statistical characteristics of these indicators, including stability, robustness, and precision, were then evaluated using synthetic data, and their relationships with common existing dispersion parameters were explored. Finally, case studies based on existing untargeted LC/MS datasets (n = 200 to 1000 human subjects) were used to illustrate the value of these indicators under real conditions.
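The intra-batch indicator can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes that each pooled QC sample is represented by a pair of coordinates (for example, scores on two components of a projection), which the abstract does not specify, and the names `convex_hull`, `polygon_area`, `intra_batch_dispersion` and `qc_scores` are hypothetical. The inter-batch gradient is not sketched because its exact definition is not given here.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def intra_batch_dispersion(batches):
    """Mean convex-hull area of the QC points, averaged over batches."""
    areas = [polygon_area(convex_hull(pts)) for pts in batches.values()]
    return sum(areas) / len(areas)


# Hypothetical 2D coordinates of pooled QC samples in two batches.
qc_scores = {
    "batch_1": [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)],
    "batch_2": [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)],
}
print(intra_batch_dispersion(qc_scores))
```

The hull area depends only on the outermost QC points of each batch, so no mean or median has to be estimated and no distribution has to be assumed for the points.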
RESULTS: The robustness assessment performed on synthetic data revealed good precision and stability. In the case studies, the relationships between these indicators and common dispersion parameters (Median Absolute Deviation, RSD, Interquartile Range, …) revealed different behaviours, showing that the indicators capture the variability observed under both parametric and non-parametric conditions. Moreover, this exploration showed that annotated metabolites differ in their patterns of sensitivity to analytical variability throughout the data processing steps. The proposed indicators also allowed the analytical drift to be visualised in two dimensions.
CONCLUSION: These indicators open the way to a better and more robust assessment of deviation in large untargeted metabolomic studies, and to improved long-term suitability testing.