Transformation and model choice for RNA-seq co-expression analysis

Andrea Rau; Cathy Maugis Rabusseau

doi:10.1093/bib/bbw128

Journal article, Briefings in Bioinformatics, Year: 2018

Andrea Rau

Role: Corresponding author
PersonId : 744212
IdHAL : andrea-rau
ORCID : 0000-0001-6469-488X
IdRef : 196132118

Génétique Animale et Biologie Intégrative

Université Paris Saclay (COmUE)

Cathy Maugis Rabusseau

Role: Author
PersonId : 15433
IdHAL : cathy-maugis-rabusseau
ORCID : 0009-0006-3060-8481
IdRef : 130874329

Institut de Mathématiques de Toulouse UMR5219

Abstract

Although a large number of clustering algorithms have been proposed to identify groups of co-expressed genes from microarray data, the question of if and how such methods may be applied to RNA sequencing (RNA-seq) data remains unaddressed. In this work, we investigate the use of data transformations in conjunction with Gaussian mixture models for RNA-seq co-expression analyses, as well as a penalized model selection criterion to select both an appropriate transformation and number of clusters present in the data. This approach has the advantage of accounting for per-cluster correlation structures among samples, which can be strong in RNA-seq data. In addition, it provides a rigorous statistical framework for parameter estimation, an objective assessment of data transformations and number of clusters and the possibility of performing diagnostic checks on the quality and homogeneity of the identified clusters. We analyze four varied RNA-seq data sets to illustrate the use of transformations and model selection in conjunction with Gaussian mixture models. Finally, we propose a Bioconductor package coseq (co-expression of RNA-seq data) to facilitate implementation and visualization of the recommended RNA-seq co-expression analyses.
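The workflow summarized in the abstract (transform the data, fit Gaussian mixtures, compare candidate models with a penalized criterion) can be illustrated with a minimal, self-contained sketch. Everything specific here is an assumption made for illustration and is not taken from this record: the toy counts, the arcsine square-root transformation of per-gene expression profiles, the diagonal covariance (which, unlike the approach described above, ignores correlation among samples), and the use of BIC as a stand-in for the penalized criterion. It is not the coseq implementation, and it does not show how to compare transformations, which would additionally require putting the likelihoods on a common scale.

```python
import math
import random

def profiles(counts):
    """Convert each gene's counts into proportions across samples (rows sum to 1)."""
    out = []
    for row in counts:
        total = sum(row)
        out.append([y / total for y in row])
    return out

def arcsine(p_rows):
    """Arcsine square-root transform (assumed here); maps [0, 1] to [0, pi/2]."""
    return [[math.asin(math.sqrt(p)) for p in row] for row in p_rows]

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def fit_gmm(data, k, n_iter=50, var_floor=1e-4):
    """EM for a k-component diagonal Gaussian mixture; returns (loglik, labels)."""
    n, d = len(data), len(data[0])
    # Deterministic start: sort genes on the first coordinate, cut into k blocks.
    order = sorted(range(n), key=lambda i: data[i][0])
    resp = [[0.0] * k for _ in range(n)]
    for rank, i in enumerate(order):
        resp[i][rank * k // n] = 1.0
    loglik = 0.0
    for _ in range(n_iter):
        # M-step: update weights, means and (floored) variances.
        weights, means, variances = [], [], []
        for c in range(k):
            nc = sum(r[c] for r in resp) + 1e-10
            mean = [sum(r[c] * x[j] for r, x in zip(resp, data)) / nc
                    for j in range(d)]
            var = [max(var_floor,
                       sum(r[c] * (x[j] - mean[j]) ** 2
                           for r, x in zip(resp, data)) / nc)
                   for j in range(d)]
            weights.append(nc / n)
            means.append(mean)
            variances.append(var)
        # E-step: update responsibilities and the log-likelihood (log-sum-exp).
        loglik = 0.0
        for i, x in enumerate(data):
            logs = [math.log(weights[c]) + log_gauss_diag(x, means[c], variances[c])
                    for c in range(k)]
            top = max(logs)
            total = top + math.log(sum(math.exp(v - top) for v in logs))
            resp[i] = [math.exp(v - total) for v in logs]
            loglik += total
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return loglik, labels

def bic(loglik, k, n, d):
    """BIC (lower is better): k-1 weights plus k*d means and k*d variances."""
    n_params = (k - 1) + 2 * k * d
    return -2.0 * loglik + n_params * math.log(n)

# Hypothetical toy data: 30 genes, 4 samples, two opposite expression patterns.
rng = random.Random(0)
counts = []
for g in range(30):
    pattern = [300, 300, 100, 100] if g < 15 else [100, 100, 300, 300]
    counts.append([max(1, round(rng.gauss(m, 0.1 * m))) for m in pattern])

data = arcsine(profiles(counts))
bic_by_k = {}
for k in (1, 2, 3):
    loglik, _ = fit_gmm(data, k)
    bic_by_k[k] = bic(loglik, k, len(data), len(data[0]))
best_k = min(bic_by_k, key=bic_by_k.get)
print(bic_by_k, "selected K:", best_k)
```

Working on profiles (each gene's share of its total count across samples) rather than raw counts makes genes with similar patterns but different overall expression levels comparable; the variance floor is a crude guard against components collapsing onto a single gene.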

Keywords

RNA-seq co-expression

mixture models

data transformation

Subject areas

Life Sciences [q-bio] / Genetics / Animal genetics; Statistics [stat]

https://hal.inrae.fr/hal-02624483

Submitted on: Tuesday, 26 May 2020, 11:51:47

Last modified on: Wednesday, 30 October 2024, 18:17:43

Dates and versions

hal-02624483 , version 1 (26-05-2020)

Identifiers

HAL Id : hal-02624483 , version 1
DOI : 10.1093/bib/bbw128
PRODINRA : 413088
PUBMED : 28065917
WOS : 000432676200006

Cite

Andrea Rau, Cathy Maugis Rabusseau. Transformation and model choice for RNA-seq co-expression analysis. Briefings in Bioinformatics, 2018, 19 (3), pp.425-436. ⟨10.1093/bib/bbw128⟩. ⟨hal-02624483⟩

Collections

AGROPARISTECH UNIV-TLSE2 CNRS INSA-TOULOUSE INRA INSMI IMT UT1-CAPITOLE UNIV-PARIS-SACLAY INSA-GROUPE INRAE GENETIQUE_ANIMALE ANR GS-BIOSPHERA UNIV-UT3 UT3-TOULOUSEINP GABI
