Journal article: Infection, Genetics and Evolution, 2011

EuPathDomains: The Divergent Domain Database for Eukaryotic Pathogens

Abstract

Eukaryotic pathogens (e.g. Plasmodium, Leishmania, Trypanosomes) are a major source of morbidity and mortality worldwide. In Africa, one of the most impacted continents, they cause millions of deaths and constitute an immense economic burden. While the genome sequences of several of these organisms are now available, the biological functions of more than half of their proteins are still unknown. This seriously hampers the identification of the expected new therapeutic targets. In this context, the identification of protein domains is a key step toward improving the functional annotation of the proteins. However, several domains are missed in eukaryotic pathogens because of the high phylogenetic distance of these organisms from the classical eukaryote models. We recently proposed a method, co-occurrence domain detection (CODD), that improves the sensitivity of Pfam domain detection by exploiting the tendency of domains to appear preferentially with a few other favorite domains in a protein. In this paper, we present EuPathDomains (http://www.atgc-montpellier.fr/EuPathDomains/), an extended database of protein domains belonging to ten major eukaryotic human pathogens. EuPathDomains gathers known domains and new domains detected by CODD, along with the associated confidence measurements and the GO annotations that can be deduced from the new domains. This database significantly extends the Pfam domain coverage of all selected genomes by proposing new occurrences of domains as well as new domain families that have never been reported before. For example, with a false discovery rate lower than 20%, EuPathDomains increases the number of detected domains by 13% in the Toxoplasma gondii genome and by up to 28% in Cryptosporidium parvum, and the total number of domain families by 10% in Plasmodium falciparum and by up to 16% in the C. parvum genome.
The database can be queried by protein names, domain identifiers, Pfam or InterPro identifiers, or organisms, and should become a valuable resource for deciphering the protein functions of eukaryotic pathogens.
Main file
article100917.pdf (209.45 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-00540932, version 1 (29-11-2010)

Cite

Amel Ghouila, Nicolas Terrapon, Olivier Gascuel, Fatma Guerfali, Dhafer Laouini, et al. EuPathDomains: The Divergent Domain Database for Eukaryotic Pathogens. Infection, Genetics and Evolution, 2011, 11 (4), pp.698-707. ⟨10.1016/j.meegid.2010.09.008⟩. ⟨lirmm-00540932⟩