
Using structure recurrence to define protein domains

Abstract: Domains are basic units of protein structure and are essential for exploring protein fold space and structure evolution. With the NIH Protein Structure Initiative and other structural genomics initiatives worldwide, the number of protein structures in the PDB is increasing dramatically, and domain parsing needs to be done automatically. Most existing structural domain parsing programs consider the compactness of the domains and/or the number and strength of internal (intra-domain) versus external (inter-domain) contacts. Here we present a completely different approach. Taking advantage of the growing number of known structures in the PDB, the chains are parsed solely by the recurrence of similar structures in the structural database. A non-redundant set of 6373 protein chains was selected as the target data set, and 128 benchmark chains from pDomains were used as query chains. For each query chain, one-against-all structure comparisons with the target set were performed using VAST. The VAST cliques were then collected, and the protein residues were clustered using mathematical procedures akin to those used for analyzing microarray data. These clusters define the domains. NDO scores were used to compare the results with SCOP and CATH domain boundaries as well as with those from other parsing programs. Our algorithm gave results comparable to those of several existing programs. It handles segmented domains as well as it handles non-segmented ones. The structures that contribute the cliques defining a domain may carry information about distant evolutionary relationships of the domain.
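The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not specify the clustering method, so the code below assumes a simple stand-in (Jaccard similarity of clique membership followed by connected-component grouping). The input format, the function name `cluster_residues`, and the `threshold` parameter are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cluster_residues(cliques, threshold=0.5):
    """Group residues that tend to occur in the same cliques.

    cliques: list of sets of query-residue indices, one set per clique
             (a hypothetical input format, not VAST's actual output).
    threshold: minimum Jaccard similarity needed to link two residues
               (an assumed parameter, chosen only for illustration).
    """
    # Map each residue to the ids of the cliques that contain it.
    membership = defaultdict(set)
    for clique_id, residues in enumerate(cliques):
        for residue in residues:
            membership[residue].add(clique_id)

    # Union-find structure over residues.
    parent = {residue: residue for residue in membership}

    def find(residue):
        while parent[residue] != residue:
            parent[residue] = parent[parent[residue]]  # path halving
            residue = parent[residue]
        return residue

    # Link two residues when their clique memberships overlap strongly.
    for a, b in combinations(sorted(membership), 2):
        shared = len(membership[a] & membership[b])
        total = len(membership[a] | membership[b])
        if shared / total >= threshold:
            parent[find(a)] = find(b)

    # Each connected component is treated as one putative domain.
    groups = defaultdict(list)
    for residue in sorted(membership):
        groups[find(residue)].append(residue)
    return sorted(groups.values())


# Toy data: residues 0-1 and 8-9 recur together, residues 3-6 recur together.
toy_cliques = [{0, 1, 8, 9}, {0, 1, 8, 9}, {3, 4, 5, 6}]
domains = cluster_residues(toy_cliques)
print(domains)
```

In this toy input the first group is discontinuous in sequence (residues 0-1 and 8-9), which shows how a grouping based purely on recurrence can yield a segmented domain without any contiguity constraint.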
Submitted by: Migration Prodinra
Submitted on: Thursday, June 4, 2020 - 03:00:20
Last modified on: Friday, July 3, 2020 - 19:30:46


Publisher files allowed on an open archive


  • HAL Id: hal-02758033, version 1
  • PRODINRA: 49824
  • WOS: 000208762004275



Chin-Hsien Tai, Sam Vichetra, Jean-François Gibrat, Peter Munson, Byungkook Lee, et al. Using structure recurrence to define protein domains. Biophysical Society 54th Annual Meeting, Feb 2010, San Francisco, United States. pp.CD, 2010, Proceedings of the Biophysical Society 54th Annual Meeting. ⟨hal-02758033⟩


