Attention-Based Fusion of Ultrashort Voice Utterances and Depth Videos for Multimodal Person Identification - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal Article, Sensors, Year: 2023

Attention-Based Fusion of Ultrashort Voice Utterances and Depth Videos for Multimodal Person Identification

Abstract

Multimodal deep learning, in the context of biometrics, encounters significant challenges due to the dependence on long speech utterances and RGB images, which are often impractical in certain situations. This paper presents a novel solution addressing these issues by leveraging ultrashort voice utterances and depth videos of the lip for person identification. The proposed method utilizes an amalgamation of residual neural networks to encode depth videos and a Time Delay Neural Network architecture to encode voice signals. In an effort to fuse information from these different modalities, we integrate self-attention and engineer a noise-resistant model that effectively manages diverse types of noise. Through rigorous testing on a benchmark dataset, our approach exhibits superior performance over existing methods, resulting in an average improvement of 10%. This method is notably efficient for scenarios where extended utterances and RGB images are unfeasible or unattainable. Furthermore, its potential extends to various multimodal applications beyond just person identification.
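
To make the fusion idea in the abstract concrete, here is a minimal sketch of single-head self-attention applied to two modality embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function name `self_attention_fusion`, the embedding sizes, the random projection matrices, and the mean pooling step are all hypothetical, and the encoders (residual networks for depth video, a Time Delay Neural Network for voice) are replaced by random vectors.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_fusion(voice_emb, video_emb, w_q, w_k, w_v):
    """Fuse two modality embeddings with scaled dot-product self-attention.

    Hypothetical helper: each modality embedding is treated as one token,
    the two tokens attend to each other, and the attended tokens are
    averaged into a single fused vector.
    """
    tokens = np.stack([voice_emb, video_emb])        # shape (2, d)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # shape (2, 2)
    weights = softmax(scores, axis=-1)               # each row sums to 1
    attended = weights @ v                           # shape (2, d_k)
    return attended.mean(axis=0), weights


# Stand-ins for encoder outputs (assumed sizes, random values).
rng = np.random.default_rng(0)
d, d_k = 8, 4
voice_emb = rng.normal(size=d)    # would come from the voice encoder
video_emb = rng.normal(size=d)    # would come from the depth-video encoder
w_q, w_k, w_v = (rng.normal(size=(d, d_k)) for _ in range(3))

fused, weights = self_attention_fusion(voice_emb, video_emb, w_q, w_k, w_v)
```

In a trained model the projection matrices would be learned, and the fused vector would feed a classifier over speaker identities. The attention weights indicate how much each modality draws on the other, which is one plausible way a model could down-weight a noisy modality.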
Main file
2023_Moufidi_Sensors.pdf (911.78 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-04494313, version 1 (07-03-2024)

Cite

Abderrazzaq Moufidi, David Rousseau, Pejman Rasti. Attention-Based Fusion of Ultrashort Voice Utterances and Depth Videos for Multimodal Person Identification. Sensors, 2023, 23 (13), pp.5890. ⟨10.3390/s23135890⟩. ⟨hal-04494313⟩