Toward comprehensive short utterances manipulations detection in videos

Abderrazzaq Moufidi; David Rousseau; Pejman Rasti

doi:https://doi.org/10.1007/s11042-024-20284-x

Journal article, Multimedia Tools and Applications, Year: 2024


Abderrazzaq Moufidi

Role: Author
PersonId: 1361227

Laboratoire Angevin de Recherche en Ingénierie des Systèmes

David Rousseau

Role: Author
PersonId: 1143745

Laboratoire Angevin de Recherche en Ingénierie des Systèmes

Institut de Recherche en Horticulture et Semences

Pejman Rasti

Role: Author
PersonId: 1049864

Laboratoire Angevin de Recherche en Ingénierie des Systèmes

Institut de Recherche en Horticulture et Semences

Abstract

In a landscape increasingly populated by convincing yet deceptive multimedia content generated through generative adversarial networks, there exists a significant challenge for both human interpretation and machine learning algorithms. This study introduces a shallow learning technique specifically tailored for analyzing visual and auditory components in videos, targeting the lower face region. Our method is optimized for ultra-short video segments (200-600 ms) and employs wavelet scattering transforms for audio and discrete cosine transforms for video. Unlike many approaches, our method excels at these short durations and scales efficiently to longer segments. Experimental results demonstrate high accuracy, achieving 96.83% for 600 ms audio segments and 99.87% for whole video sequences on the FakeAVCeleb and DeepfakeTIMIT datasets. This approach is computationally efficient, making it suitable for real-world applications with constrained resources. The paper also explores the unique challenges of detecting deepfakes in ultra-short sequences and proposes a targeted evaluation strategy for these conditions.
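
As a rough illustration of two ideas named in the abstract and keywords (discrete cosine transform features for the video stream, and late fusion of per-modality scores), the following is a minimal sketch, not the authors' pipeline. This record does not specify the patch size, the number of retained coefficients, or the fusion rule, so the 8x8 patch, `k`, `w_audio`, and all function names are hypothetical. The wavelet scattering step for audio is omitted.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(patch):
    """Separable 2-D DCT-II: transform each row, then each column."""
    rows = [dct_1d(r) for r in patch]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def low_freq_features(patch, k=4):
    """Flatten the k x k lowest-frequency coefficients (k is an assumption)."""
    coeffs = dct_2d(patch)
    return [coeffs[u][v] for u in range(k) for v in range(k)]

def late_fusion(audio_score, video_score, w_audio=0.5):
    """Weighted average of per-modality scores (hypothetical fusion rule)."""
    return w_audio * audio_score + (1.0 - w_audio) * video_score

# Hypothetical 8x8 grayscale patch standing in for a lower-face crop.
patch = [[1.0] * 8 for _ in range(8)]
features = low_freq_features(patch, k=2)
fused = late_fusion(audio_score=0.8, video_score=0.4)
```

In this sketch each modality would be scored by its own classifier, and only the resulting scores are combined, which is what distinguishes late fusion from concatenating features before classification.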

Keywords

Deepfake; Biometrics; Multimodality; Late fusion; Adversarial attacks; Presentation attacks

Domains

Computer Science [cs]; Image synthesis and virtual reality [cs.GR]

Main file

s11042-024-20284-x.pdf (801.82 KB)

Origin	Publisher files allowed on an open archive
License	Attribution

Contributor: Pejman Rasti

https://univ-angers.hal.science/hal-04752448

Submitted on: Thursday, October 24, 2024, 16:59:35

Last modified on: Monday, October 28, 2024, 17:17:54

Dates and versions

hal-04752448, version 1 (24-10-2024)

License

Attribution

Identifiers

HAL Id: hal-04752448, version 1
DOI: https://doi.org/10.1007/s11042-024-20284-x

Cite

Abderrazzaq Moufidi, David Rousseau, Pejman Rasti. Toward comprehensive short utterances manipulations detection in videos. Multimedia Tools and Applications, 2024, pp.1-14. ⟨https://doi.org/10.1007/s11042-024-20284-x⟩. ⟨hal-04752448⟩


Collections

UNIV-ANGERS LARIS IRHS INRAE LARIS-ISISV
