Multimodal Deepfake Detection for Short Videos

Abderrazzaq Moufidi, David Rousseau, Pejman Rasti

2024

Abstract

This study addresses the growing challenge of AI-generated multimedia content that is persuasive but often misleading, and that is difficult for both humans and machine learning systems to interpret. Building upon our prior research, we analyze the visual and auditory elements of multimedia to identify multimodal deepfakes, with a specific focus on the lower facial area in video clips. This targeted approach sets our research apart in the complex field of deepfake detection. Our technique is particularly effective for short video clips, lasting from 200 milliseconds to one second, where it surpasses many current deep learning methods, which struggle at such durations. In our previous work, we used late fusion to correlate audio and lip movements and developed a novel method for video feature extraction that requires less computational power, making it a practical solution for real-world applications with limited computing resources. By adopting a multi-view strategy, the proposed network can exploit various weaknesses found in deepfake generation, from visual anomalies to motion inconsistencies or issues with jaw positioning, which are common in such content.
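The abstract refers to late fusion of audio and lip-movement information within a multi-view network. As a rough illustration only, the sketch below shows score-level late fusion, assuming each view (for example audio, lip frames, motion) has already produced a probability that a clip is fake. The function names, weights, and threshold are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def late_fusion(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-view fake probabilities (score-level late fusion).

    `scores` and `weights` are hypothetical inputs: one probability and one
    non-negative weight per view. This is not the authors' implementation.
    """
    if len(scores) != len(weights) or len(scores) == 0:
        raise ValueError("scores and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


def is_fake(scores: Sequence[float], weights: Sequence[float],
            threshold: float = 0.5) -> bool:
    """Flag a clip as fake when the fused score reaches the (assumed) threshold."""
    return late_fusion(scores, weights) >= threshold


if __name__ == "__main__":
    # Made-up per-view scores for one short clip: audio, lip frames, motion.
    view_scores = [0.9, 0.3, 0.6]
    view_weights = [1.0, 1.0, 1.0]
    print(late_fusion(view_scores, view_weights), is_fake(view_scores, view_weights))
```

In this sketch each view is scored independently and the views are combined only at the decision stage, which is what distinguishes late fusion from feature-level (early) fusion. How the paper actually weights or combines its views is not stated in the abstract.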


Paper Citation


in Harvard Style

Moufidi A., Rousseau D. and Rasti P. (2024). Multimodal Deepfake Detection for Short Videos. In Proceedings of the 4th International Conference on Image Processing and Vision Engineering - Volume 1: IMPROVE; ISBN 978-989-758-693-4, SciTePress, pages 67-73. DOI: 10.5220/0012557300003720


in Bibtex Style

@conference{improve24,
author={Abderrazzaq Moufidi and David Rousseau and Pejman Rasti},
title={Multimodal Deepfake Detection for Short Videos},
booktitle={Proceedings of the 4th International Conference on Image Processing and Vision Engineering - Volume 1: IMPROVE},
year={2024},
pages={67-73},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012557300003720},
isbn={978-989-758-693-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 4th International Conference on Image Processing and Vision Engineering - Volume 1: IMPROVE
TI - Multimodal Deepfake Detection for Short Videos
SN - 978-989-758-693-4
AU - Moufidi A.
AU - Rousseau D.
AU - Rasti P.
PY - 2024
SP - 67
EP - 73
DO - 10.5220/0012557300003720
PB - SciTePress