Authors:
Long Nguyen-Phuoc 1,2; Renald Gaboriau 2; Dimitri Delacroix 2 and Laurent Navarro 1
Affiliations:
1 Mines Saint-Étienne, University of Lyon, University Jean Monnet, Inserm, U 1059 Sainbiose, Centre CIS, 42023 Saint-Étienne, France
2 MJ Lab, MJ INNOV, 42000 Saint-Etienne, France
Keyword(s):
Cognitive Load Assessment, Multimodal-Multitask Learning, Multihead Attention.
Abstract:
This paper introduces the M&M model, a novel multimodal-multitask learning framework, applied to the AVCAffe dataset for cognitive load assessment (CLA). M&M uniquely integrates audiovisual cues through a dual-pathway architecture, featuring specialized streams for audio and video inputs. A key innovation lies in its cross-modality multihead attention mechanism, which fuses the two modalities for synchronized multitasking. Another notable feature is the model's three specialized branches, each tailored to a specific cognitive load label, enabling nuanced, task-specific analysis. While its performance is modest compared to AVCAffe's single-task baseline, M&M demonstrates a promising framework for integrated multimodal processing. This work paves the way for future enhancements in multimodal-multitask learning systems, emphasizing the fusion of diverse data types for complex task handling.
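The abstract outlines a dual-pathway, cross-attention, multi-branch design. The following PyTorch sketch illustrates how such a structure could be wired together; the encoder projections, layer sizes, number of heads, and the three task heads are illustrative assumptions and do not reproduce the authors' exact configuration.

```python
import torch
import torch.nn as nn


class MMSketch(nn.Module):
    """Hypothetical sketch of a dual-pathway, cross-modality multihead
    attention model with three task-specific branches, in the spirit of
    the architecture described in the abstract (dimensions are assumed)."""

    def __init__(self, audio_dim=128, video_dim=512, d_model=256,
                 n_heads=4, n_tasks=3, n_classes=2):
        super().__init__()
        # Dual-pathway encoders: one projection stream per modality.
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.video_proj = nn.Linear(video_dim, d_model)
        # Cross-modality multihead attention: each modality attends to the other.
        self.a2v_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.v2a_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Three specialized branches, one per cognitive load label.
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(2 * d_model, d_model), nn.ReLU(),
                          nn.Linear(d_model, n_classes))
            for _ in range(n_tasks)
        )

    def forward(self, audio_feats, video_feats):
        # audio_feats: (B, Ta, audio_dim); video_feats: (B, Tv, video_dim)
        a = self.audio_proj(audio_feats)
        v = self.video_proj(video_feats)
        # Fuse modalities via cross-attention, then pool over time.
        a_fused, _ = self.a2v_attn(a, v, v)   # audio queries, video keys/values
        v_fused, _ = self.v2a_attn(v, a, a)   # video queries, audio keys/values
        joint = torch.cat([a_fused.mean(dim=1), v_fused.mean(dim=1)], dim=-1)
        # One prediction per cognitive load task (synchronized multitasking).
        return [head(joint) for head in self.heads]


if __name__ == "__main__":
    model = MMSketch()
    audio = torch.randn(2, 50, 128)   # dummy audio features
    video = torch.randn(2, 30, 512)   # dummy video features
    outputs = model(audio, video)
    print([o.shape for o in outputs])  # three (2, n_classes) logit tensors
```

In this sketch, fusion happens symmetrically (audio attends to video and vice versa) before the pooled joint representation is routed to the per-label branches; the actual M&M model may order or parameterize these steps differently.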