3DSAL: An Efficient 3D-CNN Architecture for Video Saliency Prediction

Yasser Abdelaziz Dahou Djilali; Mohamed Sayah; Kevin McGuinness; Noel E. O’Connor

doi:10.5220/0008875600270036

2020

Abstract

In this paper, we propose a novel 3D CNN architecture that enables us to train an effective video saliency prediction model. The model is designed to capture important motion information using multiple adjacent frames. Our model performs a cubic convolution on a set of consecutive frames to extract spatio-temporal features. This enables us to predict the saliency map for any given frame using past frames. We comprehensively investigate the performance of our model with respect to state-of-the-art video saliency models. Experimental results on three large-scale datasets, DHF1K, UCF-SPORTS and DAVIS, demonstrate the competitiveness of our approach.
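To make the idea of convolving over a stack of consecutive frames concrete, the following is a minimal sketch of a single 3D ("cubic") convolution applied to a short clip. It is not the 3DSAL architecture itself, which the abstract does not specify in detail. The function `conv3d_valid`, the toy clip, and the temporal-difference kernel are all hypothetical and chosen only to show how one kernel spanning several frames responds to motion.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Valid-mode 3D cross-correlation of a (T, H, W) clip with a (kt, kh, kw) kernel.

    Illustrative only: a real 3D-CNN layer would add channels, learned
    weights, padding, strides, and a non-linearity.
    """
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # Each output value mixes information across space AND time.
                out[t, y, x] = np.sum(clip[t:t + kt, y:y + kh, x:x + kw] * kernel)
    return out

# Hypothetical clip: 4 consecutive 5x5 frames with one bright pixel
# moving one step to the right per frame.
clip = np.zeros((4, 5, 5))
for t in range(4):
    clip[t, 2, t] = 1.0

# Hypothetical kernel spanning 2 frames: subtracts the earlier frame
# from the later one, so it responds only where something changed.
kernel = np.zeros((2, 1, 1))
kernel[0, 0, 0] = -1.0
kernel[1, 0, 0] = 1.0

features = conv3d_valid(clip, kernel)
print(features.shape)
```

With this hand-picked kernel, a static clip yields an all-zero feature volume, while the moving pixel yields non-zero responses at its old and new positions. A trained model would learn such spatio-temporal filters from data rather than use a fixed one.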

Paper Citation

in Harvard Style

Djilali Y., Sayah M., McGuinness K. and O’Connor N. (2020). 3DSAL: An Efficient 3D-CNN Architecture for Video Saliency Prediction. In Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 4: VISAPP; ISBN 978-989-758-402-2, SciTePress, pages 27-36. DOI: 10.5220/0008875600270036

in Bibtex Style

@conference{visapp20,
author={Yasser Abdelaziz Dahou Djilali and Mohamed Sayah and Kevin McGuinness and Noel E. O’Connor},
title={3DSAL: An Efficient 3D-CNN Architecture for Video Saliency Prediction},
booktitle={Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 4: VISAPP},
year={2020},
pages={27-36},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008875600270036},
isbn={978-989-758-402-2},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 4: VISAPP
TI - 3DSAL: An Efficient 3D-CNN Architecture for Video Saliency Prediction
SN - 978-989-758-402-2
AU - Djilali Y.
AU - Sayah M.
AU - McGuinness K.
AU - O’Connor N.
PY - 2020
SP - 27
EP - 36
DO - 10.5220/0008875600270036
PB - SciTePress