FRCol: Face Recognition Based Speaker Video Colorization
Rory Ward, John Breslin
2025
Abstract
Automatic video colorization has recently gained attention for its ability to adapt old movies for the modern entertainment industry. A significant challenge remains, however: limiting unnatural color hallucination. Generative artificial intelligence often produces erroneous results, which in colorization manifest as unnatural colors. In this work, we propose to ground our automatic video colorization system in relevant exemplars by leveraging a face database, from which we retrieve exemplars using facial recognition technology. The retrieved exemplar guides the colorization performed by a latent-diffusion-based speaker video colorizer. We dub our system FRCol. We focus on speakers because humans have evolved to pay particular attention to certain aspects of colorization, human faces being one of them. We improve on the previous state-of-the-art (SOTA), DeOldify, by an average of 13% on the standard metrics of PSNR, SSIM, FID, and FVD on the Grid and Lombard Grid datasets. Our user study corroborates these results: FRCol was preferred to contemporary colorizers 81% of the time.
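The exemplar retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes each face is already represented by a fixed-length embedding and that the closest database entry under cosine similarity is chosen as the exemplar. The abstract does not state the embedding model or the similarity measure, so both are assumptions made for illustration, and the names `cosine_similarity`, `retrieve_exemplar`, and `face_database` are hypothetical rather than taken from FRCol.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_exemplar(query_embedding, face_database):
    """Return (identity, similarity) of the closest database entry.

    face_database maps an identity label to a face embedding.
    Assumption: nearest neighbour under cosine similarity; the
    paper's abstract does not specify the matching rule.
    """
    if not face_database:
        raise ValueError("face database is empty")
    best_id, best_sim = None, -2.0  # below the minimum cosine value of -1
    for identity, embedding in face_database.items():
        sim = cosine_similarity(query_embedding, embedding)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    return best_id, best_sim


# Toy embeddings, invented purely for illustration.
face_database = {
    "speaker_a": [0.9, 0.1, 0.0],
    "speaker_b": [0.0, 1.0, 0.2],
}
query = [0.8, 0.2, 0.1]  # embedding of a face from a grayscale frame
identity, similarity = retrieve_exemplar(query, face_database)
print(identity)
```

In a full system the returned identity would be used to look up a color reference image for that speaker, which would then condition the colorizer; that conditioning step is outside the scope of this sketch.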
Paper Citation
in Harvard Style
Ward R. and Breslin J. (2025). FRCol: Face Recognition Based Speaker Video Colorization. In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP; ISBN 978-989-758-728-3, SciTePress, pages 717-728. DOI: 10.5220/0013306800003912
in Bibtex Style
@conference{visapp25,
author={Rory Ward and John Breslin},
title={FRCol: Face Recognition Based Speaker Video Colorization},
booktitle={Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP},
year={2025},
pages={717-728},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013306800003912},
isbn={978-989-758-728-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP
TI - FRCol: Face Recognition Based Speaker Video Colorization
SN - 978-989-758-728-3
AU - Ward R.
AU - Breslin J.
PY - 2025
SP - 717
EP - 728
DO - 10.5220/0013306800003912
PB - SciTePress