we would like to improve the precision of the speaker
detection by combining the content-aware analysis
and eye-tracking data for better presentation of sub-

We thank S. Kawamura, T. Kato and T. Fukusato
(Waseda University, Japan) for their advice. This
research was supported by JST ACCEL and CREST.
