Versions are numbered 1 and 2: version 1 is the
video summary and version 2 is the full video.
Scenes are numbered likewise: scene 1 is the overall
scene and scene 2 is the attention scene, that is, the
area around the slides and the speaker. The
measurements obtained include: number of fixations,
mean length of fixations, total sum of fixation lengths,
percentage of time fixated per scene, fixation count and
number of fixations per second.
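As a minimal sketch of how measurements of this kind could be derived from raw fixation records, assume each record carries a duration and a flag saying whether it falls inside the attention scene. The `Fixation` type, its field names and `fixation_metrics` are hypothetical and are not the eye-tracker's export format or the pipeline used in the study. The sketch also assumes that scene 1 (the overall scene) contains every fixation, while scene 2 contains only those in the attention area.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    # Hypothetical record layout, assumed for illustration only.
    duration_s: float        # fixation length in seconds
    in_attention_area: bool  # True if over the slides/speaker area


def fixation_metrics(fixations, scene, video_length_s):
    """Summarise fixations for one scene of one video version.

    scene 1 = overall scene (assumed to include every fixation),
    scene 2 = attention scene (only fixations in the attention area).
    """
    if scene == 1:
        durations = [f.duration_s for f in fixations]
    else:
        durations = [f.duration_s for f in fixations if f.in_attention_area]

    count = len(durations)
    total = sum(durations)
    return {
        "fixation_count": count,
        "mean_fixation_length_s": total / count if count else 0.0,
        "total_fixation_length_s": total,
        "percent_time_fixated": 100.0 * total / video_length_s,
        "fixations_per_second": count / video_length_s,
    }


# Illustrative values only; these are not data from the study.
sample = [
    Fixation(0.25, True),
    Fixation(0.5, False),
    Fixation(0.25, True),
    Fixation(0.5, True),
]
overall = fixation_metrics(sample, scene=1, video_length_s=10.0)
attention = fixation_metrics(sample, scene=2, video_length_s=10.0)
```

Normalising by the video length is what makes a short summary comparable with a much longer full presentation; multiplying `fixations_per_second` by 100 gives a per-100-seconds rate.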
Table 1 again shows that participants consistently
spend a higher proportion of the time fixating on the
scene for summaries than for the full presentation
videos. The same pattern appears, to an even larger
extent, in the fixation counts, which are consistently
higher for summaries than for full presentations.
This is further evidence of higher participant
engagement with video summaries than with full
presentation videos.
We can see that the number of fixations per second
is consistently higher for video summaries, while the
mean fixation length is consistently shorter for
summaries. As previous work has shown, an increased
number of shorter fixations is consistent with higher
cognitive activity (attention), while a reduced number
of longer fixations is consistent with lower attention
(Rayner and Sereno, 1994). This indicates that all
video summaries attract higher levels of participant
attention than the full presentations.
Table 2 shows a statistically significant (p < 0.05)
difference between the summary and full versions of
video 1 for the number of fixations per 100 seconds.
These results indicate that the summary of video 1
is more engaging than its full presentation. For
video 2, a statistically significant difference
(p < 0.05) is observed in the average fixation duration
per scene, and a smaller, not statistically significant,
difference in the fixation count per 100 seconds.
Participants still spend a higher proportion of their
time fixating on the attention scene for summaries
than for full presentations.
Video 3 results show a large difference between
the two scenes of the video: there is a statistically
significant (p < 0.1) difference in the percentage of
time spent fixating on the attention scene during the
summary compared with the full presentation. Video 4
shows a statistically significant difference between
the summary and full versions for the number of
fixations per 100 seconds (p < 0.05). This video also
shows a statistically significant difference (p < 0.1)
in the mean fixation duration between the full
presentation and the presentation summary. This
indicates that users found a much higher concentration
of new information in the summary than in the full
version. These differences can be inspected further by
looking back to Table 1.
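The text does not name the significance test behind the p-values in Table 2. Purely as an illustration of how a summary-versus-full comparison of a per-participant measure (for example, fixations per 100 seconds) could be tested, the sketch below uses an exact two-sided permutation test on the difference in group means. It is a stand-in technique, not the authors' method; it treats the two groups as independent, and the function name and sample values are hypothetical.

```python
from itertools import combinations


def permutation_p_value(summary, full):
    """Exact two-sided permutation test on the difference in means.

    Enumerates every way of relabelling the pooled values into groups
    of the original sizes and returns the share of relabellings whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(summary) + list(full)
    n_a = len(summary)
    n_b = len(pooled) - n_a
    pooled_sum = sum(pooled)

    def abs_mean_diff(indices):
        sum_a = sum(pooled[i] for i in indices)
        return abs(sum_a / n_a - (pooled_sum - sum_a) / n_b)

    observed = abs_mean_diff(range(n_a))
    extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_a):
        # Small tolerance guards against floating-point round-off.
        if abs_mean_diff(indices) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Invented values for illustration; these are not data from the study.
summary_rates = [212.0, 198.5, 240.1, 225.3]
full_rates = [180.2, 175.9, 201.4, 169.8]
p_value = permutation_p_value(summary_rates, full_rates)
```

Exhaustive enumeration is only practical for small groups, since the number of relabellings grows combinatorially; for larger samples a random subset of relabellings would normally be drawn instead.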
5.2 Gaze Plots
In this section, we show gaze plots from our eye-
tracking study. Gaze plots are data visualisations
which can communicate important aspects of visual
behaviour clearly. By looking carefully at plots for
full and summary videos, the difference in attention
and focus for different video types becomes more
clearly defined. For each video, 4 representative gaze
plots are chosen, 2 on top from full presentations, and
2 below from summaries.
From the representative gaze plots in Figure 1 we
can see that participants hold much higher levels of
attention during summaries than during full
presentations, with far fewer instances of them losing
focus or looking around the scene, instead focussing
entirely on the slides and speaker. The many small
circles over the slides area represent a large number
of shorter fixations, indicating high engagement.
From the representative plots in Figure 2 we see
large improvements in summaries over the full
presentations. While participants still lose focus on
occasion, and the improvement over full presentations
is not as marked as for the previous video, large
improvements are still gained, with the vast majority
of fixations taking place over the presentation slides
and the speaker. For comparison, gaze plots for the
full video show that fixations tended to be quite
dispersed.
From the representative plots in Figure 3 we see
how the number of occasions on which participants
lose focus is reduced, with a big improvement on full
presentations. Gaze plots show the difference for this
video much better than the statistical tests in the pre-
vious section do. For full presentations, fixations are
very dispersed with large numbers of fixations away
from the slides and speakers. Summaries show a
large improvement with a much reduced number of
instances of participants losing focus.
From the representative plots in Figure 4 we
can see that while summaries are imperfect, with in-
stances of participants losing attention, huge improve-
ments in attention and focus are made, although this
may depend on how engaging the videos were in the
first place. While summaries for Video 4 (speechRT6)
still show some instances of participants losing focus,
the original full presentation was found to be the least
engaging video of the dataset. This is also noticeable
from gaze plots. The gaze plots show a high number
of fixations away from the slides and presenters. Gaze
plots of summaries also show smaller fixations than
full presentation gaze plots, which indicates higher
levels of engagement for presentation summaries, in
addition to the obvious position of these fixations tak-
ing place predominantly over the slides and speakers.
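The visual impression that fixations cluster over the slides and speakers could be complemented by a single number: the share of fixation points that fall inside the attention area. The sketch below assumes fixations are available as (x, y) screen coordinates and that the attention area can be approximated by one rectangle; both the data layout and the name `share_inside` are assumptions made for illustration, not part of the study.

```python
def share_inside(points, area):
    """Fraction of (x, y) fixation points inside a rectangular area.

    `area` is (left, top, right, bottom) in the same coordinate
    system as the points; points on the border count as inside.
    """
    if not points:
        return 0.0
    left, top, right, bottom = area
    inside = sum(
        1 for x, y in points
        if left <= x <= right and top <= y <= bottom
    )
    return inside / len(points)


# Invented coordinates for illustration; not data from the study.
attention_area = (0, 0, 100, 100)
fixation_points = [(10, 10), (50, 50), (200, 200), (60, 40)]
share = share_inside(fixation_points, attention_area)
```

A dispersed gaze plot would give a low share and a focused one a share close to 1, which allows full and summary versions to be compared without relying on the plots alone.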
HUCAPP 2018 - International Conference on Human Computer Interaction Theory and Applications