Our method is therefore to consider the experimental scanpaths and, at each participant
fixation, to predict whether the paragraph would be abandoned or not. A very
good model would predict an abandon at the same time the participant stopped reading.
A bad model would abandon too early or too late.
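This evaluation can be sketched as follows. The sketch is an illustration under assumptions, not the procedure actually used: `predict_abandon` is a hypothetical per-fixation decision function, and the scanpath is simplified to the ordered list of words fixated during the first visit of the paragraph.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a model that, given the words fixated so far,
# says whether the paragraph should be abandoned now.
Predictor = Callable[[Sequence[str]], bool]


def first_predicted_abandon(scanpath: Sequence[str],
                            predict_abandon: Predictor) -> Optional[int]:
    """Return the 1-based index of the first fixation at which the model
    predicts an abandon, or None if it never does within the scanpath."""
    for i in range(1, len(scanpath) + 1):
        if predict_abandon(scanpath[:i]):
            return i
    return None


def timing_error(scanpath: Sequence[str],
                 predict_abandon: Predictor) -> Optional[int]:
    """Difference, in fixations, between the predicted and observed abandon.

    The participant is taken to abandon at the last fixation of the scanpath.
    0 means the model stops exactly when the participant did, a negative
    value means it stops too early, and None means it had still not stopped
    when the participant left the paragraph (too late).
    """
    predicted = first_predicted_abandon(scanpath, predict_abandon)
    if predicted is None:
        return None
    return predicted - len(scanpath)
```

With a toy predictor that abandons after three fixations, a five-fixation scanpath gives an error of -2 (two fixations too early).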
Paragraphs can be examined several times by participants during a trial, but we
restricted our analysis to first visits of the current paragraph. It is also worth noting
that the previous paragraph is not necessarily on the same stimulus page as the current
paragraph. It could have been seen on the previous stimulus page. This is, for instance, the
case for the left paragraph of Fig. 2, which has been processed with another paragraph in
mind, seen on the previous stimulus page.
3.1 Modeling Semantic Judgments
Such a decision-making model on paragraphs needs to be based on a model of semantic
memory that is able to mimic human judgments of semantic associations. We
used LSA to dynamically compute the semantic similarity between the goal and each
set of words that is assumed to have been fixated.
We assumed a linear exploration of words, although we know that this is not exactly
the case in information search (Chanceaux et al., [3]).
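The dynamic computation can be illustrated with a minimal sketch. It assumes that each word has a vector in an LSA space (the `vectors` dictionary here is hypothetical), that a set of words is represented by the sum of its word vectors, as is usual with LSA, and that words are fixated in linear order. Words without a vector are simply skipped, which is also an assumption.

```python
import numpy as np


def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity, defined as 0.0 when either vector is null."""
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / norm) if norm > 0 else 0.0


def running_similarities(goal_words, fixated_words, vectors):
    """Similarity between the goal and the words fixated so far,
    recomputed after each new fixation.

    `vectors` maps a word to its (hypothetical) LSA vector.
    """
    dim = len(next(iter(vectors.values())))
    # A set of words is represented by the sum of its word vectors.
    goal_vec = sum((vectors[w] for w in goal_words if w in vectors),
                   np.zeros(dim))
    seen_vec = np.zeros(dim)
    similarities = []
    for word in fixated_words:  # linear exploration assumed
        if word in vectors:
            seen_vec = seen_vec + vectors[word]
        similarities.append(cosine(goal_vec, seen_vec))
    return similarities
```

The function returns one similarity value per fixation, so the evolution of the goal relatedness along the scanpath can be inspected directly.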
3.2 Effect of the Prior Paragraph
The relatedness of the prior paragraph to the goal may play a role in the way the current
paragraph is processed. We suspected that if the prior paragraph is not related to the
goal, the current paragraph would be processed just to check whether it is relevant
or not. The prior paragraph would not play a role in that case. However, if the prior
paragraph is related to the goal, then the current paragraph may be processed with the
idea of comparing it to the previous one.
We therefore analyzed two extreme cases: either the words fixated in the prior paragraph
are strongly related to the goal, or they are not related to it at all. To separate these
cases, we used two thresholds of cosine similarity, set to 0.05 and 0.25. Paragraphs
whose semantic similarity with the goal falls in between were not considered. The first
case is called C—S (read the Current knowing that the previous one is Strong) and
the second one is called C—W (Current — Previous=Weak). We also analyzed cases
when no prior paragraph exists, called C—0 (Current — Nothing). Basic statistics show
that in terms of number of fixations, fixation duration and the shape of the scanpath,
C—W=C—0 and both are significantly different from C—S. It means that reading a
paragraph while the prior one is not related to the goal is similar to reading the very
first paragraph, without information about a prior paragraph.
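The three conditions above can be sketched as a small classifier. The sketch assumes that the lower threshold (0.05) bounds the weak case and the upper one (0.25) bounds the strong case; whether the boundaries themselves are included is not stated, so the inclusive comparisons are also an assumption. The labels "S", "W" and "0" stand for the three conditions on the prior paragraph.

```python
from typing import Optional

WEAK_MAX = 0.05    # assumed upper bound for a prior paragraph unrelated to the goal
STRONG_MIN = 0.25  # assumed lower bound for a prior paragraph strongly related to the goal


def prior_condition(prior_similarity: Optional[float]) -> Optional[str]:
    """Classify the current paragraph by the goal relatedness of the prior one.

    `prior_similarity` is the cosine between the goal and the words fixated
    in the prior paragraph, or None when there is no prior paragraph.
    Returns "S" (strong), "W" (weak), "0" (no prior paragraph), or None
    when the similarity falls between the thresholds (case not analyzed).
    """
    if prior_similarity is None:
        return "0"
    if prior_similarity <= WEAK_MAX:
        return "W"
    if prior_similarity >= STRONG_MIN:
        return "S"
    return None
```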
Therefore we will only consider the case C—S in this paper: reading a paragraph
with another one in mind which is highly related to the goal.
3.3 Modeling the Decision
Two Variables Involved. We first looked for the variables which could play a role in
the decision to stop reading a paragraph. Such a decision is made when the difference