3 EXTRACTION OF RELATIONS
First, we discuss the influence of the lecturer's movements on the students' movements. The input-output relation between the two movements can be represented by the inputs $\bar{x}^{L}_{\mathrm{speech}}(t)$ (the loudness of the lecturer's speech), $x^{L}_{\mathrm{face}}(t)$ (the number of skin-colored pixels in the lecturer's face region), and $x^{L}_{\mathrm{hand}}(t)$ (the number of skin-colored pixels in the lecturer's hand region), and the output $x^{S}_{\mathrm{face}}(t,p)$ (the number of skin-colored pixels in the face region of the $p$-th student). Moreover, the two movements change over time $t$ and interact with each other with a time delay $\ell$. Therefore, we can model the two features by the following time-series model:
$$x^{S}_{\mathrm{face}}(t,p) = f^{LS}\bigl(x^{L}_{\mathrm{face}}(t-\ell),\, x^{L}_{\mathrm{hand}}(t-\ell),\, \bar{x}^{L}_{\mathrm{speech}}(t-\ell);\, w^{LS}_{ij}(p)\bigr), \qquad (3)$$
where $\ell = 1, 2, \cdots, T$; $T$ and $p$ denote the length of the objective section and the student number, respectively. In this equation, the function $f^{LS}(\cdot)$ and the values of the coefficients $w^{LS}_{ij}(p)$ are unknown. Therefore, we use a neural network model as shown in Fig. 3, in which the unknown coefficients $w^{LS}_{ij}(p)$ denote the weights.
[Figure 3: Neural network model for learning the influence of the lecturer's movements on the students' movements. Inputs: the lecturer's number of pixels in the face region, number of pixels in the hand region, and amplitude of speech at times $t, t-1, \dots, t-T$; output: the number of pixels in the face region of each student; weights: $\{w^{LS}_{ij}(p)\}$.]
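To make the structure of the model in Fig. 3 concrete, the following is a minimal sketch in Python/NumPy. It is our illustration only: the paper does not specify the architecture beyond the figure, so the single hidden layer, its size, and the sigmoid activation are assumptions, and the function names are ours.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def build_lagged_inputs(x_face, x_hand, x_speech, T):
        # Stack the lecturer features at delays t-1, ..., t-T into one
        # input vector per time step t, giving 3*T inputs as in Fig. 3.
        n = len(x_face)
        rows = []
        for t in range(T, n):
            lags = []
            for ell in range(1, T + 1):
                lags += [x_face[t - ell], x_hand[t - ell], x_speech[t - ell]]
            rows.append(lags)
        return np.asarray(rows)                # shape: (n - T, 3 * T)

    def predict(X, W1, W2):
        # One-hidden-layer approximation of f_LS for one student p;
        # W1 and W2 together play the role of the weights w_LS_ij(p).
        return sigmoid(sigmoid(X @ W1) @ W2)   # estimate of x_S_face(t, p)

Here X has one row per time step $t$ and one column per delayed input feature; training W1 and W2 on the recorded sequences (by back-propagation with forgetting, as described below) yields the coefficients $w^{LS}_{ij}(p)$ for each student.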
Neural network models have been widely applied to numerous fields owing to their non-linear mapping ability. Rumelhart et al. (Rumelhart and McClelland, 1986) proposed the back-propagation (BP) learning algorithm for neural networks. However, since neural networks trained by the BP algorithm form distributed representations across all hidden units, it is difficult to clearly explain the rules and knowledge they acquire.
To address this difficulty, Ishikawa (Ishikawa, 1996) proposed a structural learning algorithm with forgetting (SLF) for the purpose of clarifying the internal representations of neural networks. In the SLF algorithm, the weights of the neural network are updated so as to minimize the following error function, which includes the additive term $\varepsilon \sum |w_{ij}|$:
$$E_F = \sum_{k}\bigl(t(k) - o(k)\bigr)^{2} + \varepsilon \sum_{i,j} |w_{ij}|, \qquad (4)$$
where $t(k)$ and $o(k)$ denote the teaching signal and the output of the neural network for the $k$-th pattern, respectively. Moreover, $\varepsilon$ denotes the amount of forgetting for the weights, and the SLF algorithm can reduce the values of redundant weights by setting an adequate $\varepsilon$. As an index for the clarity of the neural network model, we introduce the entropy $H$ defined by
$$H = -\sum_{i,j} p_{ij} \log p_{ij},$$
where $p_{ij} = |w_{ij}| / \sum_{i,j} |w_{ij}|$. When $H$ becomes small, the number of redundant weights becomes small, and we can judge that the objective neural network model has a "clear" internal representation.
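A minimal sketch of the SLF error (4), one weight update, and the entropy index follows, again in NumPy (our illustration, assuming plain gradient descent; note that the gradient of the forgetting term $\varepsilon \sum |w_{ij}|$ is $\varepsilon\,\mathrm{sign}(w_{ij})$, so each step decays every weight toward zero by a small constant):

    import numpy as np

    def slf_error(t, o, weights, eps):
        # Error function (4): squared error plus the forgetting term.
        return np.sum((t - o) ** 2) + eps * sum(np.abs(W).sum() for W in weights)

    def slf_step(W, grad_sq_error, eta, eps):
        # One SLF update: gradient descent on the squared error plus
        # the constant decay eps * sign(W) coming from eps * |w_ij|.
        return W - eta * (grad_sq_error + eps * np.sign(W))

    def weight_entropy(weights):
        # H = -sum_ij p_ij log p_ij with p_ij = |w_ij| / sum_ij |w_ij|.
        # A small H means few dominant weights, i.e. a "clear"
        # internal representation.
        w = np.abs(np.concatenate([W.ravel() for W in weights]))
        p = w / w.sum()
        p = p[p > 0]                # drop exact zeros to avoid log(0)
        return -np.sum(p * np.log(p))

Under the $\mathrm{sign}(W)$ decay, redundant weights are driven to (near) zero during training, which is what reduces $H$.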
In the following section, we discuss the interaction between the lecturer and the students in an actual lecture by focusing on the weights of the neural network models.
4 ANALYSIS RESULTS
We recorded images and speech of the lecturer and students in a lecture (title: "Application of Internet technology to teaching Japanese as a foreign language"). In this lecture, the lecturer explained the outline by speech during the first 10 minutes, and we adopt the scene of these first 10 minutes. Fig. 4 shows the layout of the lecture room; the images of the lecturer and students were recorded by the two digital video cameras shown in the figure. These images were recorded at a rate of 10 [fps] and a size of 640 × 480 [pixels].
[Figure 4: Layout of the lecture room, showing the lecturer, the screen, the two cameras, and the seating positions of students 1-8.]
Moreover, we set the parameters as follows: the number of students: 8; $\Delta$ (section length for the moving average of speech): 0.5 [sec]; $\varepsilon_{\mathrm{Red}}$ (threshold for the detection of red-colored pixels): 80; $\Delta_{\mathrm{Green}}$ and $\Delta_{\mathrm{Blue}}$: 15 and 15; and $T$ (the length of the objective section): 10 [sec].
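For reference, these settings can be collected into a single configuration object (a hypothetical Python snippet; the key names are ours):

    params = {
        "num_students": 8,     # students in the scene
        "delta_speech": 0.5,   # Delta: moving-average window for speech [sec]
        "eps_red": 80,         # threshold for red-colored pixel detection
        "delta_green": 15,     # Delta_Green
        "delta_blue": 15,      # Delta_Blue
        "T": 10,               # length of the objective section [sec]
    }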