
 
Recognition of 1st-level scenes can serve as the basis of the following method for recognizing s-level scenes $v_s^1, v_s^2, \ldots$.
Step 1. Each object $l_1, l_2, \ldots, l_r$ that is included in at least one 1st-level scene $v_1$ is recognized by separate comparison with the reference objects $l_1^{k_{l_1}}, l_2^{k_{l_2}}, \ldots, l_r^{k_{l_r}}$, where $k_{l_r} = 1, \ldots, K_{l_r}$ and $\{l_1, l_2, \ldots, l_r\} \subseteq \{1, \ldots, L\}$, using the fusion operators $A_{l_1}, A_{l_2}, \ldots, A_{l_r}$.
Step 2. Each 1st-level scene $v_1$ for which similar reference objects have been found for all of its objects is considered recognized, and a similarity criterion (the value of the fusion operator $A_{v_1}$) is extracted for it. After this, go to step 3. If no such scenes are found, there are no recognized scenes of the first or higher levels, and execution stops.
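The fusion operators $A$ are not spelled out in this excerpt; one standard fuzzy-measure-based aggregation is the discrete Choquet integral. A minimal sketch, in which the modality names and the fuzzy-measure values are purely illustrative assumptions:

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of per-modality scores w.r.t. a fuzzy measure.

    scores:  dict mapping modality name -> confidence in [0, 1]
    measure: dict mapping frozenset of modality names -> fuzzy measure value,
             with measure[frozenset()] == 0 and measure of the full set == 1.
    """
    names = sorted(scores, key=scores.get)   # modalities in ascending score order
    total, prev = 0.0, 0.0
    for i, name in enumerate(names):
        coalition = frozenset(names[i:])     # modalities scoring at least this much
        total += (scores[name] - prev) * measure[coalition]
        prev = scores[name]
    return total

# Illustrative measure: speech alone is weighted more than gesture alone.
mu = {frozenset(): 0.0,
      frozenset({'speech'}): 0.7,
      frozenset({'gesture'}): 0.5,
      frozenset({'speech', 'gesture'}): 1.0}
fused = choquet_integral({'speech': 0.9, 'gesture': 0.6}, mu)  # 0.6*1.0 + 0.3*0.7 = 0.81
```

Because the measure is defined on coalitions of modalities rather than on single modalities, interactions between modalities can raise or lower the aggregate beyond a weighted mean.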
Step 3. The level value is set to s = 2, and we go to step 4.
Step 4. If s-level scenes $v_s$ are found for which nonzero similarity-criterion values have been obtained for all of their (s-1)-level scenes, then these scenes $v_s$ are considered recognized, and similarity criteria (fusion-operator values) $A_{v_s}$ are calculated for them. If any (s+1)-level scenes exist, step 4 is executed again with s = s + 1; otherwise execution stops.
If no s-level scenes $v_s$ are found for which nonzero similarity-criterion values have been obtained for all of their (s-1)-level scenes, then there are no recognized s-level scenes, and execution stops.
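Steps 1-4 can be sketched as a bottom-up pass over the scene hierarchy. The data layout and function names below are illustrative assumptions, not taken from the paper:

```python
def recognize_hierarchy(scenes_by_level, object_similarity, fuse):
    """Bottom-up recognition of an s-level scene hierarchy (steps 1-4).

    scenes_by_level[1]: level-1 scenes, each {"id": ..., "objects": [...]}.
    scenes_by_level[s]: level-s scenes (s > 1), each {"id": ..., "children": [...]},
                        where children are (s-1)-level scenes.
    object_similarity(obj): similarity of obj to its best reference object (0 if none).
    fuse(values):           fusion-operator value for a list of child criteria.
    Returns a dict: scene id -> similarity criterion, for recognized scenes only.
    """
    criteria = {}
    # Steps 1-2: a level-1 scene is recognized if every object matched a reference.
    for scene in scenes_by_level[1]:
        sims = [object_similarity(obj) for obj in scene["objects"]]
        if sims and all(v > 0 for v in sims):
            criteria[scene["id"]] = fuse(sims)
    if not criteria:
        return criteria                       # no level-1 scenes recognized: stop
    # Steps 3-4: climb levels while scenes exist and something is still recognized.
    s = 2
    while s in scenes_by_level:
        recognized_here = {}
        for scene in scenes_by_level[s]:
            child_vals = [criteria.get(c["id"], 0.0) for c in scene["children"]]
            if child_vals and all(v > 0 for v in child_vals):
                recognized_here[scene["id"]] = fuse(child_vals)
        if not recognized_here:
            break                             # no s-level scenes recognized: stop
        criteria.update(recognized_here)
        s += 1
    return criteria
```

With `fuse=min` and a toy two-level hierarchy, a level-2 scene is recognized only when every one of its level-1 child scenes was recognized, exactly as step 4 requires.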
5 CONCLUSIONS 
We have considered a developed method that uses fuzzy fusion operators with a fuzzy measure for data fusion in a multimodal interface. The main advantages of the method over well-known analogs are the following:
- The ability to perform hierarchical multimodal recognition of scenes that consist of static and dynamic (moving) objects.
- The ability to take the importance of each modality into account during hierarchical scene recognition, owing to the use of fusion operators based on a fuzzy measure.
- The ability to serve as a basis for developing systems that control various objects (robots, computers, TV sets, etc.) with the help of dynamic patterns.
- The ability to increase the reliability of recognition of individual objects in the scene (for example, a human being) by using the relations between these objects and other objects of the scene (background objects).
- Promising opportunities for the development of intelligent and intuitive human-machine interfaces through the use of more modalities and relations.
BIODEVICES 2013 - International Conference on Biomedical Electronics and Devices