proposes a procedure for adding contextual information
that can be generalized to include data other than the
object used during an action. Additionally, the present
approach shows that
the best results are obtained when kernels from spatial,
temporal, and tool information are combined into a
multichannel SVM kernel. In this respect, the highest
recognition rate is 71.57%, obtained using a combination
of trajectories, HOG, and object information. In the near
future we plan to add more contextual information, namely
the scene, in order to improve the results.
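As an illustration of how kernels from several channels can be merged into a single multichannel kernel, the following is a minimal sketch. It assumes the chi-square multichannel form often used in bag-of-features action recognition, K(i, j) = exp(-sum_c D_c(i, j) / A_c), where D_c is the chi-square distance in channel c and A_c is the mean distance in that channel. This section does not state the exact kernel form, so that choice, the function names, and the toy histograms are all assumptions made for illustration only.

```python
import math


def chi2_distance(h1, h2, eps=1e-10):
    """Chi-square distance between two non-negative histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))


def channel_distances(samples):
    """Pairwise chi-square distance matrix for one channel."""
    n = len(samples)
    return [[chi2_distance(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]


def multichannel_kernel(channels):
    """Combine channels as K = exp(-sum_c D_c / A_c).

    `channels` maps a channel name to a list of histograms, one per sample.
    A_c is taken here as the mean of all entries of D_c (an assumption).
    """
    n = len(next(iter(channels.values())))
    total = [[0.0] * n for _ in range(n)]
    for samples in channels.values():
        dist = channel_distances(samples)
        # Fall back to 1.0 if every sample in the channel is identical.
        mean = sum(map(sum, dist)) / (n * n) or 1.0
        for i in range(n):
            for j in range(n):
                total[i][j] += dist[i][j] / mean
    return [[math.exp(-v) for v in row] for row in total]


# Toy data (hypothetical): three samples described by three channels.
channels = {
    "trajectories": [[0.5, 0.5, 0.0], [0.4, 0.6, 0.0], [0.0, 0.1, 0.9]],
    "hog": [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9]],
    "object": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
}
K = multichannel_kernel(channels)
```

The resulting Gram matrix could then be passed to any SVM implementation that accepts a precomputed kernel. Normalizing each channel by its mean distance keeps one channel from dominating the sum simply because its distances are on a larger scale.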
This research has been partially supported by the
Industrial Doctorate program of the Government of
Catalonia, and by the European Community through
the FP7 framework program, which funds the Vinbot
project (No. 605630) conducted by Ateknea Solutions
