5 CONCLUSION AND FUTURE WORK
In this work, we presented an active vision problem in the context of RoboCup: controlling the head movements of a humanoid soccer robot. We formulated the problem as a Markov Decision Process and solved it with deep reinforcement learning, using the DQN algorithm. In the action-selection phase (at the beginning of each episode), we applied an entropy-minimising method to the UKF model responsible for robot localisation. The results of the trained model were presented and analysed. The proposed method operates without relying on the current belief about the environment, and was compared with previous works that use only entropy minimisation for real-time head control.
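The entropy-minimising action-selection criterion can be sketched as follows. For illustration, assume a discrete belief over a hypothetical grid of robot poses (the paper's UKF instead maintains a Gaussian belief, whose entropy grows with the determinant of the covariance); the head direction whose expected posterior belief has the lowest Shannon entropy is selected. All names and the toy numbers below are illustrative, not the paper's implementation:

```python
import math

def entropy(belief):
    """Shannon entropy (in nats) of a discrete belief over poses."""
    return -sum(p * math.log(p) for p in belief if p > 0)

def select_head_action(expected_posteriors):
    """Pick the head direction whose expected posterior belief has
    minimum entropy, i.e. the most informative view."""
    return min(expected_posteriors,
               key=lambda a: entropy(expected_posteriors[a]))

# Toy example: two candidate head directions and the posterior
# belief each is expected to produce.
posteriors = {
    "look_left":  [0.70, 0.10, 0.10, 0.10],  # peaked -> low entropy
    "look_right": [0.30, 0.30, 0.20, 0.20],  # flat -> high entropy
}
best = select_head_action(posteriors)  # "look_left"
```

For a Gaussian (UKF) belief the same criterion reduces to minimising the log-determinant of the posterior covariance, since the differential entropy of an n-dimensional Gaussian is 0.5·ln((2πe)^n · det Σ).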
Since we defined the problem as a Markov Decision Process, it can also be solved with more recent reinforcement-learning algorithms that support continuous action spaces, such as PPO (Schulman et al., 2017) and DDPG (Lillicrap et al., 2015). The performance of the method might also be improved by passing a rough representation of the robot's position to the network along with the image; this could mitigate the ambiguity between symmetric observations on the soccer field.
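The suggested augmentation can be sketched as concatenating a coarse one-hot pose encoding with the image input, so that visually symmetric views from opposite field halves map to distinct network inputs. Function names, the grid resolution, and the field dimensions below are assumptions for illustration, not part of the paper's method:

```python
import numpy as np

def build_observation(image, pose, grid_shape=(3, 4),
                      field_size=(6.0, 9.0)):
    """Append a coarse one-hot pose grid to a flattened image.
    `pose` is an (x, y) position on a field of `field_size` metres,
    quantised onto a `grid_shape` grid (illustrative values)."""
    rows, cols = grid_shape
    h, w = field_size
    # Quantise the position onto the coarse grid (clamped to bounds).
    r = min(int(pose[0] / h * rows), rows - 1)
    c = min(int(pose[1] / w * cols), cols - 1)
    grid = np.zeros(rows * cols, dtype=np.float32)
    grid[r * cols + c] = 1.0
    return np.concatenate([image.ravel().astype(np.float32), grid])

img = np.zeros((4, 4))                      # stand-in for a camera frame
obs_a = build_observation(img, (1.0, 1.0))  # one half of the field
obs_b = build_observation(img, (5.0, 8.0))  # mirrored position
```

Even with an identical image, the two observations differ in the appended pose channel, which is exactly the disambiguation the future-work paragraph proposes.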
REFERENCES
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z.,
Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin,
M., et al. (2016). Tensorflow: Large-scale machine
learning on heterogeneous distributed systems. arXiv
preprint arXiv:1603.04467.
Ammirato, P., Poirson, P., Park, E., Ko
ˇ
seck
´
a, J., and
Berg, A. C. (2017). A dataset for developing
and benchmarking active vision. In 2017 IEEE
International Conference on Robotics and Automation
(ICRA), pages 1378–1385. IEEE.
Bajcsy, R. (1988). Active perception. Proceedings of the
IEEE, 76(8):966–1005.
Burgard, W., Fox, D., and Thrun, S. (1997). Active
mobile robot localization by entropy minimization. In
Proceedings second euromicro workshop on advanced
mobile robots, pages 155–162. IEEE.
Chen, S., Li, Y., and Kwok, N. M. (2011). Active vision
in robotic systems: A survey of recent developments.
The International Journal of Robotics Research,
30(11):1343–1377.
Cheng, R., Agarwal, A., and Fragkiadaki, K. (2018).
Reinforcement learning of active vision for
manipulating objects under occlusions. arXiv
preprint arXiv:1811.08067.
Czarnetzki, S., Kerner, S., and Kruse, M. (2010). Real-time
active vision by entropy minimization applied to
localization. In Robot Soccer World Cup, pages
266–277. Springer.
Dhariwal, P., Hesse, C., Klimov, O., Nichol, A., Plappert,
M., Radford, A., Schulman, J., Sidor, S., Wu, Y., and
Zhokhov, P. (2017). OpenAI Baselines. https://github.com/openai/baselines.
Falanga, D., Mueggler, E., Faessler, M., and Scaramuzza,
D. (2017). Aggressive quadrotor flight through narrow
gaps with onboard sensing and computing using active
vision. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 5774–5781.
IEEE.
Han, X., Liu, H., Sun, F., and Zhang, X. (2019). Active
object detection with multistep action prediction using
deep q-network. IEEE Transactions on Industrial
Informatics, 15(6):3723–3731.
Hill, A., Raffin, A., Ernestus, M., Gleave, A., Kanervisto,
A., Traore, R., Dhariwal, P., Hesse, C., Klimov, O.,
Nichol, A., Plappert, M., Radford, A., Schulman, J.,
Sidor, S., and Wu, Y. (2018). Stable Baselines. https://github.com/hill-a/stable-baselines.
Kieras, D. E. and Hornof, A. J. (2014). Towards accurate
and practical predictive models of active-vision-based
visual search. In Proceedings of the SIGCHI
conference on human factors in computing systems,
pages 3875–3884.
Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez,
T., Tassa, Y., Silver, D., and Wierstra, D. (2015).
Continuous control with deep reinforcement learning.
arXiv preprint arXiv:1509.02971.
Mahmoudi, H., Fatehi, A., Gholami, A., Delavaran, M. H.,
Khatibi, S., Alaee, B., Tafazol, S., Abbasi, M., Doust,
M. Y., Jafari, A., et al. (2019). Mrl team description
paper for humanoid kidsize league of robocup 2019.
Mechatronics Research Lab, Department of Computer
and Electrical Engineering, Qazvin Islamic Azad
University, Qazvin, Iran.
Mattamala, M., Villegas, C., Yáñez, J. M., Cano, P., and
Ruiz-del Solar, J. (2015). A dynamic and efficient
active vision system for humanoid soccer robots. In
Robot Soccer World Cup, pages 316–327. Springer.
Michel, O. (2004). Cyberbotics ltd. webots™: professional
mobile robot simulation. International Journal of
Advanced Robotic Systems, 1(1):5.
Mitchell, J. F., Reynolds, J. H., and Miller, C. T.
(2014). Active vision in marmosets: a model system
for visual neuroscience. Journal of Neuroscience,
34(4):1183–1194.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A.,
Veness, J., Bellemare, M. G., Graves, A., Riedmiller,
M., Fidjeland, A. K., Ostrovski, G., et al. (2015).
Human-level control through deep reinforcement
learning. Nature, 518(7540):529–533.
Rezaei, M. and Klette, R. (2017). Computer vision for
driver assistance. Springer.
Rolfs, M. (2015). Attention in active vision: A perspective
on perceptual continuity across saccades. Perception,
44(8-9):900–919.
ICAART 2021 - 13th International Conference on Agents and Artificial Intelligence