Speeding up Online POMDP Planning - Unification of Observation Branches by Belief-state Compression Via Expected Feature Values

Gavin Rens

Abstract

A novel algorithm to speed up online planning in partially observable Markov decision processes (POMDPs) is introduced. I propose a method for compressing nodes in belief-decision-trees during planning. Whereas belief-decision-trees normally branch on both actions and observations, with my method they branch only on actions: the branches otherwise required by the nondeterminism of observations are unified into a single node. The unification is based on the expected values of domain features. The new algorithm is experimentally compared with three other online POMDP algorithms and outperforms them on the given test domain.
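To make the idea concrete, the following is a minimal sketch (not the paper's exact algorithm) of collapsing the observation branches under one action into a single belief node. It uses standard POMDP belief updates; the function names and the tiny two-state model are hypothetical, and the unified node here is the posterior expectation over observations, whereas the paper constructs it via expected domain-feature values.

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """Standard POMDP belief update: b'(s') ∝ Z[a][s'][o] * sum_s T[a][s][s'] * b(s).
    Returns the normalized posterior and P(o | b, a)."""
    pred = T[a].T @ b                 # predicted belief after action a
    post = Z[a][:, o] * pred          # weight by observation likelihood
    norm = post.sum()                 # norm = P(o | b, a)
    return (post / norm if norm > 0 else post), norm

def unified_child(b, a, T, Z, n_obs):
    """Collapse all observation branches under action a into one belief:
    the expectation of the posteriors, weighted by P(o | b, a).
    (Hypothetical sketch of branch unification; the paper matches
    expected feature values rather than averaging posteriors directly.)"""
    acc = np.zeros_like(b)
    for o in range(n_obs):
        post, p_o = belief_update(b, a, o, T, Z)
        acc += p_o * post             # weight each posterior by its probability
    return acc

# Tiny 2-state, 1-action, 2-observation example (made up for illustration)
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])   # T[a][s][s']
Z = np.array([[[0.7, 0.3], [0.4, 0.6]]])   # Z[a][s'][o]
b = np.array([0.5, 0.5])
print(unified_child(b, 0, T, Z, n_obs=2))  # a single belief summing to 1
```

Note that the observation-weighted average of posteriors equals the one-step predicted belief, so the tree needs only one child per action instead of one per action-observation pair; the cost is losing the distinctions between posteriors, which the paper's feature-expectation construction is designed to mitigate.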



Paper Citation


in Harvard Style

Rens G. (2015). Speeding up Online POMDP Planning - Unification of Observation Branches by Belief-state Compression Via Expected Feature Values. In Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-074-1, pages 241-246. DOI: 10.5220/0005165802410246


in Bibtex Style

@conference{icaart15,
author={Gavin Rens},
title={Speeding up Online POMDP Planning - Unification of Observation Branches by Belief-state Compression Via Expected Feature Values},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2015},
pages={241-246},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005165802410246},
isbn={978-989-758-074-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Speeding up Online POMDP Planning - Unification of Observation Branches by Belief-state Compression Via Expected Feature Values
SN - 978-989-758-074-1
AU - Rens G.
PY - 2015
SP - 241
EP - 246
DO - 10.5220/0005165802410246
ER -