Practical Assumptions for Planning Under Uncertainty

Juan Carlos Saborío, Joachim Hertzberg

2017

Abstract

The (PO)MDP framework is a standard model for planning and decision-making under uncertainty, but the complexity of its solution methods makes it impractical for any reasonably large problem. In addition, task planning demands solutions that satisfy efficiency and quality criteria, which are often unachievable through optimizing methods. We propose an approach to planning that postpones optimality in favor of faster, satisficing behavior, supported by context-sensitive assumptions that allow an agent to reduce the dimensionality of its decision problems. We argue that a practical problem-solving agent may sometimes assume full observability and determinism, based on generalizations, domain knowledge and an attentional filter obtained through a formal understanding of "relevance", thereby exploiting the structure of problems and not just their representations.
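
As a rough illustration of the idea described in the abstract (not the paper's actual method), the hypothetical Python sketch below shows how a planner might gate its "practical assumptions" with a relevance filter: state variables judged relevant and uncertain stay in the belief space, while the rest are assumed observed and deterministic, shrinking the effective (PO)MDP. All names, scores and thresholds here are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class StateVariable:
    name: str
    relevance: float     # hypothetical score from domain knowledge / goal context
    uncertainty: float   # e.g., entropy of the agent's current belief

def apply_practical_assumptions(variables, rel_min=0.5, unc_min=0.1):
    """Partition state variables: keep relevant, uncertain ones in the belief
    space; assume the rest are known (full observability) and fixed
    (determinism), reducing the dimensionality of the decision problem."""
    stochastic, assumed_known = [], []
    for v in variables:
        if v.relevance >= rel_min and v.uncertainty > unc_min:
            stochastic.append(v)      # still planned over probabilistically
        else:
            assumed_known.append(v)   # collapsed to its most likely value
    return stochastic, assumed_known

if __name__ == "__main__":
    vars_ = [StateVariable("door_open", 0.9, 0.40),
             StateVariable("cup_color", 0.1, 0.60),
             StateVariable("robot_pose", 0.8, 0.02)]
    stochastic, known = apply_practical_assumptions(vars_)
    print("plan over belief:", [v.name for v in stochastic])
    print("assumed known:", [v.name for v in known])
```

In this sketch only `door_open` remains a belief variable; everything else is treated as if fully observable, which is the kind of context-sensitive reduction the abstract argues a practical agent may safely make.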



Paper Citation


in Harvard Style

Saborío J. and Hertzberg J. (2017). Practical Assumptions for Planning Under Uncertainty. In Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-220-2, pages 497-502. DOI: 10.5220/0006189004970502


in Bibtex Style

@conference{icaart17,
author={Juan Carlos Saborío and Joachim Hertzberg},
title={Practical Assumptions for Planning Under Uncertainty},
booktitle={Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2017},
pages={497-502},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006189004970502},
isbn={978-989-758-220-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Practical Assumptions for Planning Under Uncertainty
SN - 978-989-758-220-2
AU - Saborío J.
AU - Hertzberg J.
PY - 2017
SP - 497
EP - 502
DO - 10.5220/0006189004970502
ER -