5 CONCLUSION & OUTLOOK
In this work we presented the scenario of an uncertain obstacle in automated driving. The example was used to motivate and explain the use of POMDPs. We derived its components and explained the functionality of the ABT solver. Even though the chosen scenario was kept simple, it is complex enough to expose five different difficulties that have to be overcome when solving a real-world problem. Using the simpler variant with a fixed obstacle position, we demonstrated the impact of the UCT-factor, which balances exploration against exploitation, and suggested using a suitable estimate for the Q-value function. Extending the scenario to continuous hidden states and observations introduced further problems: we showed the need to discretize observations in order to prevent a degenerate tree, and pointed out the influence of the particle filter. Lastly, we explained the advantage of using a heuristic function as a first estimate for a belief value.
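As a brief illustration of the exploration-exploitation trade-off governed by the UCT-factor, the following sketch shows UCB1-style action selection as used in UCT-based solvers. The node representation and the name `ucb1_select` are our own illustrative choices, not part of any particular solver's API.

```python
import math

def ucb1_select(node, c):
    """Return the child action maximizing Q + c * sqrt(ln(N) / n).

    c is the UCT-factor: a large c favours rarely tried actions
    (exploration), a small c favours actions with a high estimated
    Q-value (exploitation).
    """
    log_n = math.log(node["visits"])
    return max(
        node["children"],
        key=lambda ch: ch["q"] + c * math.sqrt(log_n / ch["visits"]),
    )
```

With c = 0 the rule is purely greedy with respect to the current Q-estimates; increasing c shifts the selection toward actions that have been tried less often.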
Even though we did not solve a pressing problem in this work, we hope to pave the way for others into POMDP-based behavior planning by providing insight into the underlying mechanisms. Especially with the advent of parallelized solver algorithms (Cai et al., 2018), which promise to alleviate the major drawback of computational burden, we expect to find POMDPs in more and more applications. Apart from speeding up the algorithms, we see a need for further research on handling continuous observations. Smart discretization combined with progressive widening may help in that regard.
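The progressive widening idea mentioned above can be stated in a few lines. The criterion below follows Couëtoux et al. (2011); the function name and the default parameters are illustrative assumptions.

```python
def may_widen(num_children, num_visits, k=1.0, alpha=0.5):
    """Progressive widening: only open a new observation branch while
    the number of children is at most k * N^alpha (N = node visits).

    This keeps the branching factor bounded for continuous observation
    spaces, so the search tree does not degenerate into one leaf per
    sampled observation.
    """
    return num_children <= k * num_visits ** alpha
```

A node thus has to accumulate visits before further observation branches are opened, which allows the search to go deeper below the already existing branches instead of spreading ever wider.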
ACKNOWLEDGMENT
This research was supported by AUDI AG.
REFERENCES
Cai, P., Luo, Y., Hsu, D., and Lee, W. S. (2018). Hyp-despot:
A hybrid parallel algorithm for online planning under
uncertainty.
Couëtoux, A., Hoock, J.-B., Sokolovska, N., Teytaud, O., and Bonnard, N. (2011). Continuous upper confidence trees. In International Conference on Learning and Intelligent Optimization, pages 433–445. Springer.
González, D. S., Garzón, M., Dibangoye, J., and Laugier, C. (2019). Human-like decision-making for automated driving in highways. In 2019 IEEE 22nd International Conference on Intelligent Transportation Systems (ITSC).
Hubmann, C., Quetschlich, N., Schulz, J., Bernhard, J., Al-
thoff, D., and Stiller, C. (2019). A pomdp maneuver
planner for occlusions in urban scenarios. In 2019
IEEE Intelligent Vehicles Symposium (IV), pages 2172–
2179.
Hubmann, C., Schulz, J., Becker, M., Althoff, D., and Stiller,
C. (2018a). Automated driving in uncertain environ-
ments: Planning with interaction and uncertain ma-
neuver prediction. IEEE Transactions on Intelligent
Vehicles, 3(1):5–17.
Hubmann, C., Schulz, J., Xu, G., Althoff, D., and Stiller, C.
(2018b). A belief state planner for interactive merge
maneuvers in congested traffic. In 2018 IEEE 21st
International Conference on Intelligent Transportation
Systems (ITSC), pages 1617–1624.
Klimenko, D., Song, J., and Kurniawati, H. (2014). Tapir: A
software toolkit for approximating and adapting pomdp
solutions online. In Proceedings of the Australasian
Conference on Robotics and Automation, Melbourne,
Australia, volume 24.
Kocsis, L. and Szepesvári, C. (2006). Bandit based monte-carlo planning. In European conference on machine learning, pages 282–293.
Kurniawati, H. and Yadav, V. (2016). An online pomdp
solver for uncertainty planning in dynamic environ-
ment. In Robotics Research, pages 611–629. Springer.
Papadimitriou, C. H. and Tsitsiklis, J. N. (1987). The com-
plexity of markov decision processes. Mathematics of
operations research, 12(3):441–450.
Russell, S. J. (1998). Learning agents for uncertain environ-
ments. In COLT, volume 98, pages 101–103.
Schörner, P., Töttel, L., Doll, J., and Zöllner, J. M. (2019). Predictive trajectory planning in situations with hidden road users using partially observable markov decision processes. In 2019 IEEE Intelligent Vehicles Symposium (IV), pages 2299–2306.
Silver, D. and Veness, J. (2010). Monte-carlo planning
in large pomdps. In Advances in neural information
processing systems (NIPS), pages 2164–2172.
Somani, A., Ye, N., Hsu, D., and Lee, W. S. (2013).
Despot: Online pomdp planning with regularization.
In Advances in neural information processing systems
(NIPS), pages 1772–1780.
Sunberg, Z. N. and Kochenderfer, M. J. (2018). Online
algorithms for pomdps with continuous state, action,
and observation spaces. In Twenty-Eighth International
Conference on Automated Planning and Scheduling.
Thrun, S., Burgard, W., and Fox, D. (2005). Probabilistic
robotics. MIT press.
Treiber, M. and Kesting, A. (2013). Traffic Flow Dynamics:
Data, Models and Simulation. Springer.
Ye, N., Somani, A., Hsu, D., and Lee, W. S. (2017). Despot:
Online pomdp planning with regularization. Journal
of Artificial Intelligence Research, 58:231–266.
Tutorial on Sampling-based POMDP-planning for Automated Driving