COLLECTIVE DECISION UNDER PARTIAL OBSERVABILITY - A Dynamic Local Interaction Model

Arnaud Canu; Abdel-Illah Mouaddib

doi:10.5220/0003643801460155

2011

Abstract

This paper introduces DyLIM, a new model for describing partially observable multiagent decision-making problems under uncertainty. DyLIM deals with local interactions among the agents and builds the collective behavior from individual ones. Such problems are usually described using collaborative stochastic games, but that model makes the strong assumption that every agent interacts with all the other agents all the time. With DyLIM, we relax this assumption to better fit real-life applications, by considering that agents interact only some of the time and only with some of the other agents. We can then describe the multiagent problem as a set of individual problems (sometimes interdependent), which allows us to break the combinatorial complexity. We introduce two solving algorithms for this model and evaluate them on a set of dedicated benchmarks. We then show how our approach derives near-optimal policies for problems involving a large number of agents.

Paper Citation

in Harvard Style

Canu A. and Mouaddib A. (2011). COLLECTIVE DECISION UNDER PARTIAL OBSERVABILITY - A Dynamic Local Interaction Model. In Proceedings of the International Conference on Evolutionary Computation Theory and Applications - Volume 1: ECTA, (IJCCI 2011) ISBN 978-989-8425-83-6, pages 146-155. DOI: 10.5220/0003643801460155

in Bibtex Style

@conference{ecta11,
author={Arnaud Canu and Abdel-Illah Mouaddib},
title={COLLECTIVE DECISION UNDER PARTIAL OBSERVABILITY - A Dynamic Local Interaction Model},
booktitle={Proceedings of the International Conference on Evolutionary Computation Theory and Applications - Volume 1: ECTA, (IJCCI 2011)},
year={2011},
pages={146-155},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003643801460155},
isbn={978-989-8425-83-6},
}

in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Evolutionary Computation Theory and Applications - Volume 1: ECTA, (IJCCI 2011)
TI - COLLECTIVE DECISION UNDER PARTIAL OBSERVABILITY - A Dynamic Local Interaction Model
SN - 978-989-8425-83-6
AU - Canu A.
AU - Mouaddib A.
PY - 2011
SP - 146
EP - 155
DO - 10.5220/0003643801460155