
coverage tasks by addressing spatial and resource constraints. Its ability to integrate enriched observations, prioritize meaningful experiences, and promote adaptive exploration makes it a promising solution for real-world applications in structured and resource-constrained environments.
ACKNOWLEDGMENTS
The second author was supported in part by a research grant from Google.
REFERENCES
Atınç, G. M., Stipanović, D. M., and Voulgaris, P. G. (2020). A swarm-based approach to dynamic coverage control of multi-agent systems. IEEE Transactions on Control Systems Technology, 28(5):2051–2062.
Burgard, W., Moors, M., Stachniss, C., and Schneider, F. E. (2005). Coordinated multi-robot exploration. IEEE Transactions on Robotics, 21(3):376–386.
Chen, X., Tucker, T. M., Kurfess, T. R., and Vuduc, R. (2019a). Adaptive deep path: efficient coverage of a known environment under various configurations. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 3549–3556. IEEE.
Chen, Z., Subagdja, B., and Tan, A.-H. (2019b). End-to-end deep reinforcement learning for multi-agent collaborative exploration. In 2019 IEEE International Conference on Agents (ICA), pages 99–102. IEEE.
Choi, H.-B., Kim, J.-B., Han, Y.-H., Oh, S.-W., and Kim, K. (2022). MARL-based cooperative multi-AGV control in warehouse systems. IEEE Access, 10:100478–100488.
Garaffa, L. C., Basso, M., Konzen, A. A., and de Freitas, E. P. (2021). Reinforcement learning for mobile robotics exploration: A survey. IEEE Transactions on Neural Networks and Learning Systems, 34(8):3796–3810.
Gazi, V. and Passino, K. M. (2003). Stability analysis of swarms. IEEE Transactions on Automatic Control, 48(4):692–697.
Ghaddar, A. and Merei, A. (2020). EAOA: Energy-aware grid-based 3D-obstacle avoidance in coverage path planning for UAVs. Future Internet, 12(2):29.
Hu, J., Niu, H., Carrasco, J., Lennox, B., and Arvin, F. (2020). Voronoi-based multi-robot autonomous exploration in unknown environments via deep reinforcement learning. IEEE Transactions on Vehicular Technology, 69(12):14413–14423.
Jin, Y., Zhang, Y., Yuan, J., and Zhang, X. (2019). Efficient multi-agent cooperative navigation in unknown environments with interlaced deep reinforcement learning. In ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2897–2901. IEEE.
Khamis, A., Hussein, A., and Elmogy, A. (2015). Multi-robot task allocation: A review of the state-of-the-art. In Cooperative Robots and Sensor Networks 2015, pages 31–51. Springer.
Li, W., Zhao, T., and Dian, S. (2022). Multirobot coverage path planning based on deep Q-network in unknown environment. Journal of Robotics, 2022(1):6825902.
Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., and Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in Neural Information Processing Systems, 30.
Luo, T., Subagdja, B., Wang, D., and Tan, A.-H. (2019). Multi-agent collaborative exploration through graph-based deep reinforcement learning. In 2019 IEEE International Conference on Agents (ICA), pages 2–7. IEEE.
Orr, J. and Dutta, A. (2023). Multi-agent deep reinforcement learning for multi-robot applications: A survey. Sensors, 23(7):3625.
Sanghvi, N., Niyogi, R., and Milani, A. (2024). Sweeping-based multi-robot exploration in an unknown environment using Webots. In ICAART (1), pages 248–255.
Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2015). Prioritized experience replay. arXiv preprint arXiv:1511.05952.
Setyawan, G. E., Hartono, P., and Sawada, H. (2022). Cooperative multi-robot hierarchical reinforcement learning. Int. J. Adv. Comput. Sci. Appl., 13:35–44.
Stentz, A. (1994). Optimal and efficient path planning for partially-known environments. In Proceedings of the 1994 IEEE International Conference on Robotics and Automation, pages 3310–3317. IEEE.
Tan, C. S., Mohd-Mokhtar, R., and Arshad, M. R. (2021). A comprehensive review of coverage path planning in robotics using classical and heuristic algorithms. IEEE Access, 9:119310–119342.
Tran, V. P., Garratt, M. A., Kasmarik, K., Anavatti, S. G., and Abpeikar, S. (2022). Frontier-led swarming: Robust multi-robot coverage of unknown environments. Swarm and Evolutionary Computation, 75:101171.
Yamauchi, B. (1997). A frontier-based approach for autonomous exploration. In Proceedings 1997 IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA'97): Towards New Computational Principles for Robotics and Automation, pages 146–151. IEEE.
Yanguas-Rojas, D. and Mojica-Nava, E. (2017). Exploration with heterogeneous robots networks for search and rescue. IFAC-PapersOnLine, 50(1):7935–7940.
Zhang, H., Cheng, J., Zhang, L., Li, Y., and Zhang, W. (2022). H2GNN: Hierarchical-hops graph neural networks for multi-robot exploration in unknown environments. IEEE Robotics and Automation Letters, 7(2):3435–3442.
Zhelo, O., Zhang, J., Tai, L., Liu, M., and Burgard, W. (2018). Curiosity-driven exploration for mapless navigation with deep reinforcement learning. arXiv preprint arXiv:1804.00456.
Efficient Multi-Agent Exploration in Area Coverage Under Spatial and Resource Constraints