Unreal Engine was used to emulate the fire environment, and AirSim was used to communicate data and controls between the virtual environment and the deep learning model. The agent was able to successfully navigate extreme fires based on its acquired knowledge and experience.
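For context, the following is a minimal Python sketch of the kind of control loop AirSim exposes: the client pulls camera observations out of the running Unreal Engine scene and sends motion commands back to the agent. The multirotor client, the camera name ("0"), the discrete action set, and the random placeholder policy are illustrative assumptions on our part, not the paper's exact configuration.

```python
# Minimal sketch of an AirSim observation/action loop (assumptions noted above).
import numpy as np
import airsim

client = airsim.MultirotorClient()   # connect to the running Unreal/AirSim instance
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)
client.takeoffAsync().join()

def get_observation():
    """Grab a scene image from the simulated camera as a NumPy array."""
    response = client.simGetImages([
        airsim.ImageRequest("0", airsim.ImageType.Scene, False, False)
    ])[0]
    img = np.frombuffer(response.image_data_uint8, dtype=np.uint8)
    # Channel count (3 or 4) depends on the AirSim version, so infer it.
    return img.reshape(response.height, response.width, -1)

def apply_action(action):
    """Map a discrete action index to a short velocity command (NED frame)."""
    vx, vy = [(1, 0), (-1, 0), (0, 1), (0, -1)][action]
    client.moveByVelocityZAsync(vx, vy, z=-5, duration=0.5).join()

obs = get_observation()
for step in range(100):
    action = np.random.randint(4)    # stand-in for the trained policy
    apply_action(action)
    obs = get_observation()
```

In the actual system, the random action above would be replaced by the trained model's action for the current observation.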
This work serves as the foundation on which to build a deep learning framework that is capable of identifying objects within the environment and incorporating those objects into its decision-making process in order to successfully deliver safe, navigable routes to firefighters.
The learning process is currently slow and needs several hours of training. In the future, we aim to utilize A2C- and A3C-based reinforcement learning models to train a shared model used in parallel by multiple agents with multiple goals simultaneously, as sketched below. We also aim to use deep learning-based results such as object detection, tracking, and segmentation to create a more informative situational awareness map of the reconstructed 3D scene.
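As a conceptual illustration of the A3C pattern mentioned above, the sketch below runs several workers, each with its own local copy of the network, that compute gradients from their own experience and asynchronously apply them to one shared global model. The network architecture, the placeholder loss, and all hyperparameters are illustrative assumptions; a real worker would compute an advantage-based actor-critic loss from rollouts in its own copy of the fire environment.

```python
# Conceptual sketch of A3C-style training with a shared global model.
# Everything environment- and loss-related is a placeholder (see lead-in).
import threading
import torch
import torch.nn as nn

class Policy(nn.Module):
    def __init__(self, obs_dim=4, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, x):
        return self.net(x)

global_model = Policy()
optimizer = torch.optim.Adam(global_model.parameters(), lr=1e-3)
lock = threading.Lock()  # true A3C applies updates lock-free; a lock keeps the sketch simple

def worker(worker_id, steps=1000):
    local_model = Policy()
    for _ in range(steps):
        local_model.load_state_dict(global_model.state_dict())  # sync with shared weights
        obs = torch.randn(1, 4)            # placeholder observation from this worker's env
        logits = local_model(obs)
        # Placeholder loss: a real worker would use the advantage-weighted
        # policy-gradient loss plus a value-function loss from a rollout.
        loss = -logits.log_softmax(-1)[0, 0]
        local_model.zero_grad()
        loss.backward()
        with lock:                         # push local gradients into the shared model
            for gp, lp in zip(global_model.parameters(), local_model.parameters()):
                gp.grad = lp.grad.clone()
            optimizer.step()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each worker explores with its own goal and environment instance, the resulting updates are decorrelated, which is what makes a single shared model trainable in parallel.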
The proposed system is intended to be integrated into a geographic and visual environment with floor plan data. This environment will also include scene information about fire locations, doors, windows, detected firefighters, and the health condition of the firefighters, together with other features collected from the sensors carried in the firefighter gear. These data will be transmitted over a robust communication system to an incident commander to produce a full-fledged situational awareness system.
ACKNOWLEDGEMENTS
This work was supported by the National Science Foundation (NSF) Smart & Connected Communities (S&CC) Early-Concept Grants for Exploratory Research (EAGER) program under Grant 1637092. We would like to thank the UNM Center for Advanced Research Computing, supported in part by the National Science Foundation, for providing the high-performance computing, large-scale storage, and visualization resources used in this work. We would also like to thank Sophia Thompson for her valuable suggestions and contributions to the edits of the final drafts.