REFERENCES
Bengio, Y., Simard, P., and Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166.
Chicago (2018). Chicago Data Portal: Taxi Trips. https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew.
Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Moschitti, A., Pang, B., and Daelemans, W., editors, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734, Doha, Qatar. Association for Computational Linguistics.
Gammelli, D., Yang, K., Harrison, J., Rodrigues, F., Pereira, F. C., and Pavone, M. (2022). Graph meta-reinforcement learning for transferable autonomous mobility-on-demand. arXiv preprint arXiv:2202.07147.
Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8):1735–1780.
Konda, V. and Tsitsiklis, J. (1999). Actor-critic algorithms. Advances in Neural Information Processing Systems, 12.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84–90.
Li, J. and Allan, V. H. (2022a). T-Balance: A unified mechanism for taxi scheduling in a city-scale ride-sharing service. In ICAART (2), pages 458–465.
Li, J. and Allan, V. H. (2022b). Where to go: Agent guidance with deep reinforcement learning in a city-scale online ride-hailing service. In 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), pages 1943–1948. IEEE.
Lin, K., Zhao, R., Xu, Z., and Zhou, J. (2018). Efficient large-scale fleet management via multi-agent deep reinforcement learning. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1774–1783.
Lin, L.-J. (1992). Reinforcement learning for robots using neural networks. Carnegie Mellon University.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., and Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning, pages 1928–1937. PMLR.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529–533.
Munkres, J. (1957). Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics, 5(1):32–38.
Sanchez-Lengeling, B., Reif, E., Pearce, A., and Wiltschko, A. B. (2021). A gentle introduction to graph neural networks. Distill. https://distill.pub/2021/gnn-intro.
Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2015). Prioritized experience replay. arXiv preprint arXiv:1511.05952.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
Sun, J., Jin, H., Yang, Z., Su, L., and Wang, X. (2022). Optimizing long-term efficiency and fairness in ride-hailing via joint order dispatching and driver repositioning. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 3950–3960.
Sutton, R. S. and Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
Sutton, R. S., McAllester, D., Singh, S., and Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems, 12.
Tesauro, G. et al. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3):58–68.
UN, D. (2015). World urbanization prospects: The 2014 revision. United Nations Department of Economic and Social Affairs, Population Division: New York, NY, USA, 41.
Van Hasselt, H., Guez, A., and Silver, D. (2016). Deep reinforcement learning with double Q-learning. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI'16, pages 2094–2100.
Wang, G., Zhong, S., Wang, S., Miao, F., Dong, Z., and Zhang, D. (2021). Data-driven fairness-aware vehicle displacement for large-scale electric taxi fleets. In 2021 IEEE 37th International Conference on Data Engineering (ICDE), pages 1200–1211. IEEE.
Wang, Z., Qin, Z., Tang, X., Ye, J., and Zhu, H. (2018). Deep reinforcement learning with knowledge transfer for online rides order dispatching. In 2018 IEEE International Conference on Data Mining (ICDM), pages 617–626. IEEE.
Wang, Z., Schaul, T., Hessel, M., Hasselt, H., Lanctot, M., and Freitas, N. (2016). Dueling network architectures for deep reinforcement learning. In International Conference on Machine Learning, pages 1995–2003. PMLR.
Wen, J., Zhao, J., and Jaillet, P. (2017). Rebalancing shared mobility-on-demand systems: A reinforcement learning approach. In 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), pages 220–225. IEEE.
Xu, Z., Li, Z., Guan, Q., Zhang, D., Li, Q., Nan, J., Liu, C., Bian, W., and Ye, J. (2018). Large-scale order dispatch in on-demand ride-hailing platforms: A learning and planning approach. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.