Chen, K., Wang, J., Pang, J., et al. (2019b). MMDetection:
Open MMLab Detection Toolbox and Benchmark.
arXiv:1906.07155.
Dong, Z., Li, G., Liao, Y., Wang, F., Ren, P., and Qian, C.
(2020). CentripetalNet: Pursuing High-Quality Keypoint
Pairs for Object Detection. In IEEE CVPR.
Everingham, M., Van Gool, L., Williams, C. K. I., Winn,
J., and Zisserman, A. (2012). The PASCAL Visual
Object Classes Challenge 2012 (VOC2012) Results.
Fang, H.-S., Sun, J., Wang, R., Gou, M., Li, Y.-L., and Lu,
C. (2019). InstaBoost: Boosting Instance Segmentation
via Probability Map Guided Copy-Pasting. In
IEEE ICCV.
Fedus, W., Zoph, B., and Shazeer, N. (2021). Switch
Transformers: Scaling to Trillion Parameter Models with
Simple and Efficient Sparsity. arXiv:2101.03961.
Girshick, R. (2015). Fast R-CNN. In 2015 IEEE ICCV.
Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014).
Rich Feature Hierarchies for Accurate Object Detection
and Semantic Segmentation. In 2014 CVPR.
Google Static Maps (2021). Google Maps JavaScript API
Version 3 Reference.
Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup,
D., and Meger, D. (2018). Deep Reinforcement Learning
That Matters. In AAAI.
Hsieh, M.-R., Lin, Y.-L., and Hsu, W. H. (2017).
Drone-based Object Counting by Spatially Regularized
Regional Proposal Networks. In IEEE ICCV.
Jiao, L. and Zhao, J. (2019). A Survey on the New
Generation of Deep Learning in Image Processing. IEEE
Access.
Kumar, R., Dabral, R., and Sivakumar, G. (2021). Learning
Unsupervised Cross-domain Image-to-Image Translation
using a Shared Discriminator. In Vol. 4: VISAPP.
Law, H. and Deng, J. (2018). CornerNet: Detecting Objects
as Paired Keypoints. In 15th ECCV 2018.
Li, K., Wan, G., Cheng, G., Meng, L., and Han, J. (2020a).
Object Detection in Optical Remote Sensing Images:
A Survey and a New Benchmark. ISPRS.
Li, X., Yang, J., et al. (2020b). Generalized Focal Loss:
Learning Qualified and Distributed Bounding Boxes
for Dense Object Detection. arXiv:2006.04388.
Lin, T., Maire, M., Zitnick, C. L., et al. (2014).
Microsoft COCO: Common Objects in Context. CoRR,
abs/1405.0312.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P.
(2017). Focal Loss for Dense Object Detection. In
IEEE ICCV.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.,
Fu, C.-Y., and Berg, A. C. (2016). SSD: Single Shot
MultiBox Detector. In ECCV 2016.
Liu, Z., Guo, B., et al. (2021). Swin Transformer:
Hierarchical Vision Transformer using Shifted Windows.
CoRR, abs/2103.14030.
Long, Y., Gong, Y., Xiao, Z., and Liu, Q. (2017). Accurate
Object Localization in Remote Sensing Images Based
on Convolutional Neural Networks. IEEE TGRS.
Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliez, P.
(2017). Can Semantic Labeling Methods Generalize
to Any City? The Inria Aerial Image Labeling Benchmark.
In IEEE IGARSS.
Qiao, S., Chen, L., and Yuille, A. (2020). DetectoRS:
Detecting Objects with Recursive Feature Pyramid and
Switchable Atrous Convolution. arXiv:2006.02334.
Redmon, J., Divvala, S., Girshick, R., and Farhadi, A.
(2016). You Only Look Once: Unified, Real-Time
Object Detection. In 2016 IEEE CVPR.
Redmon, J. and Farhadi, A. (2017). YOLO9000: Better,
Faster, Stronger. In 2017 IEEE CVPR.
Redmon, J. and Farhadi, A. (2018). YOLOv3: An Incremental
Improvement. CoRR, abs/1804.02767.
Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster
R-CNN: Towards Real-Time Object Detection with Region
Proposal Networks. In NeurIPS, volume 28.
Rossi, L., Karimi, A., and Prati, A. (2020). A Novel Region
of Interest Extraction Layer for Instance Segmentation.
CoRR, abs/2004.13665.
Russakovsky, O., Fei-Fei, L., et al. (2015). ImageNet
Large Scale Visual Recognition Challenge. IJCV.
Shermeyer, J., Hossler, T., Van Etten, A., Hogan, D.,
Lewis, R., and Kim, D. (2020). RarePlanes: Synthetic
Data Takes Flight. In-Q-Tel - CosmiQ Works
and AI.Reverie.
Sun, P., Luo, P., et al. (2020). Sparse R-CNN:
End-to-End Object Detection with Learnable Proposals.
arXiv:2011.12450.
Tian, Z., Shen, C., Chen, H., and He, T. (2019). FCOS:
Fully Convolutional One-Stage Object Detection.
arXiv:1904.01355.
Tundia, C., Tank, P., and Damani, O. (2020). Aiding
Irrigation Census in Developing Countries by Detecting
Minor Irrigation Structures from Satellite Imagery. In
6th Inter. Conf. on GISTAM.
Vu, T., Jang, H., Pham, T. X., and Yoo, C. D. (2019).
Cascade RPN: Delving into High-Quality Region
Proposal Network with Adaptive Convolution. In
NeurIPS.
Weir, N., Lindenbaum, D., Bastidas, A., Etten, A., Kumar,
V., Mcpherson, S., Shermeyer, J., and Tang, H. (2019).
SpaceNet MVOI: A Multi-View Overhead Imagery
Dataset. In 2019 IEEE/CVF ICCV.
Yang, D. M. (2018). ITCVD Dataset. DANS. Faculty of
GIS and Earth Observation (ITC).
Zhang, H., Chang, H., Ma, B., Wang, N., and Chen,
X. (2020). Dynamic R-CNN: Towards High
Quality Object Detection via Dynamic Training.
arXiv:2004.06002.
Zhang, X., Wan, F., Liu, C., Ji, R., and Ye, Q. (2019).
FreeAnchor: Learning to Match Anchors for Visual
Object Detection. In NeurIPS.
Zheng, Z., Ye, R., Wang, P., Wang, J., Ren, D., and Zuo, W.
(2021). Localization Distillation for Object Detection.
arXiv:2102.12252.
Zhu, X., Cheng, D., Zhang, Z., Lin, S., and Dai, J. (2019).
An Empirical Study of Spatial Attention Mechanisms
in Deep Networks. arXiv:1904.05873.
Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020).
Deformable DETR: Deformable Transformers for
End-to-End Object Detection. CoRR, abs/2010.04159.
VISAPP 2022 - 17th International Conference on Computer Vision Theory and Applications