
segmenting tree canopies and trunks in street-view
images. The approach integrated attention mecha-
nisms and joint convolutions with atrous spatial pyra-
mid pooling into the U-Net architecture. Joint con-
volutions enhanced convergence, showing competi-
tive semantic segmentation results while significantly
reducing network parameters compared to baseline
methods.
Future work will focus on assembling larger and richer
datasets to advance research in tree structure segmentation,
enhance the model’s capacity, and propose a computer-aided
method to support and accelerate tree structural analysis.
ACKNOWLEDGEMENTS
This study was financed, in part, by the São Paulo Research
Foundation (FAPESP), Brasil. Process Numbers
#2013/07375-0, #2014/12236-1, #2019/07665-4,
#2019/18287-0, #2023/10823-6, and #2023/14427-8.
The authors also thank CNPq grants 308529/2021-9
and 400756/2024-2, and Petrobras grant 2023/00466-1.
This work was jointly supported by the Office of
Naval Research (ONR) with grant No. N62909-24-1-2012
and by the Air Force Office of Scientific Research (AFOSR).
REFERENCES
Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and
Yuille, A. L. (2017). DeepLab: Semantic image segmentation
with deep convolutional nets, atrous convolution, and fully
connected CRFs. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 40(4):834–848.
Chollet, F. (2017). Xception: Deep learning with depthwise
separable convolutions. In Proceedings of the IEEE
conference on computer vision and pattern recogni-
tion, pages 1251–1258.
de Lima Araújo, H. C., Martins, F. S., Cortese, T. T. P.,
and Locosselli, G. M. (2021). Artificial intelligence in
urban forestry—A systematic review. Urban Forestry
& Urban Greening, 66:127410.
Deluzet, M., Erudel, T., Briottet, X., Sheeren, D., and Fabre,
S. (2022). Individual Tree Crown Delineation Method
Based on Multi-Criteria Graph Using Geometric and
Spectral Information: Application to Several Temper-
ate Forest Sites. Remote Sensing, 14(5):1083.
Guo, M.-H., Xu, T.-X., Liu, J.-J., Liu, Z.-N., Jiang, P.-T.,
Mu, T.-J., Zhang, S.-H., Martin, R. R., Cheng, M.-M.,
and Hu, S.-M. (2022). Attention mechanisms in com-
puter vision: A survey. Computational Visual Media,
pages 1–38.
Jodas, D. S., Brazolin, S., Yojo, T., De Lima, R. A., Velasco,
G. D. N., Machado, A. R., and Papa, J. P. (2021). A
Deep Learning-based Approach for Tree Trunk Seg-
mentation. In 2021 34th SIBGRAPI Conference on
Graphics, Patterns and Images (SIBGRAPI), pages
370–377. IEEE.
Jodas, D. S., Passos, L. A., Velasco, G. D. N., Longo, M.
H. C., Machado, A. R., and Papa, J. P. (2022a). Multi-
class Oversampling via Optimum-Path Forest for Tree
Species Classification from Street-view Perspectives.
In 35th Conference on Graphics, Patterns and Images
(SIBGRAPI), pages 1–6. IEEE.
Jodas, D. S., Velasco, G. D. N., de Lima, R. A., Machado,
A. R., and Papa, J. P. (2023). Deep learning seman-
tic segmentation models for detecting the tree crown
foliage. In Radeva, P., Farinella, G. M., and Boua-
touch, K., editors, Proceedings of the 18th Interna-
tional Joint Conference on Computer Vision, Imag-
ing and Computer Graphics Theory and Applica-
tions, VISIGRAPP 2023, Volume 4: VISAPP, Lis-
bon, Portugal, February 19-21, 2023, pages 143–150.
SCITEPRESS.
Jodas, D. S., Yojo, T., Brazolin, S., Velasco, G. D. N., and
Papa, J. P. (2022b). Detection of Trees on Street-View
Images Using a Convolutional Neural Network. Inter-
national Journal of Neural Systems, 32(01):2150042.
Khan, S., Naseer, M., Hayat, M., Zamir, S. W., Khan, F. S.,
and Shah, M. (2021). Transformers in vision: A sur-
vey. ACM Computing Surveys (CSUR).
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
Loesdau, M., Chabrier, S., and Gabillon, A. (2017). Chro-
matic Indices in the Normalized rgb Color Space.
In 2017 International Conference on Digital Image
Computing: Techniques and Applications (DICTA),
pages 1–8.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net:
Convolutional networks for biomedical image segmentation.
In International Conference on Medical Image Computing
and Computer-Assisted Intervention, pages 234–241. Springer.
Woo, S., Park, J., Lee, J.-Y., and Kweon, I. S. (2018).
CBAM: Convolutional Block Attention Module. In
Proceedings of the European Conference on Com-
puter Vision (ECCV), pages 3–19.
Xu, R. and Wunsch, D. (2005). Survey of clustering al-
gorithms. IEEE Transactions on Neural Networks,
16(3):645–678.
Zhao, H., Morgenroth, J., Pearse, G., and Schindler, J.
(2023). A systematic review of individual tree crown
detection and delineation with convolutional neural
networks (CNN). Current Forestry Reports, pages 1–22.
Zhou, Y., Wang, L., Jiang, K., Xue, L., An, F., Chen, B.,
and Yun, T. (2020). Individual tree crown segmenta-
tion based on aerial image using superpixel and topo-
logical features. Journal of Applied Remote Sensing,
14(2):022210.
VISAPP 2025 - 20th International Conference on Computer Vision Theory and Applications