
ICEIS 2025 - 27th International Conference on Enterprise Information Systems