
REFERENCES
Acharya, A., Singh, B., and Onoe, N. (2023). LLM-based
generation of item-description for recommendation
system. In Proceedings of the 17th ACM Conference
on Recommender Systems, pages 1204–1207.
Amari, S.-i. (1993). Backpropagation and stochastic gra-
dient descent method. Neurocomputing, 5(4-5):185–
196.
Angwin, J., Larson, J., Mattu, S., and Kirchner, L. (2016).
Machine bias. ProPublica.
Azamfirei, R., Kudchadkar, S. R., and Fackler, J. (2023).
Large language models and the perils of their halluci-
nations. Critical Care, 27(1):120.
Bertsimas, D. and Tsitsiklis, J. (1993). Simulated anneal-
ing. Statistical Science, 8(1):10–15.
Dubey, A. et al. (2024). The Llama 3 herd of models. arXiv
preprint arXiv:2407.21783.
Eckhouse, L., Lum, K., Conti-Cook, C., and Ciccolini, J.
(2019). Layers of bias: A unified approach for un-
derstanding problems with risk assessment. Criminal
Justice and Behavior, 46(2):185–209.
Gao, L., Madaan, A., Zhou, S., Alon, U., Liu, P., Yang,
Y., Callan, J., and Neubig, G. (2023). PAL: Program-
aided language models. In International Conference
on Machine Learning, pages 10764–10799. PMLR.
Gerszberg, N. R. (2024). Quantifying Gender Bias in Large
Language Models: When ChatGPT Becomes a Hir-
ing Manager. PhD thesis, Massachusetts Institute of
Technology.
Gopalakrishna, S. T. et al. (2019). Automated tool for re-
sume classification using semantic analysis. Interna-
tional Journal of Artificial Intelligence and Applica-
tions (IJAIA), 10(1).
Hao, K. (2019). AI is sending people to jail — and getting
it wrong. MIT Technology Review.
Huang, J., Galal, G., Etemadi, M., and Vaidyanathan, M.
(2022). Evaluation and mitigation of racial bias in
clinical machine learning models: scoping review.
JMIR Medical Informatics, 10(5):e36388.
Kasneci, E. et al. (2023). ChatGPT for good? On op-
portunities and challenges of large language models
for education. Learning and Individual Differences,
103:102274.
Kenton, J. D. et al. (2019). BERT: Pre-training of deep bidi-
rectional transformers for language understanding. In
Proceedings of NAACL-HLT, volume 1, page 2. Min-
neapolis, Minnesota.
Laban, P., Kryściński, W., Agarwal, D., Fabbri, A. R.,
Xiong, C., Joty, S., and Wu, C.-S. (2023). SummEdits:
measuring LLM ability at factual reasoning through
the lens of summarization. In Proceedings of the 2023
Conference on Empirical Methods in Natural Lan-
guage Processing, pages 9662–9676.
Leavy, S. (2018). Gender bias in artificial intelligence: The
need for diversity and gender theory in machine learn-
ing. In Proceedings of the 1st International Workshop
on Gender Equality in Software Engineering, pages
14–16.
Li, H., Gao, H., Wu, C., and Vasarhelyi, M. A. (2024a).
Extracting financial data from unstructured sources:
Leveraging large language models. Journal of Infor-
mation Systems, pages 1–22.
Li, Y., Wen, H., Wang, W., Li, X., Yuan, Y., Liu, G., Liu, J.,
Xu, W., Wang, X., Sun, Y., et al. (2024b). Personal
LLM agents: Insights and survey about the capability,
efficiency and security. arXiv preprint
arXiv:2401.05459.
Lin, J. (1991). Divergence measures based on the Shannon
entropy. IEEE Transactions on Information Theory,
37(1):145–151.
Minaee, S., Mikolov, T., Nikzad, N., Chenaghlu, M.,
Socher, R., Amatriain, X., and Gao, J. (2024).
Large language models: A survey. arXiv preprint
arXiv:2402.06196.
Ouyang, S., Zhang, J. M., Harman, M., and Wang, M.
(2023). LLM is like a box of chocolates: the non-
determinism of ChatGPT in code generation. arXiv
preprint arXiv:2308.02828.
Price, S. and Neamtu, R. (2022). Identifying, evaluating,
and addressing nondeterminism in Mask R-CNNs. In
International Conference on Pattern Recognition and
Artificial Intelligence, pages 3–14. Springer.
Renze, M. and Guven, E. (2024). The effect of sam-
pling temperature on problem solving in large lan-
guage models. arXiv preprint arXiv:2402.05201.
Roumeliotis, K. I. and Tselikas, N. D. (2023). ChatGPT
and Open-AI models: A preliminary review. Future
Internet, 15(6):192.
Saha, D., Tarek, S., Yahyaei, K., Saha, S. K., Zhou, J.,
Tehranipoor, M., and Farahmandi, F. (2024). LLM for
SoC security: A paradigm shift. IEEE Access.
Schwartz, O. (2019). Untold history of AI: Algorithmic bias
was born in the 1980s. IEEE Spectrum.
Singh, V. (2023). Exploring the role of large language
model (LLM)-based chatbots for human resources. PhD
thesis, University of Texas at Austin.
Song, Y., Wang, G., Li, S., and Lin, B. Y. (2024). The
good, the bad, and the greedy: Evaluation of LLMs
should not ignore non-determinism. arXiv preprint
arXiv:2407.10457.
Vaswani, A. et al. (2017). Attention is all you need. Ad-
vances in Neural Information Processing Systems.
Wang, Y., Lipka, N., Rossi, R. A., Siu, A., Zhang, R.,
and Derr, T. (2024). Knowledge graph prompting for
multi-document question answering. In Proceedings
of the AAAI Conference on Artificial Intelligence, vol-
ume 38, pages 19206–19214.
Wilson, K. and Caliskan, A. (2024). Gender, race, and inter-
sectional bias in resume screening via language model
retrieval. In Proceedings of the AAAI/ACM Confer-
ence on AI, Ethics, and Society, volume 7, pages
1578–1590.
ICPRAM 2025 - 14th International Conference on Pattern Recognition Applications and Methods