
BERT to align it better with this assembly documentation task. This approach advances the state of the art in automatic assembly-code documentation.
Future work could improve the model's performance by training it on a larger dataset. Another direction is extending the system to cope with compiler-optimization challenges, i.e., generating consistent comments for similar functions that are compiled with different compilers or optimization levels. In addition, the system could be extended to generate multi-sentence summaries instead of single-sentence comments.
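The compiler/optimization-level challenge above can be made concrete with a small sketch (the function and both assembly listings are illustrative, not drawn from our dataset; real gcc/clang output varies by version and target): the same source function compiled at different optimization levels shares almost no instructions, yet a comment generator should still emit the same summary for both.

```python
# Hand-written illustrations of what `int square(int x) { return x * x; }`
# might compile to on x86-64 at -O0 vs -O2 (illustrative only).
asm_O0 = [
    "push rbp",
    "mov rbp, rsp",
    "mov DWORD PTR [rbp-4], edi",   # -O0 spills the argument to the stack
    "mov eax, DWORD PTR [rbp-4]",
    "imul eax, eax",
    "pop rbp",
    "ret",
]
asm_O2 = [
    "mov eax, edi",                 # -O2 keeps everything in registers
    "imul eax, edi",
    "ret",
]

def jaccard(a, b):
    """Fraction of distinct instructions the two listings share."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# The listings implement the same function yet share only `ret`,
# so surface-level similarity alone cannot drive comment generation.
overlap = jaccard(asm_O0, asm_O2)
print(f"instruction overlap: {overlap:.2f}")
```

Even this toy case shows an instruction overlap near 0.11, which is why models trained on one compiler/optimization configuration may generalize poorly to another.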
ACKNOWLEDGMENT
This research is supported by NSERC Alliance Grants
(ALLRP 561035-20), BlackBerry Limited, and De-
fence Research and Development Canada (DRDC).
ICSOFT 2024 - 19th International Conference on Software Technologies