International Conference on Mining Software Repositories
(MSR), pages 712–716.
Dinella, E., Dai, H., Li, Z., Naik, M., Song, L., and
Wang, K. (2020). Hoppity: Learning Graph Transformations
To Detect and Fix Bugs in Programs. Technical report.
Drain, D., Wu, C., Svyatkovskiy, A., and Sundaresan, N.
(2021). Generating bug-fixes using pretrained trans-
formers. MAPS 2021 - Proceedings of the 5th ACM
SIGPLAN International Symposium on Machine Pro-
gramming, co-located with PLDI 2021, pages 1–8.
Elkins, K. and Chun, J. (2020). Can GPT-3 Pass a Writer’s
Turing Test? Journal of Cultural Analytics.
Hasan, M., Mehrab, K. S., Ahmad, W. U., and Shahriyar,
R. (2021). Text2App: A Framework for Creating An-
droid Apps from Text Descriptions.
Just, R., Jalali, D., and Ernst, M. D. (2014). Defects4J: A
database of existing faults to enable controlled test-
ing studies for Java programs. In 2014 International
Symposium on Software Testing and Analysis, ISSTA
2014 - Proceedings, pages 437–440. Association for
Computing Machinery, Inc.
Kechagia, M., Mechtaev, S., Sarro, F., and Harman, M.
(2022). Evaluating automatic program repair capabilities
to repair API misuses. IEEE Transactions on
Software Engineering, 48(7):2658–2679.
Lajkó, M., Csuvik, V., and Vidács, L. (2022). Towards
JavaScript program repair with generative pre-trained
transformer (GPT-2). In 2022 IEEE/ACM International
Workshop on Automated Program Repair
(APR), pages 61–68.
Lajkó, M., Horváth, D., Csuvik, V., and Vidács, L. (2022).
Fine-tuning GPT-2 to patch programs, is it worth it? In
Gervasi, O., Murgante, B., Misra, S., Rocha, A. M.
A. C., and Garau, C., editors, Computational Science
and Its Applications – ICCSA 2022 Workshops, pages
79–91, Cham. Springer International Publishing.
Le Goues, C., Holtschulte, N., Smith, E. K., Brun, Y., De-
vanbu, P., Forrest, S., and Weimer, W. (2015). The
ManyBugs and IntroClass Benchmarks for Automated
Repair of C Programs. IEEE Transactions on Software
Engineering, 41(12):1236–1256.
Lin, D., Koppel, J., Chen, A., and Solar-Lezama, A. (2017).
QuixBugs: A multi-lingual program repair benchmark
set based on the Quixey challenge. In Proceedings
Companion of the 2017 ACM SIGPLAN International
Conference on Systems, Programming, Languages,
and Applications: Software for Humanity, SPLASH
Companion 2017, pages 55–56, New York, NY, USA.
Association for Computing Machinery.
Liu, H., Shen, M., Zhu, J., Niu, N., Li, G., and Zhang,
L. (2020). Deep Learning Based Program Generation
from Requirements Text: Are We There Yet? IEEE
Transactions on Software Engineering, pages 1–1.
Lu, S., Guo, D., Ren, S., Huang, J., Svyatkovskiy, A.,
Blanco, A., Clement, C., Drain, D., Jiang, D., Tang,
D., Li, G., Zhou, L., Shou, L., Zhou, L., Tufano, M.,
Gong, M., Zhou, M., Duan, N., Sundaresan, N., Deng,
S. K., Fu, S., and Liu, S. (2021). CodeXGLUE: A
Machine Learning Benchmark Dataset for Code
Understanding and Generation.
Lutellier, T., Pham, H. V., Pang, L., Li, Y., Wei, M., and
Tan, L. (2020). CoCoNuT: Combining context-aware
neural translation models using ensemble for program
repair. ISSTA 2020 - Proceedings of the 29th ACM
SIGSOFT International Symposium on Software Test-
ing and Analysis, 20:101–114.
Martinez, M., Durieux, T., Sommerard, R., Xuan, J.,
and Monperrus, M. (2017). Automatic repair of
real bugs in Java: a large-scale experiment on the
Defects4J dataset. Empirical Software Engineering,
22(4):1936–1964.
Mastropaolo, A., Scalabrino, S., Cooper, N., Nader Pala-
cio, D., Poshyvanyk, D., Oliveto, R., and Bavota, G.
(2021). Studying the usage of text-to-text transfer
transformer to support code-related tasks. Proceed-
ings - International Conference on Software Engineer-
ing, pages 336–347.
Monperrus, M. (2020). The Living Review on Automated
Program Repair. Technical report.
OpenAI (2023). GPT-4 technical report.
OpenAI ChatGPT (2023a). OpenAI ChatGPT.
https://openai.com/blog/chatgpt/.
OpenAI ChatGPT (2023b). OpenAI ChatGPT app.
https://chat.openai.com/.
Prenner, J. A., Babii, H., and Robbes, R. (2022). Can Ope-
nAI’s Codex Fix Bugs?: An evaluation on QuixBugs.
Proceedings - International Workshop on Automated
Program Repair, APR 2022, pages 69–75.
Saha, R. K., Lyu, Y., Lam, W., Yoshida, H., and Prasad,
M. R. (2018). Bugs.jar: A large-scale, diverse dataset
of real-world Java bugs. Proceedings - International
Conference on Software Engineering, pages 10–13.
Tufano, M., Pantiuchina, J., Watson, C., Bavota, G., and
Poshyvanyk, D. (2019). On learning meaningful code
changes via neural machine translation. In Proceed-
ings of the 41st International Conference on Software
Engineering, ICSE ’19, pages 25–36. IEEE Press.
Weimer, W., Nguyen, T., Le Goues, C., and Forrest, S.
(2009). Automatically finding patches using genetic
programming. In Proceedings of the 31st Interna-
tional Conference on Software Engineering, ICSE
’09, pages 364–374, USA. IEEE Computer Society.
Ye, H., Martinez, M., Luo, X., Zhang, T., and Monperrus,
M. (2022). SelfAPR: Self-supervised Program Repair
with Test Execution Diagnostics.
Yi, L., Wang, S., and Nguyen, T. N. (2020). DLFix:
Context-based code transformation learning for automated
program repair. In Proceedings - International Conference
on Software Engineering, pages 602–614. IEEE
Computer Society.
Zhao, T. Z., Wallace, E., Feng, S., Klein, D., and Singh, S.
(2021). Calibrate Before Use: Improving Few-Shot
Performance of Language Models.