Investigating the Impact of Code Generation Tools (ChatGPT & Github CoPilot) on Programming Education

Faisal Nizamudeen, Lorenzo Gatti, Nacir Bouali and Faizan Ahmed

Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, The Netherlands

Keywords:

ChatGPT, Artificial Intelligence, Code Generation, Computer Science Education.

Abstract:

In our rapidly evolving technological landscape, AI tools have gained substantial power and integration across various domains. Through interviews and surveys conducted at a university in the Netherlands, we investigated students’ perceptions of AI tools. Our results show that students generally have a positive attitude towards the adoption of AI technologies and feel that it enhances their learning experience. Furthermore, this research project examines the capabilities of AI-powered tools, namely GitHub Copilot and ChatGPT, in solving a variety of university-level assignments. By empirically evaluating the capabilities of these AI tools and offering insights to educators, this research project aims to assist them in designing programming exercises that encompass essential learning processes while accounting for students’ utilization of AI tools. The findings indicate that a majority of the exercises currently utilized by the examined university could be solved partially or entirely with the aid of these tools. This project highlights the importance of educators understanding the capabilities of AI tools, as well as students’ attitudes towards them, to effectively adapt their teaching methods and promote essential learning goals.

1 INTRODUCTION

The past year has seen a signiﬁcant surge in AI in-

novation, with AI tools becoming increasingly inte-

grated into people’s daily lives worldwide. This trend

is especially noticeable in the ﬁeld of programming.

Code generation tools are models that can auto-

matically generate source code or actual pieces of

code based on input speciﬁcations, templates, or mod-

els (Herrington, 2003). They are accessible via plu-

gins in integrated development environments (IDEs)

and free-access websites that provide interaction with

such generators. With their dramatic improvement

over the past year, it is believed that they have the po-

tential to revolutionize the coding process and make

programming more productive and accessible (Li

et al., 2022).

AI Chatbots and AI Pair Programmers are the

most commonly used tools for code generation.

Among these tools, ChatGPT (an AI Chatbot) and

GitHub Copilot (an AI Pair Programmer) stand out

as widely used applications. These technologies have

notably diminished the reliance on platforms like

Stack Overﬂow as these tools can offer personalized

responses and better contextual understanding. As

AI tools continue to advance, the importance of such

sites will likely continue to diminish.

ChatGPT, created by OpenAI, is an AI chat-

bot that uses OpenAI’s GPT model to simulate

human-like responses when interacting with users.

The model was trained using Reinforcement Learn-

ing from Human Feedback (RLHF), and a reward

model was established based on conversations be-

tween trainers and the chatbot. While the potential

applications of ChatGPT are virtually limitless, its

implications for programmers are particularly signiﬁ-

cant. Within the programming context, programmers

can rely on ChatGPT to generate code snippets, boil-

erplate code, stub implementations, and much more.

It can also assist with debugging by providing relevant

prompts and code snippets.

GitHub Copilot, developed by GitHub and Ope-

nAI, is an AI-powered code completion tool that sug-

gests code snippets and auto-completes code while

developers write. It utilizes OpenAI’s GPT (Gener-

ative Pre-trained Transformer) model to analyze the

code being written, along with contextual information

such as function names, comments, and documenta-

tion, to provide helpful suggestions. The tool aims

to enhance coding speed and efﬁciency by reducing

the need for manual typing. It has been trained on a

vast amount of code from public GitHub repositories.

Students who apply for a GitHub Education Pack can access GitHub Copilot for free.

Nizamudeen, F., Gatti, L., Bouali, N. and Ahmed, F.

Investigating the Impact of Code Generation Tools (ChatGPT & Github CoPilot) on Programming Education.

DOI: 10.5220/0012628000003693

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 16th International Conference on Computer Supported Education (CSEDU 2024) - Volume 2, pages 221-229

ISBN: 978-989-758-697-2; ISSN: 2184-5026

ChatGPT and GitHub Copilot are the primary technologies that we focus on throughout this paper. We examine the impact that these code-generation tools can have on education from the perspective of both students and educators. We analyze this impact by answering two main questions: (a) What do novice programmers say about the influence of code generation tools on their engagement and motivation in the programming learning process? (b) To what extent can AI tools assist students when completing programming assignments, and how can teachers design exercises that still maintain relevant learning objectives?

The paper is organized as follows: Section 2 will

provide the necessary background for this research.

Section 3 describes the methodology adopted to con-

duct this research. In Section 4 we summarise our

surveys and interviews. Section 5 is dedicated to ex-

ploring the capabilities of these tools. In Section 6 we

provide the results of our study. We conclude this pa-

per in Sections 7 & 8 touching upon the limitations

of this study and providing concluding thoughts.

2 EXISTING WORK

In this section, we briefly describe some of the existing work in this domain.

2.1 Impact on Educators/Students

Several studies have emphasized the potential of AI-

generated code technologies to support teachers in au-

tomating exercise generation and providing explana-

tions of the code.

(Sarsa et al., 2022) and (Marwan et al., 2019) have

speciﬁcally highlighted the beneﬁts of these tools

in assisting educators in creating a wide range of

programming exercises and offering detailed expla-

nations, thereby enhancing student learning experi-

ences. Similarly (Becker et al., 2022) strengthens the

idea of using these tools as an aid for effective teach-

ing. They also raise concerns about the readiness of

teachers to handle the signiﬁcant inﬂuence on educa-

tional practices. The researchers therefore emphasize

the need for an urgent review and modiﬁcation of in-

structional approaches and traditional practices in re-

sponse to the advancements in code generation tech-

nology.

From the perspective of students or novice pro-

grammers, code-generating tools can aid in solving

programming tasks (Wermelinger, 2023), resolving

bugs (Surameery and Shakor, 2023), assisting with

software engineering projects (Khmelevsky et al.,

2012), generating exercises with explanations and il-

lustrative examples of programming constructs and

algorithmic patterns, or mapping problems to solu-

tions (Becker et al., 2022).

Some studies identify the beneﬁts of using code-

generating technologies for students as improved per-

formance and learning. The works of (Marwan et al.,

2019) and (Ouyang et al., 2022) report increased pro-

ductivity, consistency in code quality, and reduced er-

rors when using those tools. A recent review of empir-

ical studies reports an improvement in academic per-

formance, online engagement, and participation when

introducing AI applications in the educational pro-

cess (Ouyang et al., 2022). Additionally, these tools

can assist students in generating exemplar solutions

for programming exercises and code reviews of solu-

tions. Researchers point out the variety of solutions

these tools can propose, allowing educators to intro-

duce students to the diversity of ways that a problem

can be solved.

(Dwivedi et al., 2023) in their paper delved into

how some choose to label these AI systems as ’high-

tech plagiarism’ tools while also acknowledging the

difﬁculty in detecting their use. Despite this, they be-

lieve that we should embrace the use of these systems

as they have the potential to signiﬁcantly beneﬁt and

enhance the learning experiences of the students.

On the other hand, identified drawbacks of using code generation tools include the possibility that students fail to develop an understanding of fundamental programming concepts. Several studies discuss the issues of bias, bad habits, and over-reliance. (Chen et al., 2021) and (Becker et al., 2022) raise concerns about the suitability of these tools for novices, given that public code is often contributed by professionals and may not reflect the expected quality for beginners and the desires of educators.

2.2 Capabilities of AI Tools

(Nguyen and Nadi, 2022) have evaluated the inﬂu-

ence of programming languages on the accuracy of

GitHub Copilot’s code suggestions. For instance, in

their study using GitHub Copilot on LeetCode prob-

lems, they found that the tool correctly solved 57% of

the problems in Java, while only managing to solve

27% of the problems in JavaScript.

(Yetistiren et al., 2022) carried out an interest-

ing study that revealed the signiﬁcant impact of input

parameters on the validity of the code generated by

GitHub Copilot. Their ﬁndings indicated that the va-

lidity of the generated code dropped from 79% to 27%

when using dummy or sub-optimal functions and pa-

rameter names. (Dakhel et al., 2022) shared a similar

viewpoint, concluding that the quality of the gener-

ated code depends greatly on the clarity and depth of

the provided prompt. They also argued that GitHub

Copilot can be a valuable tool for expert programmers

but may present challenges for junior programmers

who are still learning coding conventions and a pro-

gramming language.
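To make the naming effect concrete, the following Python sketch shows the same task written twice: once with dummy identifiers and once with a descriptive signature, type hints, and a docstring. It is our own illustration, not an example taken from either study, and both function names are hypothetical. The identifiers and comments are part of the context a completion tool conditions on, so the second form states the intent that the first one hides.

```python
def f(a, b):
    # Dummy names: nothing here says what a, b, or the result mean.
    return a.lower().split().count(b.lower())


def count_word_occurrences(sentence: str, word: str) -> int:
    """Return how many times `word` occurs in `sentence`, ignoring case."""
    # Descriptive name, type hints, and docstring make the intent explicit.
    return sentence.lower().split().count(word.lower())
```

Both functions behave identically; only the amount of information available to a reader, or to a code generation tool, differs.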

(Al Madi, 2022) also conducted research that fo-

cused on metrics related to the code generated by

GitHub Copilot. The results indicated that the com-

plexity and readability of code generated by GitHub

Copilot were comparable to code written entirely by

a human programmer.

(Wermelinger, 2023) argues that using GitHub

Copilot to solve programming exercises can be a frus-

trating trial-and-error process. They suggested that

while GitHub Copilot can provide an initial helpful at-

tempt to solve a problem, students still require a solid

understanding of a language’s semantics to modify

GitHub Copilot’s sometimes incorrect suggestions.

On the other hand, ChatGPT offers a distinct ap-

proach to assisting students. ChatGPT can assist with

a variety of tasks related to programming such as

debugging, code optimization, code completion, er-

ror ﬁxing, prediction, document generation, chatbot

development, text-to-code generation, and technical

query answering, as mentioned by (Biswas, 2023).

3 METHODOLOGY

The first research question was answered through surveys and interviews with first-year Computer Science students. These methods were selected to gather firsthand experiences of novice programmers using code generation tools during their studies.

We received 39 survey responses and randomly chose 5 other first-year students as interviewees. No prior checks, such as verifying their familiarity with these tools, were done before selecting candidates. The questions asked the students about their opinions and usage of these tools over the entire academic year. The interviews were conducted individually and in a closed setting.

Quantitative data from the survey was summa-

rized using descriptive statistical analysis. Qualitative

data from the interviews was analyzed using thematic

analysis to identify common themes and patterns. Fi-

nally, we compared the gathered information to ad-

dress the main research question, including a compar-

ison of research ﬁndings from the literature overview

and the outcomes of the conducted study.

The second research question was answered by carrying out experiments over a set of assignments currently used by a technical university in the Netherlands. Experiments were conducted using the version of GitHub Copilot that was the latest at the time of the study and the regular (free) version of ChatGPT, which utilizes the GPT-3.5 model.

A diverse set of programming assignments

was identiﬁed to ensure representative results.

These assignments were selected from various ﬁrst

and second-year modules and encompass various

paradigms of coding such as imperative, functional,

and declarative styles.

Depending on the availability and ease of providing context, either ChatGPT or Copilot was utilized to generate code. On occasion, we used both tools to solve different parts of a large exercise. If the generated code contained errors or the program did not function as expected, ChatGPT was queried to detect and resolve the issues.

4 STUDENT PERSPECTIVE ON TOOL USAGE

In this section, we discuss how first-year Computer Science students are using code-generation tools and their subjective experience of using them. This was investigated with both a survey and interviews, with the goal of understanding the students’ viewpoint. We focus in particular on the positive effects of these tools, and on the students’ awareness of their risks in an educational setting.

4.1 Survey

4.1.1 General Familiarity

All 39 respondents demonstrated familiarity with

ChatGPT, reﬂecting its recent popularity. ChatGPT

was reported as the most used tool, with 77% men-

tioning practical usage.

In the realm of code generation tools, GitHub Copilot (69.2%), OpenAI Codex (30.8%), and Tabnine (15.4%) were the best known. DeepMind AlphaCode (7.7%), Microsoft CodeBERT, and Google PaLM (5.1% each) were less recognized and not used by survey participants.

Only 6 out of 39 respondents (15.4%) had tried code generation tools before starting their university Computer Science course. Among these six, two were quite familiar with code generation tools, only one always used them in coding practices, and half marked ChatGPT as the most useful tool.

Due to space constraints, the list of survey and interview questions is not included in this paper, but is available at https://drive.google.com/file/d/1nu6vdSpnKnwdRhVaI-3Z1YySXIoeefH7/view?usp=sharing.

4.1.2 First Year Computer Science Course

Out of all 39 responses, 21 (53.8%) have used code

generation tools during their ﬁrst year of the Com-

puter Science course. Among these respondents,

ChatGPT (GPT-4) was the most popular, with all re-

spondents using it during their ﬁrst academic year.

GitHub Copilot is second on the list, used by 7 people

(33.3%), followed by Tabnine (23.8%).
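The percentages in this section are simple shares of a count in a group size. As a minimal sketch of that descriptive step, the helper below (a hypothetical name, not part of the study's tooling) recomputes three figures from counts that are stated explicitly in the text.

```python
def percentage(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


# Counts taken from the survey description above.
print(percentage(6, 39))   # tried tools before university -> 15.4
print(percentage(21, 39))  # used tools during the first year -> 53.8
print(percentage(7, 21))   # Copilot users among those 21 -> 33.3
```

Note that the denominator changes between questions: some shares refer to all 39 respondents, others to the 21 who used code generation tools.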

Concerning code generation tools specifically used within students’ coursework, ChatGPT still leads with 66.7% of students utilizing it for study-related tasks. The second most popular response is using none of the tools (28.6%). Copilot takes third place with 23.8%, while Codex and Tabnine each received only 1 vote (4.8%). Among respondents, 9.5% have used code generation tools frequently, and 61.9% used them at least a few times.

The next survey question revealed the specific programming tasks students used these tools for. Most votes went to project work and exam preparation (57.1% each), followed by assignments (47.6%), labs (33.3%), and group work (23.8%). Three people (14.3% of these respondents) reported not using any tool for any type of coursework.

A majority of students (76.2%) who completed

the survey said that the use of code generation tools

in their practices makes learning programming fun-

damentals and concepts less difﬁcult. A signiﬁcant

number of people (47.6%) feel that such tools also

increase their ability and skills in problem-solving for

programming tasks introduced by their Computer Sci-

ence program. A similar percentage of people feel

that code-generation tools make learning more enjoy-

able.

4.1.3 Motivation, Engagement, Conﬁdence

Regarding the increase in motivation with the use of code-generation tools, the answers are mixed: one-third of respondents agree that the usage of code-generation tools increases motivation towards learning, another third remain neutral, and the last third either completely (9.5%) or partially (23.8%) disagree. Results from the question about the increase in student engagement were slightly different: the largest group of respondents (38.1%) indicated agreement with the statement, 4.8% strongly agreed, 33.3% remained neutral, and the remaining 28.6% either completely or partially disagreed.

The survey questions about the ability to write code showed that respondents feel less confident when using code-generation tools. For students who are already confident in their programming abilities, involving code generation tools does not make a significant difference. However, for those who generally lack confidence in programming, confidence drops further when code generation is involved: 4.8% of respondents disagreed in the first case (writing code in general), increasing to 19% in the second case (writing code with code generation tools). In the end, only 26.8% of respondents feel that code generation tools positively affect their confidence in programming abilities.

4.2 Interviews

Interviews were conducted with five randomly selected first-year Computer Science students, as mentioned earlier. The responses revealed that all students are aware of code-generation tools and use them frequently for various learning purposes. Our exploration of their initial expectations and impressions of such technologies showed that, in most cases, expectations were not high, but the results were quite impressive. Students had not anticipated that code generation tools would be as intelligent or as helpful as they turned out to be.

Regarding their experience with code generation

tools in the ﬁrst year of the Computer Science course,

all respondents stated that these tools have been ex-

cellent learning assistants and have been extensively

used for coding assignments, projects, exam prepara-

tion, understanding the theoretical part of the course,

and personal coding practices. Students reported that

these tools are of great help when needing to create

easy methods, quick functionality or logic, build com-

ponents, or ﬁnd minor bugs.

The students were also queried about any beneﬁts

or pitfalls related to their learning process when using

code-generation tools. Among the advantages men-

tioned are reduced time spent on problems or bugs,

as code generation tools can detect issues in the code,

and improved overall learning as they effectively ex-

plain how certain things in the code work. Another

positive aspect is that these tools make it easier to

understand certain topics, parts of code, methods, or

classes. Students mentioned that such tools provide a

fresh perspective towards approaching problems. In

general, students consider code-generation tools to be

a great supplement to learning.

Regarding disadvantages, the main one mentioned

by students is the possibility of over-reliance on

the tool. Some concerns expressed by students in-

clude that it “could be bad for lazy people” or “can

make people lazy.” However, all interviewees who ex-

pressed concerns acknowledged that this also depends

on the individual and their intentions for using such

tools.

In terms of impact on the student’s ability to de-

velop programming proﬁciency, all respondents assert

that code-generation tools are particularly helpful, es-

pecially for basic tasks. According to students, they

increase efﬁciency, reduce debugging and problem-

solving time, and aid in understanding concepts.

Some respondents mention that while code generation

tools “save” the brain from thinking, overuse can lead

to a slowed development of problem-solving skills

and understanding of topics.

Regarding the further integration of code gener-

ation tools into the educational realm, students’ re-

sponses suggest that such technologies should be in-

troduced. Additionally, not only should the function-

ality of these tools be understood, but also how to use

them most effectively should be studied. “It would

not be smart to ban [code generation tools]; for many,

it is a helping tool, not an answer tool,” as one re-

spondent states. Overall, they can bring numerous

advantages, including the improvement of students’

engagement in the learning process. This can occur

“if all students will understand the real goal of [code

generation tools] and use them in a smart way.”

5 CAPABILITIES OF AI TOOLS

This section provides a brief analysis of how ChatGPT and Copilot perform against the coding assignments of our first and second-year courses. We explore their capabilities, strengths, and limitations in handling various coding and querying tasks, ranging from basic syntax and logic to more complex algorithmic problems. This assessment aims to shed light on the practical utility of these tools in an educational setting, offering insights into their potential role in shaping the learning curve for students.

5.1 Assembly/Low-Level Coding

These were practical exercises closely related to computer hardware. Specifically, the students worked with an Arduino and wrote assembly code for it. For certain exercises, they were also required to convert these instructions into hexadecimal format and supply them accordingly.

Given that the problem solutions were generally

small in size, ChatGPT proved to be highly useful. It

competently solved the majority of the exercises pre-

sented. With knowledge of the AVR instruction set,

ChatGPT was not only capable of writing the com-

mands in assembly language but also correctly con-

verting them to hexadecimal format, adhering to the

format speciﬁed in the AVR instruction set. Addi-

tionally, it could extend speciﬁc code snippets already

provided to the students by adding the requested func-

tionality. ChatGPT could infer the appropriate regis-

ters to be used based on the sample code provided.

It possessed knowledge of all essential assembly instructions required for this practical, such as “rjmp”, “brne”, “nop”, and “ldi”. Throughout these exercises, ChatGPT developed code that could:

• Turn the LED on the Arduino on and off.

• Send the contents of a register to the computer via the PuTTY interface.

• Flash a Morse code pattern for “SOS” (this was the Weekly Assignment).

However, it has certain limitations when it comes

to speciﬁc hardware components. For instance, while

it can write code that utilizes the LED integrated into

the Arduino board, it produces incorrect output when

working with an externally connected buzzer.
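To illustrate what converting instructions into hexadecimal involves, the Python sketch below encodes the four instructions named above into 16-bit opcode words, following the bit layouts in the AVR instruction set. The function names are our own, and the exact hexadecimal layout expected in the practical is not described in this paper, so the sketch simply prints one opcode word per instruction.

```python
def encode_nop() -> int:
    # NOP: 0000 0000 0000 0000
    return 0x0000


def encode_ldi(register: int, value: int) -> int:
    # LDI Rd, K: 1110 KKKK dddd KKKK, only valid for r16..r31 and an 8-bit K.
    if not 16 <= register <= 31:
        raise ValueError("LDI only accepts r16..r31")
    if not 0 <= value <= 0xFF:
        raise ValueError("immediate must fit in 8 bits")
    return 0xE000 | ((value & 0xF0) << 4) | ((register - 16) << 4) | (value & 0x0F)


def encode_rjmp(offset_words: int) -> int:
    # RJMP k: 1100 kkkk kkkk kkkk, k is a signed 12-bit word offset from PC + 1.
    if not -2048 <= offset_words <= 2047:
        raise ValueError("offset must fit in 12 signed bits")
    return 0xC000 | (offset_words & 0x0FFF)


def encode_brne(offset_words: int) -> int:
    # BRNE k: 1111 01kk kkkk k001, k is a signed 7-bit word offset from PC + 1.
    if not -64 <= offset_words <= 63:
        raise ValueError("offset must fit in 7 signed bits")
    return 0xF401 | ((offset_words & 0x7F) << 3)


print(f"{encode_ldi(16, 0xFF):04X}")  # ldi r16, 0xFF -> EF0F
print(f"{encode_brne(-2):04X}")       # brne two words back -> F7F1
print(f"{encode_rjmp(-1):04X}")       # rjmp to itself -> CFFF
print(f"{encode_nop():04X}")          # nop -> 0000
```

Offsets are counted in 16-bit words relative to the instruction after the branch, which is why a jump to itself uses an offset of -1.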

5.2 Algorithms

These exercises were dedicated to exploring tradi-

tional imperative programming. Speciﬁcally, the stu-

dents delved into the Python programming language

and implemented widely used algorithms such as Lin-

ear Search, Bubble Sort, and Merge Sort. Throughout

this process, they gained familiarity with fundamen-

tal concepts like variables, functions, and ﬁle reading,

as well as slightly more advanced concepts such as

recursion.

Given the simplicity of these exercises, ChatGPT and Copilot proved to be valuable resources capable of solving a majority of the exercises. The provided manual often included pseudocode to guide the students in understanding the expected algorithm structure, and supplying this pseudocode to ChatGPT resulted in algorithms that adhered closely to the desired structure. One could also generate these algorithms using Copilot within an integrated development environment (IDE), although the implementation may exhibit slight variations.
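For reference, the three algorithms named in this section fit in a few lines of Python each. The sketch below is a plain textbook version written for this illustration; it is neither the course's reference solution nor output from ChatGPT or Copilot.

```python
def linear_search(items, target):
    """Return the index of the first occurrence of target, or -1 if absent."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


def bubble_sort(items):
    """Return a sorted copy by repeatedly swapping adjacent out-of-order pairs."""
    result = list(items)
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:  # no swaps in a full pass: already sorted
            break
    return result


def merge_sort(items):
    """Return a sorted copy by recursively sorting and merging two halves."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    left = merge_sort(items[:middle])
    right = merge_sort(items[middle:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged
```

Merge Sort is the only one of the three that relies on recursion, the more advanced concept mentioned above.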

5.3 Databases and SQL

These exercises focused on teaching students about

Database Management Systems. The students ex-

plored key theoretical concepts related to databases,

including foreign keys, information querying, and

creating and populating databases with data.

One type of exercise involves retrieving data from the database using simple SQL queries. For this task, ChatGPT proves to be the most helpful tool. Copilot does not perform as well since it lacks the context of the database structure, and providing it with this information is not straightforward. Before asking ChatGPT to write SQL queries, it is necessary to provide the database structure. Where ID fields have the same name across multiple tables, ChatGPT can detect foreign key relationships directly, eliminating the need for explicit specification in such cases. ChatGPT was able to generate queries for all exercises. These exercises often involved joining multiple tables, extracting specific fields, and occasionally utilizing aggregate functions and the GROUP BY clause.
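A query of the kind described here can be sketched with Python's built-in sqlite3 module. The schema, table names, and data below are hypothetical and were invented for this illustration; the paper does not state which database system or schema the course uses. The shared column name student_id is what lets a reader, or a tool, infer the foreign key relationship.

```python
import sqlite3


def courses_per_student():
    """Count enrolments per student using a JOIN and GROUP BY."""
    connection = sqlite3.connect(":memory:")
    # Hypothetical two-table schema linked by student_id.
    connection.executescript("""
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE enrolment (
            student_id INTEGER NOT NULL REFERENCES student(student_id),
            course TEXT NOT NULL
        );
        INSERT INTO student VALUES (1, 'Ada'), (2, 'Linus');
        INSERT INTO enrolment VALUES
            (1, 'Databases'), (1, 'Algorithms'), (2, 'Databases');
    """)
    rows = connection.execute("""
        SELECT s.name, COUNT(*) AS course_count
        FROM student AS s
        JOIN enrolment AS e ON e.student_id = s.student_id
        GROUP BY s.student_id, s.name
        ORDER BY s.name
    """).fetchall()
    connection.close()
    return rows


print(courses_per_student())  # [('Ada', 2), ('Linus', 1)]
```

Supplying the two CREATE TABLE statements is the equivalent of providing the database structure to ChatGPT before asking for a query.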

5.4 Functional Programming

The focus of these exercises was on the paradigm of functional programming, with an emphasis on working with Haskell and GHCi, the interactive environment of the GHC compiler. The lab exercises consisted of small, self-contained implementations.

ChatGPT proved to be a useful tool for solving these problems, successfully tackling the majority of exercises with minimal issues. It possesses knowledge of more advanced Haskell features, such as higher-order functions, guards, and recursion. Furthermore, it can utilize previously defined functions from earlier exercises, taking the context into account. ChatGPT enforces appropriate typing for each function and is familiar with QuickCheck. Some examples of exercises it was able to solve include:

• Functions to calculate the extreme values of poly-

nomials.

• Function to determine whether a character is low-

ercase or uppercase.

• Deﬁning QuickCheck properties to ensure the im-

plemented functions work as expected.

• Function to sum the values in a list.

• Function to compute the sum of an arithmetic se-

quence.

• Function to count occurrences of a phrase in a sen-

tence.

• Function to find all divisors of certain numbers and a function to check if a given number is prime.
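The last exercise in the list, combined with the idea of property checks, can be sketched as follows. The course used Haskell and QuickCheck; this version is a Python analogue written for illustration only, with the random module standing in for QuickCheck's input generation, and all function names are our own.

```python
import random


def divisors(n: int) -> list:
    """All positive divisors of n (for n >= 1), in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def is_prime(n: int) -> bool:
    """A number is prime exactly when its only divisors are 1 and itself."""
    return n >= 2 and divisors(n) == [1, n]


def check_divisor_properties(trials: int = 200, seed: int = 0) -> bool:
    """Test two properties of divisors() on randomly chosen inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 500)
        found = divisors(n)
        # Property 1: every reported divisor really divides n.
        if any(n % d != 0 for d in found):
            return False
        # Property 2: divisors come in pairs, so n // d is also a divisor.
        if any((n // d) not in found for d in found):
            return False
    return True
```

As with QuickCheck, the properties describe what must hold for every input rather than listing expected outputs for a few hand-picked cases.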

Our results are consistent with the ﬁndings of

(Geng et al., 2023), who conducted extensive re-

search on how ChatGPT performs with exercises in

the programming language OCaml. They organized

various groups of functional programming exercises,

each covering a different aspect of the language. For

more than half of these groups, ChatGPT was able

to solve 100% of the exercises, while for the remain-

ing groups, it maintained a high accuracy rate of over

90%.

The ﬁnal assignment for this topic involved

an interesting and unique case study that required

an understanding of additional mathematical theory

that was to be implemented via Haskell functions.

The necessary theory and questions were provided

through an external PDF ﬁle, and students were re-

quired to ﬁll in their answers in a ’.hs’ (Haskell) ﬁle.

This approach effectively countered the reliance on

automated tools. Solving these exercises required a

deep comprehension of the theory presented in the

documents, which included tables, graphs, and im-

ages. It was challenging to provide this information to

the AI models, making it difﬁcult to obtain appropri-

ate assistance from ChatGPT. While ChatGPT could

provide help with deﬁning simpler functions, the ma-

jority of the work and the thinking process had to be

done by the learner.

5.5 Advanced Imperative Programming

Advanced imperative programming was taught to the students over the span of 10 weeks in a software-development-focused module. Java was utilized in this module, and students explored various topics, including but not limited to:

• Core imperative concepts: variables, data types,

arrays, methods, and exceptions.

• Object-oriented programming: classes, inheri-

tance, and composition.

• Socket programming concepts: Server Sockets,

Sockets, Readers, and Writers.

• Implementation of applications using the Model-

View-Controller (MVC) pattern.

• Multithreading concepts: synchronization, race

conditions, and locks.

• Software documentation using Javadoc.

• Planning and implementation of unit and integra-

tion tests.

In this traditional coding environment, the impact of Copilot becomes evident. With a codebase that expanded every week, Copilot leveraged context inferred from other files in the workspace to gain a comprehensive understanding of the code written thus far, enabling it to provide appropriate suggestions. Supplying all this data to ChatGPT would be exceedingly challenging. Copilot offers several ways to assist students during these modules:

Code Suggestions. GitHub Copilot offers intelligent code suggestions based on the context of Java code. It can infer information from different classes in the workspace, providing relevant code snippets, method calls, variable declarations, and more. Often, it suggests solutions before they are even considered by the coder. It encompasses knowledge of the concepts taught in this module and provides accurate and useful suggestions. As reported in (Yetistiren et al., 2022; Dakhel et al., 2022), our research also showed that using well-chosen method signatures and comments greatly increases the likelihood that the code generated is what the user needs.

Javadoc Generation. Students often needed to provide detailed documentation, including Javadoc comments, for their code. Copilot utilizes its understanding of the codebase to automatically generate reasonably accurate and meaningful Javadoc comments. It considers method signatures and function implementations to generate relevant Javadoc in most cases.

External Libraries. In addition to Javadoc generation, Copilot can aid in other documentation-related tasks. It suggests relevant code examples or explanations for specific Java concepts, APIs, or design patterns, as well as usage of popular external libraries.

Testing Automation. Copilot’s code analysis capa-

bilities allow it to generate integration and unit tests.

It proposes test cases, test methods, and testing frame-

works tailored to the Java codebase.

Error Handling and Exception Management.

Copilot also helps identify potential error scenarios

and suggests appropriate exception-handling mecha-

nisms for the coder.

However, there are still potential use cases for

ChatGPT. When tackling isolated problems, it can be

valuable for debugging speciﬁc code sections. Ad-

ditionally, when facing difﬁculties in approaching a

problem, seeking guidance from ChatGPT for a start-

ing point can provide initial coding assistance, with

Copilot taking over to further aid the process.

Previous research by Ouh et al. (2023) has explored

the utilization of ChatGPT (without Copilot)

for solving programming exercises in Java. Their

experiments primarily focused on smaller, indepen-

dent exercises similar to those found on platforms like

LeetCode. They did not extensively cover the more

complex topics included in this particular course.

Nevertheless, their ﬁndings indicated that a major-

ity of the exercises in their Java programming course

could be successfully solved using ChatGPT.

5.6 Data Science with Jupyter Notebooks

As part of an AI course, students were expected

to implement several data science exercises through

Jupyter Notebook ﬁles. Some of the goals of these

assignments include:

• Building a Neural Network

• Constructing a Decision Tree

• Building a classiﬁer considering certain features

Many of these exercises intertwine relevant math-

ematical and AI theory, making it slightly cumber-

some to directly convey all the theory to ChatGPT.

However, throughout these practical exercises, several

programming tasks can still be completed using

AI tools. Simple mathematical concepts that amount

to short function definitions, such as Entropy,

Precision, and Information Gain, can be generated

using ChatGPT. Additionally, most of the tasks

related to data manipulation and the use of external

libraries such as NumPy, Pandas, and scikit-learn can

also be automated using Copilot. Some tasks that Copilot

helped with during these assignments include split-

ting data into training and test sets, deﬁning a func-

tion to measure the accuracy of models, initializing a

Decision Tree Classiﬁer with appropriate parameters,

and reading data from a CSV ﬁle.
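To give a sense of how small these helper functions are, the sketch below implements Entropy, Information Gain, Precision, accuracy, and a train/test split using only the Python standard library. It is an illustration under our own assumptions, not the course's reference solution or actual tool output: all function names, parameters, and sample values are hypothetical, and in practice scikit-learn offers equivalent utilities.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, feature_values):
    """Entropy reduction obtained by splitting `labels` on one feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the groups created by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    predicted_positive = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_positive:
        return 0.0
    hits = sum(1 for t in predicted_positive if t == positive)
    return hits / len(predicted_positive)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true:
        return 0.0
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy data: the feature separates the two classes perfectly.
labels = ["yes", "yes", "no", "no"]
feature = ["sunny", "sunny", "rainy", "rainy"]
gain = information_gain(labels, feature)
train, test = split_train_test(range(8), test_fraction=0.25, seed=0)
```

Each function is a direct transcription of its textbook definition, which is consistent with the observation that such tasks are easy to delegate to a code generation tool.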

6 RESULTS AND DISCUSSIONS

6.1 Viewpoint of Students

Students commonly use these tools for assistance

when stuck on a problem, for verifying answers, for

debugging, explaining, or improving code, and for

explanations of fundamentals. All respondents agreed

that these tools save time while programming.

Regarding disadvantages, many respondents cau-

tion against over-reliance on these tools. Moreover,

they mention that generated code is often hard to un-

derstand or lacks a clear goal, making its further us-

age challenging. Some students also report reduced

interest in becoming proﬁcient in fundamentals. Stu-

dents suggest using code generation wisely, either for

time-saving on trivial tasks or for personal code and

knowledge improvement.


6.2 Capabilities of AI Tools

When it comes to the capabilities of these tools, the

results indicate that a majority of the exercises can be

partially or completely solved with the assistance of

ChatGPT and Copilot. These tools are particularly

helpful for exercises related to:

• Assembly programming

• Haskell programming

• Web programming (in Java)

• SQL queries and data manipulation

• Algorithms in various languages

• Data science using Jupyter Notebooks

When the tools cannot provide solutions, they can

still serve as valuable starting points, providing much

of the necessary boilerplate and starting code.

Through this research, it also becomes apparent

that these software tools struggle to provide solutions

when they lack the appropriate context to approach a

question. Exercises that depend on context that is

challenging to supply to these programs can therefore

reduce the tools' effectiveness.

Some strategies to achieve this include:

• Requiring the utilization of custom ﬁles with un-

conventional methods as part of the exercise solu-

tions

• Referencing speciﬁc (unique) content from lec-

tures within the exercises

• Integrating additional theory, such as mathematics

or unique case studies, into the exercises

Furthermore, making it difﬁcult to provide con-

text can also act as a deterrent for students using these

tools. For example, in the SQL exercises that the students

worked on in our tests, foreign key relations did

not have to be specified explicitly, as ChatGPT could

infer them automatically from the identical field

names in different tables. However, if foreign key

attributes had different names, users would need to

specify each foreign key relation individually,

potentially discouraging tool usage, especially when

numerous such relations are involved. This is a

good example of designing exercises with the use of

these tools in mind.
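The effect of field naming described above can be illustrated with a small sketch. It does not model how ChatGPT works internally. It is a deliberately simple name-matching heuristic, with a made-up schema (`student`, `course`, `enrollment`) and a hypothetical helper `infer_foreign_keys`, showing why identical field names make relations easy to infer and renamed attributes do not.

```python
def infer_foreign_keys(schema, primary_keys):
    """Guess foreign keys purely by name: a column is treated as a
    reference when it shares its name with another table's primary key."""
    relations = []
    for table, columns in schema.items():
        for other_table, key in primary_keys.items():
            if other_table != table and key in columns:
                relations.append((table, key, other_table))
    return sorted(relations)


# Hypothetical schema, not taken from the course material.
primary_keys = {"student": "student_id", "course": "course_id"}

matching_names = {
    "student": ["student_id", "name"],
    "course": ["course_id", "title"],
    "enrollment": ["student_id", "course_id", "grade"],
}

# Same tables, but the referencing columns were given different names.
renamed = {
    "student": ["student_id", "name"],
    "course": ["course_id", "title"],
    "enrollment": ["pupil", "module", "grade"],
}

inferred_matching = infer_foreign_keys(matching_names, primary_keys)
inferred_renamed = infer_foreign_keys(renamed, primary_keys)
```

With matching names, both relations of `enrollment` are recovered. With the renamed columns, none are, so each relation would have to be stated explicitly in the prompt.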

7 LIMITATIONS AND FUTURE WORK

The learning experience is multifaceted and could not

be fully studied within the scope of this research. For

the students' perception, we therefore focused on

factors such as student engagement and motivation.

Consequently, the work can offer relevant insights

only in relation to those factors.

Additionally, the potential for bias arises as we

rely on data from surveys and interviews, where par-

ticipants might provide socially desirable or poten-

tially biased answers. Moreover, the analyzed data is

of limited sample size, speciﬁc time frame, and geo-

graphical focus, which may constrain the generaliza-

tion of our ﬁndings. These limitations should be kept

in mind when interpreting the results of this study, and

future research should aim to address them.

This research project also primarily focused on

smaller, bite-sized exercises when determining the ca-

pabilities of these tools. However, many program-

ming assignments often consist of larger projects

spanning the course of a few weeks. Testing these

AI models on larger and more diverse projects could

yield different results and provide additional valuable

insights.

8 CONCLUSIONS

In conclusion, code generation technologies are developing

rapidly, and their prevalence as learning aids

within Computer Science programs is hard to ignore.

They have the potential to bring benefits to

education if used thoughtfully.

From our survey, we see that a majority of stu-

dents already make use of these tools and have a gen-

erally positive impression of their widespread adop-

tion. Most students feel that these tools make it easier

to grasp programming fundamentals as well as im-

prove the overall learning experience.

The analysis conducted in this paper also explored

how these AI tools can be utilized to assist students

in solving exercises. The findings demonstrate

that a significant portion of the simple exercises can

be partially or completely solved with the aid of AI

tools. Additionally, this paper provided insights on

designing exercises that can limit the reliance on AI

tools.

REFERENCES

Al Madi, N. (2022). How Readable is Model-generated

Code? Examining Readability and Visual Inspec-

tion of GitHub Copilot. In Proceedings of the 37th

IEEE/ACM International Conference on Automated

Software Engineering, New York, NY, USA. ACM.

Becker, B. A., Denny, P., Finnie-Ansley, J., Luxton-Reilly,

A., Prather, J., and Santos, E. A. (2022). Programming



Is Hard – Or at Least It Used to Be: Educational Op-

portunities And Challenges of AI Code Generation.

Biswas, S. (2023). Role of ChatGPT in Computer Program-

ming. Mesopotamian Journal of Computer Science.

Chen, M., Tworek, J., Jun, H., Yuan, Q., Pinto, H. P. d. O.,

Kaplan, J., Edwards, H., Burda, Y., Joseph, N., Brock-

man, G., Ray, A., Puri, R., Krueger, G., Petrov, M.,

Khlaaf, H., Sastry, G., Mishkin, P., Chan, B., Gray, S.,

Ryder, N., Pavlov, M., Power, A., Kaiser, L., Bavar-

ian, M., Winter, C., Tillet, P., Such, F. P., Cummings,

D., Plappert, M., Chantzis, F., Barnes, E., Herbert-

Voss, A., Guss, W. H., Nichol, A., Paino, A., Tezak,

N., Tang, J., Babuschkin, I., Balaji, S., Jain, S., Saun-

ders, W., Hesse, C., Carr, A. N., Leike, J., Achiam,

J., Misra, V., Morikawa, E., Radford, A., Knight, M.,

Brundage, M., Murati, M., Mayer, K., Welinder, P.,

McGrew, B., Amodei, D., McCandlish, S., Sutskever,

I., and Zaremba, W. (2021). Evaluating Large Lan-

guage Models Trained on Code.

Dakhel, A. M., Majdinasab, V., Nikanjam, A., Khomh, F.,

Desmarais, M. C., and Jiang, Z. M. (2022). GitHub

Copilot AI pair programmer: Asset or Liability?

Dwivedi, Y. K., Kshetri, N., Hughes, L., Slade, E. L., Je-

yaraj, A., Kar, A. K., Baabdullah, A. M., Koohang, A.,

Raghavan, V., Ahuja, M., Albanna, H., Albashrawi,

M. A., Al-Busaidi, A. S., Balakrishnan, J., Barlette,

Y., Basu, S., Bose, I., Brooks, L., Buhalis, D., Carter,

L., Chowdhury, S., Crick, T., Cunningham, S. W.,

Davies, G. H., Davison, R. M., Dé, R., Dennehy, D.,

Duan, Y., Dubey, R., Dwivedi, R., Edwards, J. S.,

Flavián, C., Gauld, R., Grover, V., Hu, M.-C., Janssen,

M., Jones, P., Junglas, I., Khorana, S., Kraus, S.,

Larsen, K. R., Latreille, P., Laumer, S., Malik, F. T.,

Mardani, A., Mariani, M., Mithas, S., Mogaji, E.,

Nord, J. H., O’Connor, S., Okumus, F., Pagani, M.,

Pandey, N., Papagiannidis, S., Pappas, I. O., Pathak,

N., Pries-Heje, J., Raman, R., Rana, N. P., Rehm,

S.-V., Ribeiro-Navarrete, S., Richter, A., Rowe, F.,

Sarker, S., Stahl, B. C., Tiwari, M. K., van der Aalst,

W., Venkatesh, V., Viglia, G., Wade, M., Walton,

P., Wirtz, J., and Wright, R. (2023). Opinion Pa-

per: “So what if ChatGPT wrote it?” Multidisciplinary

perspectives on opportunities, challenges and impli-

cations of generative conversational AI for research,

practice and policy. International Journal of Informa-

tion Management, 71.

Geng, C., Zhang, Y., Pientka, B., and Si, X. (2023). Can

ChatGPT Pass An Introductory Level Functional Lan-

guage Programming Course?

Herrington, J. (2003). Code Generation in Action.

Khmelevsky, Y., Hains, G., and Li, C. (2012). Automatic

code generation within student’s software engineering

projects. In Proceedings of the Seventeenth Western

Canadian Conference on Computing Education, New

York, NY, USA. ACM.

Li, Y., Choi, D., Chung, J., Kushman, N., Schrittwieser,

J., Leblond, R., Eccles, T., Keeling, J., Gimeno,

F., Dal Lago, A., Hubert, T., Choy, P., de Mas-

son d’Autume, C., Babuschkin, I., Chen, X., Huang,

P.-S., Welbl, J., Gowal, S., Cherepanov, A., Molloy,

J., Mankowitz, D. J., Sutherland Robson, E., Kohli,

P., de Freitas, N., Kavukcuoglu, K., and Vinyals, O.

(2022). Competition-level code generation with Al-

phaCode. Science.

Marwan, S., Jay Williams, J., and Price, T. (2019). An

Evaluation of the Impact of Automated Programming

Hints on Performance and Learning. In Proceedings

of the 2019 ACM Conference on International Com-

puting Education Research, New York, NY, USA.

ACM.

Nguyen, N. and Nadi, S. (2022). An empirical evaluation of

GitHub copilot’s code suggestions. In Proceedings of

the 19th International Conference on Mining Software

Repositories, pages 1–5, New York, NY, USA. ACM.

Ouh, E. L., Gan, B. K. S., Jin Shim, K., and Wlodkowski,

S. (2023). ChatGPT, Can You Generate Solutions for

my Coding Exercises? An Evaluation on its Effective-

ness in an undergraduate Java Programming Course.

In Proceedings of the 2023 Conference on Innovation

and Technology in Computer Science Education V. 1,

New York, NY, USA. ACM.

Ouyang, F., Zheng, L., and Jiao, P. (2022). Artiﬁcial intelli-

gence in online higher education: A systematic review

of empirical research from 2011 to 2020. Education

and Information Technologies, 27(6):7893–7925.

Sarsa, S., Denny, P., Hellas, A., and Leinonen, J. (2022).

Automatic Generation of Programming Exercises and

Code Explanations using Large Language Models.

Surameery, N. M. S. and Shakor, M. Y. (2023). Use Chat

GPT to Solve Programming Bugs. International Journal

of Information Technology and Computer Engineering,

(31):17–22.

Wermelinger, M. (2023). Using GitHub Copilot to Solve

Simple Programming Problems. In Proceedings of the

54th ACM Technical Symposium on Computer Science

Education V. 1, New York, NY, USA. ACM.

Yetistiren, B., Ozsoy, I., and Tuzun, E. (2022). Assessing

the quality of GitHub copilot’s code generation. In

Proceedings of the 18th International Conference on

Predictive Models and Data Analytics in Software En-

gineering, New York, NY, USA. ACM.
