Introduction

The current discourse on Machine Learning (ML) and Artificial Intelligence (AI) in education revolves around two key themes. First, there is a need to equip young individuals with comprehensive knowledge and understanding of ML and AI, which have become ubiquitous in society and industry. This includes aspects such as social implications, ethical considerations, practical applications, models, and engines [35]. Second, there is a growing interest in exploring the impact of ML and AI tools on teaching and learning, both for educators and their students. This paper focuses on the latter theme, specifically examining the implications of recently developed ML and AI tools for computing education, with a particular emphasis on novice text-based programmers and their ability to learn conceptual knowledge in computer science (CS). The objective is to investigate the potential effects of these tools on students' experiences of learning to program from a cognitive perspective.

Machine Learning refers to a broad range of computer systems that can learn new functions by generalizing from data [32]. Natural language engines, on the other hand, are machines trained on extensive bodies of text and programming code to generate new content and provide suggestions. The integration of natural language AI in programming has revolutionized the way programmers approach coding tasks. These AI agents, such as 'Codex' developed by OpenAI, can generate code from high-level task descriptions and even provide explanations for code snippets and programs. While Codex and its related applications, like the 'ChatGPT' chatbot, show promising potential, limited empirical research is available regarding their future role in education since they are relatively recent prototypes [21].

According to OpenAI, Codex is a versatile programming model applicable to various programming tasks, proficient in multiple languages including those commonly used for teaching beginners, like Python, JavaScript, and Ruby. Codex has also found utility in software applications, such as GitHub Copilot, assisting professional developers in code reviews. However, there is a growing need to understand the implications of employing these tools in computer programming education [36], including concerns of potential misuse by students to complete assessments. Recent research has explored topics such as developing educational content and assessments with AI program generators [9] and investigating the effectiveness of co-creation with AI in programming education for fostering creativity and design skills [13]. While some studies have examined Codex, others have focused on tools like Copilot. For the purposes of this review, the term "AI program generators" encompasses both Codex and Copilot.

This review has several aims. Firstly, it aims to evaluate the opportunities and risks associated with instructional approaches to learning computer programming, with a focus on the implications of using AI tools. Secondly, it aims to discuss the potential challenges in assessing student performance when AI tools are involved in the learning process. Lastly, it seeks to assess the practical application of AI trained on public data in educational settings and examine the potential adverse effects that may arise. By addressing these objectives, the study aims to contribute to the existing literature and provide valuable insights into the implications of machine learning (ML) and artificial intelligence (AI) tools in the field of computer science education.

The current literature on AI tools in education is still in its early stages. Although a limited number of studies specifically focus on AI tools for computer programming, it is important to note that most of these have concentrated on older populations rather than secondary or high school students. There is therefore little empirical research on novice programming instruction with AI tools to draw on.

Significant efforts have been made to enhance programming education, prompted by research that revealed the limited access to computer science education in both the US and the UK, particularly among low-income students, students of color, and young women [16]. It became evident that despite technology's pervasive influence in our daily lives, only those with significant wealth and privilege had access to computing education. The reintroduction of Computer Science into the national curriculum in England in 2014, the invention of the Raspberry Pi computer, the "CS for All" movement and the launch of Code.org's hour of code activities are just a few examples of initiatives aiming to teach computer science and programming in ways that are culturally responsive, engaging for diverse students and communities, and more inclusive.

However, computer programming is recognized as one of the most challenging aspects of learning computer science [15,18], and there has been a growing interest in teaching programming to young learners over the past decade. Researchers and educators have been dedicated to better understanding the difficulties that students face in programming. Research in this area covers a broad range of topics, including contextual barriers, poor perceptions of computer science, bugs, and misconceptions in program construction. One area of agreement among scholars is the high mental effort required by learners when they begin to learn programming [12,18,31]. Scholars have explored a number of approaches, drawing on both individual and social learning theories. Cognitive load theory provides a valuable framework for examining the challenges faced by novice programmers and offers opportunities for improving instructional approaches; it has been selected as the underpinning theory for this study to provide a focus for theorizing the impact of AI program generators on learning to program. Cognitive load theory builds on the suggestion that human memory has two distinct areas: short-term working memory and long-term memory. Working memory is limited, and learners devote most of their cognitive resources to it when processing new information. Over time, disparate information elements connect with current understanding into collections of related knowledge called schemas. Stresses build on working memory, mainly when learners are presented with new material [2]. Sweller's [14] research suggests that during a learning episode, two critical stresses or cognitive loads act on the learner: (1) intrinsic load, which relates to the complexity of the learning task and the learner's existing understanding, and (2) extraneous load, the additional stress placed on the learner by external conditions in the learning environment.


There are several ways an educator can impact students' cognitive load whilst learning to program. First, they might consider how to present information related to a learning task to avoid additional load. Second, they can use worked examples in the activity design [2] to help scaffold the learning. Third, collaborative techniques such as pair programming may distribute the cognitive load among learners [31]. Using these perspectives for teaching computer programming and reducing cognitive load emerging from the literature, the following teaching and learning theories have been selected to explore the impact AI tools may have on these approaches: (1) Program comprehension and Schulte's 'Block Model;' (2) Vygotsky's concepts of 'the zone of proximal development' (ZPD), and the 'more knowledgeable other' (MKO) [34]; and (3) Papert's constructionism [24].

Opportunities and Risks Concerning Instructional Approaches

• Program Comprehension and the Block Model

Learning to program is concept-rich, leading to cognitive overload, which requires novices to have a secure mental model of computation [12,31]. Program comprehension is essential in learning to program to overcome difficulties novices face connecting conceptual knowledge with programming practice [28]. The hypothesis presented here asks whether the use of AI program generators in programming instruction aids learning through program comprehension.

There are several code comprehension pedagogical models. For example, Izu et al. [12] identified more than 60 activities to support novices in developing their program comprehension. These include the 'block model' developed by Schulte [27], which describes how beginners understand a program through reading. The block model is helpful for understanding and categorizing the core aspects of program comprehension. The model is expressed as a table (Figure 1) covering three dimensions across four levels, where columns represent the aspects of program comprehension and rows the hierarchical levels [28].

The three dimensions of comprehension in the block model are text surface, program execution and function. A computer program is a piece of text made of letters, symbols, and numbers; this is called the text surface and is concerned with the grammar and syntax of a program. When a program is executed, it becomes dynamic and may behave in different ways depending on its inputs—also known as program execution. Finally, the purpose of the program is defined by its function [20]. The Block Model provides twelve zones of program comprehension in all. Sentance et al. [31] assert that if educators can devise activities or ask questions about a program that develops computational knowledge in each of the twelve zones of the Block Model, it would better support students' understanding of computer programs, reducing the cognitive load.

Research is required to understand and test whether AI program generators support or disrupt the block model as a method of instruction; however, their use in code comprehension could aid the three dimensions of the block model. For example, a grasp of the text surface dimension requires novices to discern the meaning of text with unfamiliar terms, structures, and syntax. A popular instructional strategy is identifying code aspects within the text, in some cases by creating a sorting activity, where learners need to construct the program using text snippets [23]. Copilot can produce programs quickly, providing novices with experience in different ways to solve a programming problem and, crucially, providing opportunities to identify pieces of code. The AI program generator can help novices make sense of the text by identifying examples of variables, conditions, and functions. Similarly, novices could investigate the effect of swapping two lines of code or introduce bugs to see how the AI program generator resolves them, developing mental models of program execution.
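As an illustration of this kind of comprehension activity (a hypothetical example, not drawn from the studies cited above), a novice could be asked to label the parts of a short AI-generated Python program and then predict the effect of reordering its lines:

```python
# A short program of the kind an AI generator might produce.
# Novices can be asked to identify each aspect of the text surface:

def fahrenheit_to_celsius(f):        # function definition
    return (f - 32) * 5 / 9          # expression using a parameter

temperature = 100                    # variable assignment
converted = fahrenheit_to_celsius(temperature)

if converted > 30:                   # condition
    print("Hot day:", converted)
else:
    print("Mild day:", converted)

# Comprehension prompt: what happens if the assignment to
# `temperature` is moved below the call to the function?
# (A NameError: the variable is used before it is defined.)
```

Labeling the function, variable, and condition exercises the text surface dimension, while predicting the effect of the swap exercises program execution.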

To explore the function dimension, educators could ask novices to explain or act out a program generated by the AI application. In addition, educators could instruct students to compare multiple programs developed by the AI application to solve a particular set of problems to find which are functionally equivalent. Finally, from experience using Copilot, it is clear that the code it generates contains some inconsistencies, which provide further opportunities for learning as students evaluate and interrogate the code.
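For example (an illustrative sketch, not taken from the studies cited), students could be shown two candidate programs of the kind an AI generator might return for the same task and asked whether they are functionally equivalent:

```python
# Two ways an AI generator might sum the even numbers in a list.

def sum_evens_loop(numbers):
    # Explicit loop with an accumulator.
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n
    return total

def sum_evens_comprehension(numbers):
    # Generator expression passed to the built-in sum().
    return sum(n for n in numbers if n % 2 == 0)

# Comprehension prompt: do both functions return the same result
# for every list of integers, including the empty list?
print(sum_evens_loop([1, 2, 3, 4]))           # 6
print(sum_evens_comprehension([1, 2, 3, 4]))  # 6
```

Deciding that the two versions compute the same function, despite their different text surfaces, is precisely the kind of reasoning the function dimension asks of novices.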

These instruction methods for learning to program that focus on reading code could lower cognitive load [31]. By acting out programs, students engage in a kinesthetic and embodied learning experience. They physically simulate the execution of code, step by step, which helps them develop a better understanding of how the program functions. This approach allows students to visualize the flow of instructions and identify potential errors or areas of improvement in their code. By physically enacting the program, students can experience a more intuitive grasp of programming concepts, leading to a reduction in cognitive load. By comparing AI-generated computer programs, students can examine and analyze the code generated by AI program generators, alongside their own code. This comparison allows them to gain insights into alternative approaches, different programming styles, and more efficient solutions. Through this process, students can identify patterns, best practices, and innovative techniques, which can enhance their own programming skills and reduce the cognitive effort required for problem-solving.

However, using an AI code generator for program comprehension may be superfluous as a means of lowering novices' cognitive load, as educators can achieve the same results through preparation and pedagogy without needing a computing device, for example through an unplugged activity [3]. Moreover, using an AI program generator only works if the programs returned are simple enough for novices to engage with and understand. This seems unlikely, as the training data set currently used to train AI program generators is derived from thousands of GitHub repositories of expert developer-generated code. Scaffolded worked examples that educators have carefully curated reduce cognitive load. Gujberova and Kalas [11] recommended a sequence of carefully graded learning activities where learners read and interpret each line of code gradually. Off-the-shelf code solutions provided by experts, on the other hand, which have not been scaffolded for novices, may be more likely to exacerbate the cognitive load on novice programmers by introducing misconceptions [31]. Similarly, in the early stages of re-introducing computer programming to the curriculum, in my experience, students would often use the website 'StackOverflow.com' to find code. The website acts as a database of questions and answers for expert programmers. Asking for a generic and simple solution often results in several different responses, with different approaches and verbose descriptions, which in turn adds to the cognitive load. Chatterjee et al. [5] found in their study of novice software engineers' use of Stack Overflow that only 27% of the code and 16–21% of the natural language text was useful in helping them read and apply the information to their programming problem. Carefully scaffolded instruction is therefore required to reduce the cognitive load of creating program solutions, both through reading and applying code, something that requires a skilled educator.

• AI as a More Knowledgeable Other

This section explores how students learn and apply computer science knowledge using AI program generators and ChatGPT, and how practice might be informed by the work of Soviet psychologist and social constructivist Lev Vygotsky through the concepts of the zone of proximal development (ZPD) and the more knowledgeable other (MKO) [34].

The zone of proximal development (ZPD) is defined as "the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem-solving under adult guidance or in collaboration with more capable peers" [34]. When a student is in the zone of proximal development for a particular task, assistance is provided through social interaction and dialogue between students and educators. In applying this concept, educators are encouraged to ensure that a more knowledgeable other (MKO) is present whose skills or knowledge are beyond the student's, and then to provide opportunities for interaction with the MKO so the learner can observe and practice skills [19]. For example, teachers support students by explaining concepts in response to questions and modelling what they require students to do. Here, teachers explicitly use their own expertise as a "more knowledgeable other" [34]. By continuously providing students with tasks in their ZPD, their ability to independently perform a wider range of tasks can be consistently expanded. This approach ensures that students are consistently challenged at a cognitive level, enabling them to enhance their skills and confidently undertake tasks without the need for constant supervision [38]. An AI program generator could act as an MKO for students learning computer science concepts and skills through social interaction, which may therefore lower the cognitive load as described by Zulu et al. [38]. In theory, this seems plausible with tools like Copilot in their current incarnation: the novice describes to the AI application what they want to achieve through a kind of conversation, and the AI application outputs suggestions. The dynamic could be similar to pair programming, where observation and practice can make concrete the complex concepts involved in learning to program.


However, the AI program generator is not another human. It is trained on millions of lines of code shared by experts. If used for programming instruction in secondary schools, it may be more akin to a 'super knowledgeable other,' generating expert-level code and providing the solution without requiring any cognitive load. Avoiding cognitive load in this way may not engender any learning. Furthermore, the AI will inherently lack the understanding of cultural practice and emotional engagement that is part of learning, and part of the sociocultural context of learning that Vygotsky emphasizes [34]. Additionally, AI program generators always try to answer the question posed directly, while a human attempting to support learning might use questioning or provide partial solutions in order to guide the student. For example, when a human is learning with another human, they are able to ask questions of one another and joke with each other in culturally contextualized ways that, in turn, can support deeper learning or help them connect in new ways with the material. This may not be achievable when interacting with AI. Finally, a possible mindset that the AI application is always correct, combined with a lack of understanding of how it works, may lead to adverse learning outcomes and experiences [13] and disrupt the confidence of secondary school students. Research analysis by Webb et al. [35] suggests that designers of AI applications built on machine learning should include the capability for the system to explain its decisions to users. From an ethical and accountability point of view, this would be wise; moreover, it could help dispel the myth that AI applications are like humans [17]. Since the launch of ChatGPT in November 2022, there have been anecdotal reports that, due to its question-and-answer nature, it is a more useful tool than Codex or Copilot as an MKO in generating programs.
A user can ask ChatGPT why it chose to present a particular solution, and it will return a natural language answer. There are examples of ChatGPT acting as a tool to help a user debug their code (Figure 2). It checks the code, explains what the bug is, and provides a solution to fix it. If the explanation appears too complex for a user, ChatGPT can be instructed to explain concepts to a particular age group, and it will return an age-appropriate response. These are examples of modelling and dialogue between a novice and an MKO, which could result in better attainment of knowledge about computer programming and reduce the cognitive load on learners.
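To illustrate the kind of exchange described here (a hypothetical example, not the one shown in Figure 2), a novice might submit a short program with a subtle bug and receive an explanation alongside a corrected version:

```python
# Buggy version a novice might submit: intended to average a list
# of scores, but divides by the wrong count.
def average_buggy(scores):
    total = 0
    for s in scores:
        total += s
    return total / (len(scores) - 1)   # bug: should divide by len(scores)

# The kind of corrected version a chatbot might return, together
# with an explanation that the divisor excluded one element.
def average_fixed(scores):
    return sum(scores) / len(scores)

print(average_buggy([4, 6, 8]))  # 9.0 (wrong)
print(average_fixed([4, 6, 8]))  # 6.0 (correct)
```

The value for the learner lies less in receiving the fix than in the accompanying explanation, which models how an MKO would talk through the fault.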

• Constructionism

A well-established approach to learning computer programming is through student-centered discovery and project-based learning, which involves students creating and implementing their ideas by integrating various knowledge [24]. In a study on coding activities based on constructionism, Papavlasopoulou et al. [22] discovered that students who exhibited high levels of engagement and motivation while constructing their artefacts demonstrated gaze behavior that indicated reduced cognitive overload. The coding process requires significant effort and presents challenges, necessitating intrinsic motivation and perseverance from children as they engage in learning. Hence, the design of programming activities plays a critical role in novices' cognitive load. If an activity is too complex, students may experience cognitive overload, which can negatively impact their attitudes towards learning to program and lead to difficulties [22]. Papert [24] describes a qualitative problem-solving approach where individuals continuously experiment and refine their artefacts until they are completed, which, if implemented and supported by educators, can reduce cognitive load. Implementing this approach can broaden skills such as collaboration, design, and prototyping, as well as being an effective way to create an engaging learning environment [30]. Resnick [26] argues that while solving puzzles and problems may help develop the cognitive processes to learn to program, creating projects takes learners further, developing their voice and identity and helping students develop as creative thinkers. Using an AI code generator and a constructionist approach, students can theoretically be more experimental in their learning than starting from a blank slate, argues Resnick [26], echoing Papert's vision that young learners should be seen as 'active constructors' rather than 'passive recipients' of knowledge.
Using worked examples or chunks of code to construct programs to achieve their desired outcome can empower novices, keep them engaged in learning and reduce the cognitive load by removing difficulties concerning the text surface, like using the correct syntax.

A consistent risk for learning described by studies that use constructionist approaches is student resilience. As described earlier, programming requires high mental effort, even if the experience is positive and engaging [31]. Interviews with students participating in physical computing activities show that they found the tasks hard, describing the experience as "a steep learning curve" [30, p. 6]. Could the use of AI program generators promote creativity and construct knowledge by taking care of the blockers of programming, keeping students engaged, which is a precursor to reducing cognitive load [22]? Jonsson and Tholander [13] found that some participants in their study approached the AI application as a tool to speed up the process of programming, whilst others treated it in a more open-ended fashion. The research showed that the generative and open-ended approach created a collaborative and conversational process between the participant and the AI application. Codex, in effect, worked as a co-participant in the creative design process [13]. This evidence suggests that using an AI program generator could make learning to program more enjoyable and offer a way to avoid being blocked by the text surface described in the block model. It also relates to Vygotsky's social learning theory, with the AI acting as an MKO. This approach also better reflects how expert programmers work. The creators of Codex, OpenAI [21], suggest that professionals who write computer code often break problems down into smaller problems and then find existing code libraries or APIs to use to solve those problems. It is likely that the cognitive load of writing computer code, even at the expert developer level, is high. OpenAI believes this activity is a high barrier to entry into the profession, as it is time-consuming and not a particularly creative endeavor. Writing code for novices in secondary school can also be time-consuming.
However, project-based learning coding activities based on constructionism may offer students more creative agency and better learning outcomes. Resnick [26] suggests that as students work on projects, they gain an understanding and experience of the creative process, whilst also connecting with their own interests.


Working independently on programming has been suggested to have a higher cognitive load than working collaboratively through pair programming [31]. Thinking of the AI program generator or ChatGPT as a collaborator in conversation in a tangible creative project could support novices in their learning. Creativity can often be limited by an individual's access to knowledge or materials. However, by working alongside AI applications, creativity can be promoted by removing the limitations and through the joint actions of participants and the system [13].

Using AI program generators through a project-based and more open-ended approach may also create new challenges that impact a novice's learning to program. Allowing the AI to do the heavy lifting in constructing a program requires the human collaborator, our novice programmer, to instruct the system to achieve the desired results. The study presented by Jonsson and Tholander [13] found that participants had to put significant effort into formulating their query for Codex to generate valid code. Therefore, younger learners and novices may not yet have developed the particular "economy of language" [13] needed to speak to the AI program generator. This "economy of language" may negatively impact the learning experience and be a barrier to developing the concepts and skills of programming a system. Again, ChatGPT has the potential to overcome this challenge as it can be instructed to provide age-appropriate responses. Similarly to the web searching and research skills that have been added to curricula over the past several decades, techniques to get the most from AI program generators could be taught in future courses.

Opportunities and Risks Regarding Assessment

AI program generators such as Codex and Copilot may aid computer science education. However, many educators will likely have more severe and immediate concerns with these tools in the hands of their students [9] relating to plagiarism and cheating in summative assessments. A common way to assess computer programming concepts and skills is through lab-based tasks and unit tests where software "judges" programs submitted by users, by compiling, executing, and evaluating the code. It is also common practice for software engineering job interviews. The following sections look at the implications for student integrity and educator approaches to assessment by first summarizing the research relating to plagiarism with AI, the moral and ethical implications that may arise from its use, and the possible over-reliance on a tool that may have been trained with insufficient data.

• Plagiarism and Cheating

Academics cite plagiarism as having a higher occurrence in computer science than in many other disciplines, with an estimated 80% of computing students involved [9]. AI program generators such as Codex and, more recently, Copilot are freely available for students and educators [10] and, hypothetically, are already being used by CS students at the undergraduate level. Finnie-Ansley et al. [9] assessed the accuracy of the Codex AI engine when applied to introductory programming tests used at the authors' institution, the University of Auckland, New Zealand, to assess undergraduates in the early stages of their CS course. The "results show that Codex performs better than most students on code-writing questions in typical first-year programming exams" and "solutions generated by Codex appear to include quite a lot of variation, which is likely to make it difficult for instructors to detect" [9, p.16]. Therefore, to mitigate this risk, educators need training in basic AI and ML literacies and to rethink how these kinds of items can be used in assessment, before deciding on how they teach and assess learning using AI and ML tools. A description of basic AI and ML literacies that could be incorporated into teacher professional development is suggested by Webb et al. [35]. They include understanding: the machine learning processes that support learning, how to act responsibly within society, and critically assessing the ethical implications that AI systems trained on data raise. Yang [36] argues that ideas concerning the ethics of AI can be developed by participating in learning activities as part of an AI and ML literacy curriculum. However, Tedre et al. [32] claim that participation is insufficient and that educators must rethink conceptual knowledge through a new ML and AI paradigm.

• Accountability for Learning Achievements

AI applications trained on open data may pose accountability challenges for learning achievements [35]. For example, it may not be clear in advance who is driving the decision-making in formulating a computer program and who takes responsibility for those decisions. Webb et al. [35] highlight the differences between artificial intelligence decision-making and human decision-making, particularly from a moral and ethical point of view. Measuring student achievement of CS concepts through programming exercises and deciphering accountability for aspects of the program may be problematic and complex for educators. Human decision-making involves a complex set of judgements and feelings relating to the context or scenario in front of them. AI decision-making, based on ML, does not have the same flexibility or empathy.

Intellectual property rights pose another consideration for use in secondary school contexts. For example, tools trained on open data may violate the legal rights of creators. Codex is trained on code from a large open-source repository [10]. The code creator or contributor has not necessarily provided permission under the terms of an open license for their code to be used to train an AI engine or be used to create new programs. On 17 October 2022, a class-action lawsuit was filed against GitHub, Microsoft, and OpenAI in US federal court on behalf of millions of GitHub users. The filed suit claims to be the first class-action case in the US challenging the training and output of AI systems [4], which is likely to have far-reaching implications for the development and use of AI technologies. It may take some time for a judgement to be reached. In the meantime, if secondary school students and educators are using AI program generators to learn to program, then consideration should be given to who owns the code, whether its use is permissible under open-source licensing, and who is accountable for the source of the code output.

Consider accountability in terms of assessment, where students may use artificial intelligence applications collaboratively to create an output. For example, how can an educator set marking criteria, such as assessment rubrics, for collaborative projects with an AI application? However, the same could be asked of two students collaborating. Resnick [26] argues that education assesses students' progress based on evidence, typically expressed as numbers and statistics. Instead, education should also value documenting what students learn by encouraging reflective practice that illustrates what they have created, what others have contributed, how they created it, and why. This reflective and evidence-based approach may support learning and mitigate accountability concerns around student achievement if AI program generators are used in creating computer programs; however, recent AI applications such as ChatGPT are able to provide text that students could use as evaluation and reflection for the purposes of assessment. This is akin to Papert's [24] qualitative problem-solving approach where individuals continuously experiment and refine their artefacts until they are completed, which reduces the cognitive load of a learner and increases their motivation. However, this shortcut to complete a task that is designed to take time, and to build on the experiences of learning, may be detrimental to the learning experience described by Resnick [26].

• Accuracy of Outputted Code

Another consideration for novice programmers using AI program generators is a possible over-reliance on the output code [6]. Codex, for example, is trained on crowd-sourced, publicly available code written by humans and posted to the website GitHub. Solutions developed by the AI application may therefore use poor style, tackle programming problems in ways that are sub-optimal for building mental models of computation, and introduce misconceptions that persist. Finnie-Ansley et al. [9] compared Codex in the hands of students to a power tool in the hands of a novice: whilst novices and students may not wish to cause harm using a tool, they may inadvertently do just that. On the other hand, knowing that output code is likely to be inaccurate could aid learning by emphasizing the importance of reading code and applying a code comprehension approach that investigates code snippets [31].
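To make this concrete, here is a hypothetical, AI-style snippet of the kind a code comprehension exercise might use. The function, its bug, and the marks data are invented for illustration, not drawn from any generator's actual output: the code looks plausible at a glance, but reading it line by line reveals an off-by-one error.

```python
# Hypothetical AI-style output: compute the average of a list of marks.
# It looks plausible, but close reading reveals an off-by-one bug:
# the loop stops one short and skips the final mark.
def average_buggy(marks):
    total = 0
    for i in range(len(marks) - 1):  # bug: should be range(len(marks))
        total += marks[i]
    return total / len(marks)


# A corrected version, as a student might write after reading the code.
def average(marks):
    total = 0
    for mark in marks:
        total += mark
    return total / len(marks)


print(average_buggy([10, 20, 30]))  # 10.0 — wrong
print(average([10, 20, 30]))        # 20.0 — correct
```

Spotting why the two results differ is exactly the kind of code-reading activity a comprehension-first approach [31] would ask of students before they trust generated code.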

Being a responsible citizen means having integrity, a social construct under which cheating on tests is a negative attribute. However, society may rethink what cheating is with the introduction of AI applications and code generators. Furthermore, once aware of the risks AI applications pose, educators could think more about assessment methods and place more value on program comprehension and on students' explanations of code to demonstrate their computer science competency. However, more than basic AI literacy training may be required if educators and students have already constructed a mental model that an AI application is always correct [13].

Safeguarding Risks Concerning the Use of AI Program Generators

Selwyn [29] argues that using AI products can result in social harm in educational contexts, especially for minority groups, because AI models are trained on already discriminatory data, further amplifying that harm. Facial recognition systems, for example, regularly fail to recognize students of color [29]. This concern is echoed in Chen et al.'s [6] evaluation of Codex, which found that generated code can have a structure that reflects stereotypes about gender, race, class, and other protected characteristics. Its use in secondary CS education could therefore exacerbate the inequality experienced by marginalized groups and individuals. More concerning was Chen et al.'s [6] discovery that Codex "can be prompted in ways that generate racist, denigratory, and otherwise harmful outputs as code comments, meriting interventions." In educational contexts, it is the responsibility of schools to safeguard young and vulnerable individuals; therefore, a tool that can generate harmful comments could not and should not be used. Additionally, a student experiencing discrimination may feel discouraged from learning, which directly impacts their cognitive load, underpinned by engagement and intrinsic motivation [22].

Nevertheless, AI applications are still learning, and products like AI code generators are still evolving. For example, Copilot's creator GitHub has included filters to block language that could offend. It also provides a mechanism for reporting offensive outputs directly, stating a commitment to addressing the challenge. Whether these interventions safeguard the young and marginalized groups needs investigating.

Initial evaluations of AI products trained on code have additionally found that the programs produced are not always reliable or accurate; they can even suggest insecure code [6]. Their use in education systems could therefore present challenges for infrastructure if an executed program damages computer systems and networks, although this risk is also present when relying on code found on the internet or code written directly by learners. Chen et al. [6] suggest that "human oversight and vigilance is required for safe use of code generation systems." This assumption requires educators to have significant knowledge and experience, which may be lacking for schoolteachers of an emerging subject like CS. GitHub confirms that Copilot "may contain insecure coding patterns, bugs or references to outdated APIs or idioms." [10] GitHub suggests that combining a professional developer's knowledge and judgement with established testing and code review practices, alongside security tools, should mitigate this risk. However, novice programmers and their instructors may lack the understanding, training, or experience to recognize that code is unsafe. Again, the cognitive load of novice programmers may increase if code output by an AI is continually incorrect, lowering their engagement and motivation.
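As an illustration of what "insecure coding patterns" can mean in practice, consider the following hypothetical sketch; the table, queries, and payload are invented for this example, not taken from Copilot output. Building SQL by string interpolation, a pattern a generator can reproduce from its training data, permits injection, whereas the parameterized query alongside it does not.

```python
import sqlite3

# Set up a toy in-memory database for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


# Insecure pattern sometimes seen in generated code: building SQL by
# string interpolation, which lets crafted input rewrite the query.
def find_user_insecure(name):
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()


# Safer, idiomatic pattern: a parameterized query, where the driver
# treats the input strictly as data.
def find_user_safe(name):
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


# A crafted input that defeats the insecure version but not the safe one.
payload = "' OR '1'='1"
print(find_user_insecure(payload))  # returns all rows: [('alice',)]
print(find_user_safe(payload))      # returns no rows: []
```

A novice, or an instructor new to CS, could easily accept the first version: it runs, and for ordinary inputs it returns the same results as the second.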

Although research shows that output code from AI code generators is often inaccurate, unreliable, and can cause social harm and safety concerns [6], it does not seem to deter professional programmers and software engineers from using them. One user is quoted on the landing webpage for GitHub's Copilot, saying it "works shockingly well. I will never develop software without it again." [10] Studying the impact on professionals would be an interesting comparison for research. How are mental models of computation affected, if at all? How do experts converse with AI applications to achieve a desired result? What could be learned from a professional's approach that could support learners?

Mitigating the risks that AI presents to student wellbeing has begun with engaging young people in discussions relating to AI. Incorporating children's perspectives and values is crucial for the advancement and implementation of AI technology: their insights can inform ethical practices and ensure that AI development aligns with their needs and values [1,35]. UNICEF [33] has created a set of nine principles that prioritize children's perspectives and establish the basis for child-centered artificial intelligence (AI). This approach places children's voices at the forefront and emphasizes their active participation in all stages of the AI lifecycle. Child-centered AI is not solely focused on minimizing harm but also aims to foster innovative and beneficial approaches in the design, development, and implementation of AI technologies [8].

Firstly, the cognitive load of students learning to program has the potential to be reduced by utilizing AI programming tools alongside well-established teaching practices such as code reading and comprehension, social learning, and constructionism. While this study focuses specifically on the cognitive load of novice programmers in the act of learning, it is hard not to also ask what exactly students learn when using AI program generators. Are they simply learning how to navigate an AI tool, or are they learning critical problem-solving skills and key aspects of algorithmic thinking as well? While cognitive load may be decreased, what are students ultimately gaining intellectually and academically when using AI program generators? Additionally, does the use of AI tools diminish the value of computing as a subject for further study? These themes would be important to explore in greater depth in future research to better understand the impact on novice programmers' ability to learn conceptual knowledge of CS as well as their level of engagement.


It is evident that AI program generators were not originally developed with educational contexts in mind. They primarily aimed to support software developers in tackling complex tasks with the assistance of an intelligent collaborator. However, given the launch of tools like ChatGPT and the recent introduction of Ghostwriter AI by Repl.it [25], the impact on education is becoming increasingly apparent. Secondary school educators face significant challenges in terms of instructional practices, assessment methods, and student safeguarding across various subjects. Developers of educational AI code generators should create dedicated versions trained on data curated by educators themselves. This would give educators more confidence in the quality of the generated code, ensuring it supports the development of mental models of computing. Additionally, the involvement of policymakers in formulating a 'Code of Conduct' for users and developers is recommended to establish accountability in the use of these tools in educational settings. Furthermore, ethical considerations should be prioritized by developers from the outset.

The discourse on AI program generators should swiftly transition into policy and practice. Educators and policymakers must raise awareness of the potential impact and establish supportive practices to guide teaching and assessment. Whilst the focus of this study has not been on developing understanding of AI within computing education contexts, the risks highlighted in this paper demonstrate that issues of ethics and social responsibility are increasingly important. Basic AI literacy should become an essential component of educators' professional development, allowing them to stay informed about new developments, ensure student safety, understand accountability, and enhance their pedagogical practices. Curriculum content should also be expanded to teach basic AI literacy in schools to better prepare young people for work.

Acknowledgements

The author would like to acknowledge the Centre for Research in Education in Science Technology, Engineering & Mathematics (CRESTEM) at King's College London, which supported the author in preparing and submitting this article.

References

1. Aitken, M. and Briggs, M. AI, data science, and young people. In Understanding computing education (Vol 3). Proceedings of the Raspberry Pi Foundation Research Seminars, (2022); http://rpf.io/seminar-proceedings-vol-3-aitken-briggs; accessed 2023 Jan 10.

2. Bannert, M. (2002). Managing cognitive load—recent trends in cognitive load theory, Learning and Instruction, 12, 1 (2002), 139–146; https://doi.org/10.1016/S0959-4752(01)00021-4.

3. Bell, T., Alexander, J., Freeman, I., and Grimley, M. Computer science unplugged: School students doing real computing without computers. The New Zealand Journal of Applied Computing and Information Technology, 13, 1 (2009), 20–29.

4. Butterick, M. and Joseph Saveri Law Firm (2022). 'GitHub Copilot litigation' https://githubcopilotlitigation.com; accessed 2022 Nov 30.

5. Chatterjee, P., Kong, M. and Pollock, L. Finding help with programming errors: An exploratory study of novice software engineers' focus in stack overflow posts, Journal of Systems and Software, 159, 110454 (2020); https://doi.org/10.1016/j.jss.2019.110454.

6. Chen, M. et al. Evaluating Large Language Models Trained on Code, arXiv, 2021 http://arxiv.org/abs/2107.03374; accessed 2022 Oct 25.

7. Cooper, G. Examining science education in chatgpt: An exploratory study of generative artificial intelligence. Journal of Science Education and Technology, 32, 3 (2023), 444–452; https://doi.org/10.1007/s10956-023-10039-y.

8. Data Protection Working Party (2009, February 11). Opinion 2/2009 on the protection of children's personal data (General Guidelines and the special case of schools). https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2009/wp160_en.pdf; accessed 2023 Jan 10.

9. Finnie-Ansley, J., Denny, P., Becker, B.A., Luxton-Reilly, A. and Prather, J. (2022). The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming, Australasian Computing Education Conference. ACE '22: Australasian Computing Education Conference, Virtual Event Australia: ACM, 10–19; https://doi.org/10.1145/3511861.3511863.

10. GitHub Copilot. (2022, August). Your AI pair programmer, https://github.com/features/copilot; accessed 2022 Nov 21.

11. Gujberova, M., and Kalas, I. Designing productive gradations of tasks in primary programming education, in Proceedings of the 8th Workshop in Primary and Secondary Computing Education, WiPSE '13, (New York, NY, USA, ACM, 2013) 108–117; https://doi.org/10.1145/2532748.2532750.

12. Izu, C., Schulte, C., Aggarwal, A., Cutts, Q., Duran, R., Gutica, M., Heinemann, B., Kraemer, E., Lonati, V., Mirolo, C. and Weeda, R. Fostering program comprehension in novice programmers - learning activities and learning trajectories, Proceedings of the Working Group Reports on Innovation and Technology in Computer Science Education, ITiCSE-WGR '19 (Aberdeen, Scotland, UK: ACM, 2019), 27–52; https://doi.org/10.1145/3344429.3372501. Scientific figure on ResearchGate: https://www.researchgate.net/figure/The-Block-Model-Matrix_fig1_339040166; accessed 2023 Jan 13.

13. Jonsson, M. and Tholander, J. Cracking the code: Co-coding with AI in creative programming education, in Creativity and Cognition. C&C '22: Creativity and Cognition, (Venice Italy, ACM, 2022), 5–14; https://doi.org/10.1145/3527927.3532801.

14. Kirschner, P.A., Sweller, J. and Clark, R.E. Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching, Educational Psychologist, 41, 2 (2006), 75–86; https://doi.org/10.1207/s15326985ep4102.

15. Kuljis, J., and Baldwin, L. P. (2000). Visualisation techniques for learning and teaching programming. Journal of Computing and Information Technology, 8, 4 (2000), 285–291; https://doi.org/10.2498/cit.2000.04.03.

16. Margolis, J. Stuck in the Shallow End, updated edition: Education, Race, and Computing. MIT Press (2017).

17. Marx, E., Leonhardt, T. and Bergner, N. Brief Summary of Existing Research on Students' Conceptions of AI, Proceedings of the 17th Workshop in Primary and Secondary Computing Education. WiPSCE '22: The 17th Workshop in Primary and Secondary Computing Education, (Morschach Switzerland: ACM, 2022), 1–2; https://doi.org/10.1145/3556787.3556872.

18. Mayer, R.E. The psychology of how novices learn computer programming. ACM Computing Surveys (CSUR), 13, 1 (1981), 121–141; https://doi.org/10.1145/356835.356841.

19. McLeod, S.A. What Is the zone of proximal development? Simply Psychology (2019). www.simplypsychology.org/Zone-of-Proximal-Development.html; accessed 2022 Nov 4.

20. National Centre for Computing Education (NCCE) (2020). Pedagogy Quick Reads: Understanding program comprehension using the Block Model. National Centre for Computing Education & Raspberry Pi Foundation, https://ncce.io/qr12; accessed 2022 Nov 4.

21. OpenAI ChatGPT: Optimizing Language Models for Dialogue, OpenAI.com (2022, 30 November) https://openai.com/blog/chatgpt/; accessed 2023 Jan 2.

22. Papavlasopoulou, S., Giannakos, M.N., and Jaccheri, L. Exploring children's learning experience in constructionism-based coding activities through design-based research. Computers in Human Behavior, 99, (2019), 415–427; https://doi.org/10.1016/j.chb.2019.01.008.

23. Parsons, D., and Haden, P. Parson's programming puzzles: a fun and effective learning tool for first programming courses, Proceedings of the 8th Australasian Conference on Computing Education-Volume 52 (2006), 157–163.

24. Papert, S. Mindstorms: Children, Computers, and Powerful Ideas (2nd Ed. 1993). Basics Books, A Member of The Perseus Books Group.

25. Repl.it Ghostwriter AI FAQ. Repl.it.com (2022, 7 November) https://docs.replit.com/ghostwriter/faq; accessed 2022 Nov 9.

26. Resnick, M. Lifelong Kindergarten: Cultivating Creativity through Projects, Passion, Peers and Play. MIT Press, 2017.

27. Schulte, C. Block Model: an educational model of program comprehension as a tool for a scholarly approach to teaching, Proceedings of the 4th International Workshop on Computing Education Research - ICER '08, (Sydney, Australia: ACM Press, 2008), 149–160; https://doi.org/10.1145/1404520.1404535.

28. Schulte, C., Clear, T., Taherkhani, A., Busjahn, T. and Paterson, J.H. (2010). An introduction to program comprehension for computer science educators, Proceedings of the 2010 ITiCSE working group reports on Working group reports - ITiCSE-WGR '10. The 2010 ITiCSE working group reports (Ankara, Turkey, ACM Press), 65; https://doi.org/10.1145/1971681.1971687.

29. Selwyn, N. (2022). The future of AI and education: Some cautionary notes. European Journal of Education, 2022, pp. 1–12; https://doi.org/10.1111/ejed.12532.

30. Sentance, S. and Schwiderski-Grosche, S. Challenge and creativity: using .NET gadgeteer in schools, Proceedings of the 7th Workshop in Primary and Secondary Computing Education on - WiPSCE '12. The 7th Workshop in Primary and Secondary Computing Education, (Hamburg, Germany: ACM Press, 2012), 90; https://doi.org/10.1145/2481449.2481473.

31. Sentance, S., Waite, J. and Kallia, M. Teaching computer programming with PRIMM: a sociocultural perspective, Computer Science Education, 29, (2019), 2–3, 136–176; https://doi.org/10.1080/08993408.2019.1608781.

32. Tedre, M., Denning, P. and Toivonen, T. CT 2.0, 21st Koli Calling International Conference on Computing Education Research. Koli Calling '21: 21st Koli Calling International Conference on Computing Education Research, (Joensuu Finland: ACM, 2021), 1–8; https://doi.org/10.1145/3488042.3488053.

33. UNICEF (2020, September). Policy guidance on AI for children. UNICEF and the Ministry of Foreign Affairs of Finland. https://www.unicef.org/globalinsight/media/1171/file/UNICEF-Global-Insight-policyguidance-AI-children-draft-1.0-2020.pdf; accessed 2023 Jan 10.

34. Vygotsky, L. S., and Cole, M. Mind in Society: Development of higher psychological processes. Harvard University Press, (1978).

35. Webb, M.E., Fluck, A., Magenheim, J. et al. Machine learning for human learners: opportunities, issues, tensions and threats, Educational Technology Research and Development, 69, 4 (2021), 2109–2130; https://doi.org/10.1007/s11423-020-09858-2.

36. Yang, W. Artificial Intelligence education for young children: Why, what, and how in curriculum design and implementation, Computers and Education: Artificial Intelligence, 3 (2022), 100061; https://doi.org/10.1016/j.caeai.2022.100061.

37. Zaremba, Brockman, and Open AI (2021, August 10th). 'Open AI Codex', Open AI.com; https://openai.com/blog/openai-codex/; accessed 2021 Oct 13.

38. Zulu, E., Haupt, T., and Tramontin, V. Cognitive loading due to self-directed learning, complex questions and tasks in the zone of proximal development of students. Problems of Education in the 21st Century, 76, 6 (2018), 864; https://doi.org/10.33225/pec/18.76.864.

Author

Carrie Anne Philbin
School of Education, Communication & Policy
King's College London
Waterloo Road
London, SE1 9NH
[email protected]

Figures

Figure 1. The Block Model Matrix (Izu et al. 2019)

Figure 2. ChatGPT identifying a bug in code along with a natural language description (https://chat.openai.com)

Copyright held by authors. Publication rights licensed to ACM.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2023 ACM, Inc.
