CS instructors recognize that generative AI (artificial intelligence) may substantially impact education. This article describes (1) opportunities, like 24/7 help or auto-grading, (2) challenges, like increased cheating or student isolation, and (3) pitfalls to avoid, like inequitable access or false cheating accusations. The descriptions come from 15 CS educators and practitioners, from diverse institutions, who met at a 2023 NSF workshop on AI in CS education. The article aims to help the CS community take advantage of generative AI to improve courses, embark on new research, treat students equitably, and ultimately improve CS education for instructors and students.

Introduction

Generative AI has attracted extensive attention since the release of ChatGPT (based on GPT-3.5) in late 2022. Instructor discussions of generative AI's implications range from "nothing really new here" to "education will forever be changed." A group of 15 CS educators, researchers, and practitioners gathered in Denver in August 2023 as part of an NSF-sponsored workshop whose goal was to discuss the power of generative AI, the opportunities it presents, the challenges it creates, and the pitfalls that instructors should avoid. This article summarizes the group's key findings. Though not intended as a literature review, the article does include references to related literature where appropriate.

Opportunities

Although discussions about AI often focus on challenges such as increased cheating, AI provides a variety of opportunities to improve learning and teaching.

• Tutor

Perhaps AI's greatest opportunity is to serve as a 24/7 non-judgmental tutor. When asked how to improve a course, many students say they wish they could get more help when stuck on a programming assignment or other homework task, especially at the specific time they need help, which might be late at night or on weekends. Course instructors are not available at all hours, and discussion boards may have long delays. Having to wait for help can cause students to lose valuable time, focus, and momentum. AI, in contrast, is nearly always available. One of the most common requests during teaching assistants' or instructors' office hours is for help debugging code. AI can help debug code, provide hints on what to try next, suggest improvements, and more. Figure 1 shows an example of a student asking for help finding a bug in a program, along with the AI's response. Readers familiar with C/C++/Java are encouraged to try to find the bug on the left before looking at the AI's response on the right.

It looks like there's a syntax error in your code. The semicolon (;) after the for loop's closing parenthesis is creating an empty statement block, and the following block inside curly braces is executed independently of the loop. Here's the corrected code: …
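Figure 1's exchange occurs in a chat window, but the same kind of tutoring interaction can also be scripted. The following minimal Python sketch, which assumes the openai package, an OPENAI_API_KEY environment variable, and an illustrative model name, sends an LLM a reconstruction of the kind of bug the response above describes (not Figure 1's actual code):

```python
# Minimal sketch of scripting an AI debugging "tutor" (assumptions:
# the openai package is installed and OPENAI_API_KEY is set).
from openai import OpenAI

# Reconstruction of the kind of bug described above: a stray semicolon
# after the for loop's closing parenthesis makes the loop body empty,
# so the braced block runs only once, after the loop finishes.
buggy_code = """
int i;
int sum = 0;
for (i = 0; i < 10; i++); {
    sum = sum + i;
}
"""

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "You are a patient CS1 tutor. Prefer hints to full solutions."},
        {"role": "user",
         "content": "My program prints the wrong sum. Can you help me find the bug?\n"
                    + buggy_code},
    ],
)
print(response.choices[0].message.content)
```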

While debugging is a common area where students request help, another is simply understanding the assigned problem. Figure 2 provides another AI tutor example, in which the student asks for help understanding an assignment. Although the example assignment is short, AI can be surprisingly good at understanding even lengthy assignment descriptions.

Importantly, many students are embarrassed to ask for help from humans, not wanting to be judged as unintelligent [13,18]. However, no such embarrassment exists when students ask AI for help. As such, AI may enable more students to obtain needed help, even in cases where sufficient human help resources are available.

• Auto-Grader

Rapid feedback has long been known to aid learning [19]. However, instructors have limited grading resources, which translates to fewer graded items and long delays in returning grades. AI may help auto-grade student programs [15], short-answer questions on exams [6], and more. AI may grade for correctness, style, efficiency, etc. Such grading enables more thorough assessment while freeing instructor time for higher-value activities. Figure 3 provides a simple example.
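As a concrete illustration (not the actual Figure 3 exchange), the following minimal Python sketch grades a program against a simple rubric via an LLM; the rubric wording, point breakdown, and model name are assumptions for the example:

```python
# Minimal sketch of rubric-based AI grading (rubric and model are
# illustrative; production graders typically also run unit tests).
import json
from openai import OpenAI

RUBRIC = """Grade the student's C program out of 10 points:
- correctness for the stated task (6 pts)
- meaningful variable names (2 pts)
- consistent indentation (2 pts)
Return JSON: {"score": <0-10>, "feedback": "<one short paragraph>"}"""

def grade(task: str, program: str) -> dict:
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"Task: {task}\n\nProgram:\n{program}"},
        ],
    )
    return json.loads(response.choices[0].message.content)

result = grade("Print the sum of 1..10", "int main() { /* student code */ }")
print(result["score"], "-", result["feedback"])
```

Because grading is scripted, the same call can run on every submission, enabling the quick feedback and resubmission workflow described next.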

Students may be given access to such grading as well, yielding quick feedback that aids learning [11] along with an opportunity to quickly correct issues and resubmit for a better score.

On a related note, AI may help instructors detect trends in student work, helping them target their instruction. Figure 4 provides an example of determining common errors in a set of programs; an instructor might discuss those common errors in class, which is known to improve learning [16].

Other examples might include partitioning into groups of similar solutions—as done for example in the Gradescope tool [9]—to assist with grading or to inform instruction.
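A minimal sketch of the class-level analysis idea, again assuming the openai package and an illustrative model, might simply batch submissions into one prompt; for large classes, sampling submissions or summarizing each one first would be needed to fit context limits:

```python
# Minimal sketch of class-level trend analysis in the spirit of Figure 4.
from openai import OpenAI

def common_errors(programs: list[str]) -> str:
    batch = "\n\n---\n\n".join(programs)  # submissions separated by ---
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": "Below are student submissions for one assignment, "
                       "separated by ---. List the three most common errors "
                       "or misconceptions, with rough counts:\n\n" + batch,
        }],
    )
    return response.choices[0].message.content

# An instructor might paste the returned summary into lecture slides
# to discuss the most common misconceptions in class.
```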

• Personalized Instruction

While AI may serve as a tutor providing help, AI can go even further by providing instruction, with that instruction personalized to the student [1]. AI may move more slowly through topics for some students and more swiftly for others. AI may determine areas in which a student needs strengthening and review those areas. Figure 5 provides an example of a student learning binary search, with AI personalizing the instruction to the student.

AI may adjust the instruction to the student's interests as well. Based on a student's history of demonstrated interests, the AI might choose a health-related example, a business-related example, a sports-related example, etc.
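A minimal sketch of how such personalization might be wired up follows: a student profile is folded into the system prompt so that the pace and example domain adapt. The profile fields, prompt wording, and model are illustrative assumptions, not any particular tool's design:

```python
# Minimal sketch of a personalized tutoring loop (profile and model
# are illustrative assumptions).
from openai import OpenAI

profile = {"topic": "binary search", "pace": "slow", "interest": "sports"}

system_prompt = (
    f"You are a CS tutor teaching {profile['topic']}. "
    f"Move at a {profile['pace']} pace, ask one check-for-understanding "
    f"question at a time, and draw examples from {profile['interest']}."
)

client = OpenAI()
messages = [{"role": "system", "content": system_prompt}]
while True:
    student = input("Student: ")
    if student.lower() in ("quit", "exit"):
        break
    messages.append({"role": "user", "content": student})
    reply = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages  # illustrative model choice
    ).choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    print("Tutor:", reply)
```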

Such a conversation could be spoken as well, with the AI's output converted to speech and the student's questions spoken back to the AI.
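For instance, the tutor's replies could be spoken aloud with an off-the-shelf text-to-speech library; this small sketch assumes the pyttsx3 package, and speech input would analogously use a speech-to-text library:

```python
# Minimal sketch of speaking a tutor reply aloud (assumes pyttsx3,
# which wraps each platform's built-in text-to-speech voices).
import pyttsx3

def speak(text: str) -> None:
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

speak("Binary search repeatedly halves the range being searched.")
```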

Some CS educators have begun embracing approaches that use AI actively in learning programming, as discussed in [5,17].

Such personalized instruction and help by AI may reduce equity gaps for students who otherwise are less able to hire a tutor, attend classes or office hours, spend as much time as classmates on homework, etc.

Challenges

While AI provides many opportunities for improved teaching and learning, it also poses challenges that may hinder the learning process.

• Cheating

AI makes "cheating" on homework assignments even easier than before. A recent analysis showed that a student copy-pasting CS1 programming assignments into ChatGPT and copy-pasting the results back, for an entire 15-week semester, could achieve a score of 96% while reducing 50+ hours of work to just over 1 hour [22]. If students are still expected to do homework assignments mostly on their own to benefit their own learning, steps will need to be taken to ensure students are actually doing the work.

One approach is stronger tools to detect cheating, looking beyond similarity with classmates or online code to consider more facets, such as departing from a class's programming style, having inconsistent style across one's own programs, or doing work far faster than classmates (one such signal is sketched below). Another approach is to shift grading from focusing solely on the final solution to focusing more on the steps leading to the final product, which might include starting early, spending sufficient time, developing incrementally, and testing one's own solutions; it might also include describing one's rationale, or presenting one's solution and answering questions. Many other cheating-reduction approaches exist too, such as establishing rapport with students, carefully scaffolding work, providing opportunities to resubmit, discussing integrity, and more.
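As one illustration of the "far faster than classmates" signal, a simple heuristic might compare each student's time on task to the class median. The data format and threshold below are illustrative assumptions, and, per the later discussion of false accusations, such flags should start conversations rather than serve as proof:

```python
# Minimal sketch of one cheating signal: flag students whose time on an
# assignment is far below the class median (threshold is illustrative).
from statistics import median

def flag_fast_finishers(minutes_by_student: dict[str, float],
                        fraction_of_median: float = 0.2) -> list[str]:
    typical = median(minutes_by_student.values())
    cutoff = fraction_of_median * typical
    return [s for s, m in minutes_by_student.items() if m < cutoff]

times = {"ana": 240, "ben": 310, "cy": 12, "dee": 275}  # made-up data
print(flag_fast_finishers(times))  # ['cy'] -- worth a conversation, not an accusation
```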

• Less "Work" in Homework

AI may require entirely rethinking how homework is treated, as it may become clear that young students cannot resist using AI at home. Instructors may decide that unsupervised homework no longer provides a learning experience and may thus shift student work to supervised settings. This may require more scheduled labs for each class, more on-campus drop-in or scheduled lab rooms where work is overseen by proctors, or more online monitoring while students work, such as being watched by a proctor or being video recorded with actions later analyzed by humans or automated tools. Of course, increasing supervision introduces its own challenges, including increased student stress, more institutional cost, and less accessibility for students who cannot come to campus as much or whose home internet is weak.

A related response is already seen today, wherein instructors shift grade weight away from homework and toward proctored exams. Without modifying the approach to exams, this shift may cause more students to fail: many students won't learn from the homework and then will do poorly on the exams. Thus, instructors may also shift their approach to exams, which may include more exams ranging from low to higher stakes, allowing students to take practice exams, having policies for re-taking exams, and more. Instructors may also adopt more mastery-based techniques, where students progress at different paces and thus take exams in different weeks, where students keep re-taking exams until mastering a topic before moving forward, and where grades are based not on performance across all of a course's topics but rather on the number of topics mastered. Such changes may benefit from auto-generated auto-graded exams, computer-based testing facilities, and other features that will require more exam-related resources from universities [20,23].
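Auto-generated, auto-graded exam items need not be elaborate. As a minimal sketch, a per-student seed can already produce unique variants of a templated question with automatic grading; the question template below is illustrative:

```python
# Minimal sketch of an auto-generated, auto-graded exam item: one seed
# per student yields a unique variant of a templated question.
import random

def make_question(seed: int) -> tuple[str, int]:
    rng = random.Random(seed)  # e.g., seed derived from a student ID
    a, b = rng.randint(2, 9), rng.randint(10, 99)
    return f"What does {b} % {a} evaluate to in C?", b % a

prompt, answer = make_question(seed=1234)
print(prompt)
if int(input("Answer: ")) == answer:
    print("Correct -- topic mastered.")
else:
    print(f"Not quite; the answer is {answer}. Try another variant.")
```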




Instructors may instead assume students will use AI on any homework and embrace that situation, designing learning experiences that incorporate AI. New approaches to learning may be discovered; for example, access to a nearly unlimited supply of examples, customized to a course or even to a student, may lead to even better learning if harnessed appropriately, since humans learn well from worked examples [21].

• Less Critical Thinking

AI's ability to provide quick help might reduce critical thinking in students and prevent students from developing the capacity for productive "struggle," both of which may harm students in later studies or careers where AI cannot always help.

AI may also weaken students' respect for university education, as students question why they should learn to do things that AI can easily do. Workshop participants commonly invoked a calculator analogy: though calculators exist, it is still important to teach children to do arithmetic manually and mentally.

• Hallucination and Bias

AI is known to sometimes "hallucinate" [12,14], confidently giving wrong answers. The thoroughness of the wrong answer, and the authoritative confidence and clarity with which it is conveyed, can go well beyond the occasional incorrectness of a human tutor. Leading a student down a wrong path can instill misconceptions that cause fundamental problems later. It can also cause tremendous frustration among students, who might have to go back and redo an entire assignment, for example.

AI may sometimes provide biased answers as well, potentially providing a limited perspective on a topic, or worse providing a racist, sexist, or other problematic answer.

Pitfalls to Avoid

New technologies often have pitfalls that are hard to detect until problems become evident. The following are some key pitfalls that CS educators may wish to avoid before such problems arise.

• Inequitable Access

Underprivileged students may not have easy access to AI. As such, overreliance on AI in courses might hurt underprivileged students. If AI is required in a course, instructors may wish to ensure all students have reasonable access.

AI can be expensive today when personalized to a course. As such, the gap in education quality between rich and poor schools may increase if richer schools are able to introduce AI tutors, AI auto-graders, and more, while poorer schools continue to use techniques that provide less help and less feedback.




Many digital technologies evolved without consideration of accessibility for students with disabilities, such as vision-impaired or hearing-impaired students. Retrofitting such technologies to support screen readers, or to include captioning, can be a time-consuming and difficult process, and may lead to inferior student experiences. Creators of AI technology would do well to consider accessibility from the start.

• Isolation

Students already suffer from isolation, which can reduce their success [2]. In fact, loneliness is today considered a health crisis of historic proportions [8], fueled by isolation during the pandemic, increased use of social media instead of in-person human interactions, more recreation done via online sources like streaming and gaming, etc. AI may further reduce the need to interact with classmates and teachers, and thus can further increase isolation, leading to poorer performance, more depression and loneliness, and dropout. A recent study highlighted the isolation problem when AI is used [4].

• False Accusations

Tools that claim to detect AI-generated code or essays can be quite inaccurate. Compared with traditional automated cheat-detection tools that seek plagiarism, namely similarity between a student's solutions and those of classmates or online solutions, AI detection is far less accurate. AI-detection tools often suggest that an item is AI-generated when no AI was used; some detectors say the U.S. Constitution was AI-generated [7]. Even today, usage of AI-detection tools is leading to numerous false accusations with devastating effects on students [3]. Such tools should be used with great care, perhaps to help start an investigation, but not as evidence in themselves.

• Rising Costs

History offers many examples of companies providing new technologies for free, then beginning to charge once the technology has become entrenched and switching away is difficult. Instructors may wish to note that companies providing AI for free or at low cost today might later introduce fees or raise prices, creating problems for students and universities.

• Non-Diverse Perspectives

AI that adapts to the student may pigeonhole the student, leading to a limited or biased learning experience. AI may also lead to a more "homogenous" learning experience that loses the benefits of the diversity that comes from different classmates, tutors, and teachers.

Summary

A famous saying warns that predictions are hard, especially about the future. AI has captured the attention of CS educators, with predictions varying drastically. As with many new technologies, it is hard to know what direction AI will take. Few in the late 1990s predicted that cell phones would lead to video recording/watching/rating becoming such a central part of young people's lives, for example. With this understanding of the limitations of predicting the directions of new technologies, a group of CS educators and others sought to highlight some key opportunities, challenges, and pitfalls to avoid, at least from their current vantage point. Those highlights are summarized in Figure 6. CS educators continue to discuss these issues in various forums, as in [10]. Overall, the group was excited about the potential for AI to have a net positive impact on CS teaching and learning.

Acknowledgments

The authors gratefully acknowledge the National Science Foundation for supporting the workshop via NSF Award 32332345.

References

1. Adiguzel, T., Kaya, M.H. and Cansu, F.K., Revolutionizing education with AI: Exploring the transformative potential of ChatGPT. Contemporary Educational Technology, 15(3), 2023, ep429.

2. Aulck, L., Velagapudi, N., Blumenstock, J. and West, J., 2016. Predicting student dropout in higher education. arXiv preprint arXiv:1606.06364.

3. College Student Advocates criticizes over-reliance on new AI-detection tools. College Student Advocates, https://www.collegestudentadvocates.org/post/college-student-advocates-criticizes-over-reliance-on-new-ai-detection-tools; accessed 2024 Aug 1.

4. Crawford, J., Allen, K.A., Pani, B. and Cowling, M., When artificial intelligence substitutes humans in higher education: the cost of loneliness, student success, and retention. Studies in Higher Education, 2024, pp. 1–15.

5. Denny, P., Leinonen, J., Prather, J., Luxton-Reilly, A., Amarouche, T., Becker, B.A. and Reeves, B.N., Prompt Problems: A new programming exercise for the generative AI era. In Proceedings of the 55th ACM Technical Symposium on Computer Science Education, March 2024, v. 1, pp. 296–302.

6. Fowler, M., Chen, B., Azad, S., West, M. and Zilles, C., Autograding "Explain in Plain English" questions using NLP. In Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, March 2021, pp. 1163–1169.

7. Fowler, G. What to do when you're accused of AI cheating. Washington Post, Aug 14, 2023, https://www.washingtonpost.com/technology/2023/08/14/prove-false-positive-ai-detection-turnitin-gptzero/; accessed 2024 Aug 1.

8. U.S. Surgeon General, Our Epidemic of Loneliness and Isolation: The U.S. Surgeon General's Advisory on the Healing Effects of Social Connection and Community, 2023.

9. Gradescope, AI-assisted answer groups. https://help.gradescope.com/article/mv8qkiux00-instructor-assignment-ai-grading-answer-groups; accessed 2023 Dec.

10. Hazzan, O. and Erez, Y., Generative AI in Computer Science Education. In Proceedings of the 55th ACM Technical Symposium on Computer Science Education 2024; https://doi.org/10.1145/3626253.363340.

11. Irons, A. and Elkington, S., Enhancing Learning through Formative Assessment and Feedback. Routledge, 2021.

12. Leiser, F., Eckhardt, S., Knaeble, M., Mädche, A., Schwabe, G. and Sunyaev, A., From ChatGPT to FactGPT: A Participatory Design Study to Mitigate the Effects of Large Language Model Hallucinations on Users. Proceedings of Mensch und Computer, 2023.

13. Li, R., Che Hassan, N. and Saharuddin, N., College Student's Academic Help-Seeking Behavior: A Systematic Literature Review. Behavioral Sciences, 2023, 13 (8), p.637.

14. Li, T.W., Hsu, S., Fowler, M., Zhang, Z., Zilles, C. and Karahalios, K., 2023. Am I Wrong, or Is the Autograder Wrong? Effects of AI Grading Mistakes on Learning. https://doi.org/10.1145/3568813.3600124.

15. Messer, M., Brown, N.C., Kölling, M. and Shi, M., Machine Learning-Based Automated Grading and Feedback Tools for Programming: A Meta-Analysis. In Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education, June 2023, v. 1, pp. 491–497.

16. Muller, D.A., Bewes, J., Sharma, M.D. and Reimann, P., Saying the wrong thing: Improving learning with multimedia by including misconceptions. Journal of Computer Assisted Learning, 24(2), 2008, pp.144–155.

17. Porter, L., Learn AI-Assisted Python Programming: With GitHub Copilot and ChatGPT. Simon and Schuster, 2024.

18. Price, T.W., Liu, Z., Cateté, V. and Barnes, T., Factors influencing students' help-seeking behavior while programming with human and computer tutors. In Proceedings of the 2017 ACM Conference on International Computing Education Research, August 2017, pp. 127–135.

19. Sadler, D.R., Formative assessment and the design of instructional systems. Instructional science, 1989, 18, pp.119–144.

20. Smith IV, D.H., Emeka, C., Fowler, M., West, M. and Zilles, C., Investigating the Effects of Testing Frequency on Programming Performance and Students' Behavior. In Proceedings of the 54th ACM Technical Symposium on Computer Science Education, March 2023, v. 1, pp. 757–763.

21. Sweller, J. and Cooper, G.A., The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2(1), 1985, pp. 59–89.

22. Vahid, F., Areizaga, L., Pang, A. ChatGPT and Cheat Detection in CS1 Using a Program Autograding System. Whitepaper, 2023, https://www.zybooks.com/research-items/chatgpt-and-cheat-detection-in-cs1-using-a-program-autograding-system/; accessed 2024 Aug 1.

23. Zilles, C.B., West, M., Herman, G.L. and Bretl, T., Every University Should Have a Computer-Based Testing Facility. In CSEDU, v. 1, May 2019, pp. 414–420.

Author

Frank Vahid
Dept. of Computer Science and Engineering
University of California, Riverside
[email protected]

Figures

Figure 1. Example of AI acting as a "tutor," helping a student debug code. See response at https://chat.openai.com.

Figure 2. Example of AI acting as a "tutor," helping a student understand a programming assignment.

Figure 3. Example of AI auto-grading a program according to specific guidelines.

Figure 4. Example of AI providing an instructor with class-level analyses of a set of programs.

Figure 5. Example of AI providing personalized instruction.

Figure 6. Summary of opportunities, challenges, and pitfalls to avoid, related to AI in CS education.

Copyright held by owner/author(s). Publication rights licensed to ACM.
