Judging by the media (which I do not consider a trusted, reliable source), there is a panic on campuses about students' cheating by using AI. But this problem strikes me as trivially easy to solve. Frankly, I suspect that what students learn about AI by cheating is probably more valuable than what the professors were trying to teach them, anyway.
If I were a professor, I would care about the student’s problem-solving process. Let us use as an example freshman economics, a subject that combines technical mathematical reasoning with critical thinking skills.
Suppose that I could interview the student, and suppose that I do this remotely. I do not want the student using AI to cheat (in some situations, I might allow or even encourage the student to use AI during the interview, but let us set that aside for now). So the student is in a setting where a monitor is paid to ensure that the student is not using an AI or cheating in some other way. The monitor can be paid quite modestly, and the monitor can watch more than one student at a time. You do not need a highly paid professor to be the monitor.
During the interview, I want to make sure that the student understands key concepts and uses them correctly. Concepts like comparative advantage, opportunity cost, returns to scale, production possibility frontier, and supply and demand. I might give some typical problems to analyze. For problems involving supply and demand, the student would have to show understanding of the difference between shifting a curve and moving along a curve.
The interview assessment is the opposite of a multiple-choice exam. The student cannot give answers that are correct only by accident. The student cannot make a careless mistake that signals a lack of understanding of a concept when the student actually knows it well.
The interview is an open-ended conversation. The student can self-correct. I can probe further. At the end of the interview, I am in a good position to know what the student got out of the course and what the student missed.
In the Swarthmore Honors Program, outside examiners are used. It is not the professor doing the assessment. It is an examiner from another college or from industry. And the examination includes an in-person interview. Note that each Honors seminar consists of 6-8 students, so that the burden on the examiner is not so great.
Let an AI Conduct the Interview
In my vision, which is part of my seminar project, the professor feeds into the AI the course syllabus, the key concepts that a student is expected to master, the background materials used in the course, and the desired assessment methods and metrics. With that information, the AI will be able to conduct the interview and make the assessment.
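As a rough illustration of what "feeding into the AI" might look like, here is a minimal sketch in Python that packages the professor's inputs into a system prompt for an interviewer model. Every name here (`CourseSpec`, `build_interviewer_prompt`, the field names) is hypothetical, not a real API; the actual model call is left out.

```python
from dataclasses import dataclass


@dataclass
class CourseSpec:
    """The materials a professor supplies to the AI interviewer.
    All field names are illustrative, not part of any real system."""
    syllabus: str
    key_concepts: list          # concepts the student must master
    background_materials: list  # readings used in the course
    assessment_rubric: str      # desired assessment methods and metrics


def build_interviewer_prompt(spec: CourseSpec) -> str:
    """Assemble a system prompt instructing a model to conduct an
    open-ended oral interview and score it against the rubric."""
    concepts = "\n".join(f"- {c}" for c in spec.key_concepts)
    readings = "\n".join(f"- {r}" for r in spec.background_materials)
    return (
        "You are an oral examiner for the following course.\n\n"
        f"Syllabus:\n{spec.syllabus}\n\n"
        f"Key concepts the student must demonstrate:\n{concepts}\n\n"
        f"Background readings:\n{readings}\n\n"
        f"Assessment rubric:\n{spec.assessment_rubric}\n\n"
        "Ask open-ended questions, probe the student's reasoning with "
        "follow-ups, allow self-correction, and score against the rubric."
    )
```

For the freshman economics example above, `key_concepts` would include entries like "comparative advantage" and "opportunity cost", and the rubric would specify, for instance, that the student must distinguish shifting a curve from moving along it.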
One advantage of this approach is that it scales well. A professor can teach hundreds of students and not have to hand off the grading task to teaching assistants. Because the assessments are all done by a single entity, they will be consistent across students.
But the big advantage is that the assessment can be thorough. An interview can assess the student’s reasoning about course concepts and ability to apply them (also known as knowledge transfer).
I do not believe that anyone can design an assessment process that is perfect. But I would wager that an assessment conducted by an AI using an interview, with a human monitor on site to deter cheating, could outperform the vast majority of assessments currently being used.
I scaled a mini-version of this with AI for my course:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4975804
So, to clarify, your vision requires 1) oral exams (presumably more than one over the course of the semester) with individual students, using a paid monitor looking over their shoulder for 15 to 20 minutes? OR 2) designing an AI capable of quizzing a student orally? Number 1 is completely impractical for the 99% of colleges that are not Swarthmore...how long would it take to individually interview a typical class of 40+ students multiple times in multiple courses? Number 2 overstates the ability of current models and, more importantly, understates the cleverness and willingness of most students to game the system. They will simply open a new tab with a different AI program to answer the questions. There is no logistical or technical approach to accurately assessing students in the age of AI short of having them write with pen and paper in the classroom. Looking at the problem honestly, it is far from trivial to solve, and I would say it is closer to a full-blown crisis of meaning in education.