AI-generated writing: cheating and its detection

I saw this going viral: How University Students Use Claude. As is to be expected, the main issue is cheating.

As I discuss here, AI-generated essays should be easy to detect if the teacher is paying attention and takes a more active role. Other ways to mitigate it include requiring in-class writing samples to assess baseline ability, weighting in-class participation, or giving pop quizzes. A zero-tolerance policy can also serve as a deterrent.

Part of the problem, I think, is that most teachers don't care that much. They may suspect cheating but do not investigate further, for fear of retaliation or of falsely accusing someone, especially if a smoking gun, such as an obviously lifted passage, is absent.

There are other telltale signs of AI-generated writing:

A significant disconnect between assessed in-class competence and participation and the quality of submitted work. In large classes this may be harder to detect. An obvious way to bypass this check is to lower the grade level of the generated text: if the objective is merely to pass the course, C-quality essays are good enough and less likely to be detected.

Writing that sounds unusually formal or overuses the passive voice; I believe overuse of the passive voice, not em-dashes, is the main giveaway of AI-generated text. Internal inconsistencies are another sign, such as contradicting something written earlier, or even the literal prompt left in the text. There are possibly subtler clues, such as differences in Unicode characters: copying and pasting AI-generated text from a chat textbox into a separate document may leave artifacts, especially in quotation marks or other punctuation.
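
To make the Unicode point concrete, here is a minimal sketch that counts typographic characters which often survive a paste from a chat textbox. The character list and the filename are illustrative assumptions, not an exhaustive or calibrated set:

```python
# Rough sketch, not a detector: count typographic Unicode characters
# (curly quotes, dashes, invisible spaces) that a student typing in a
# plain editor is unlikely to produce by hand.
SUSPECT_CHARS = {
    "\u2018": "left single quote",
    "\u2019": "right single quote",
    "\u201c": "left double quote",
    "\u201d": "right double quote",
    "\u2013": "en dash",
    "\u2014": "em dash",
    "\u00a0": "non-breaking space",
    "\u200b": "zero-width space",
}

def unicode_artifacts(text: str) -> dict[str, int]:
    """Count occurrences of each suspect character in the text."""
    counts: dict[str, int] = {}
    for ch in text:
        if ch in SUSPECT_CHARS:
            name = SUSPECT_CHARS[ch]
            counts[name] = counts.get(name, 0) + 1
    return counts

# "submission.txt" is a placeholder for the student's submitted file.
with open("submission.txt", encoding="utf-8") as f:
    report = unicode_artifacts(f.read())

for name, n in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {n}")
```

None of these characters is proof of anything on its own (word processors insert smart quotes too), so a script like this only flags passages worth a closer look.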

Writing that is almost entirely grammatically correct and free of other errors, such as typos or malapropisms, which are common in human writing, especially as assignment length increases. How many people are careful or astute enough to avoid any typos in a long paper? (One way to avoid detection is to have the AI add mistakes, such as misspelling commonly misspelled words.) Human writing has various quirks and idiosyncrasies that AI-generated writing may lack.
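
As a rough illustration of the error-density idea, a script could measure how many words fall outside a standard word list and flag long papers that come back implausibly clean. The word-list file, the length cutoff, and the rate threshold below are assumptions made for the sake of the sketch, not calibrated values:

```python
# Sketch: flag long submissions with an implausibly low rate of unknown
# words. Assumes a one-word-per-line dictionary saved as words.txt
# (e.g. a copy of /usr/share/dict/words); cutoffs are illustrative.
import re

def unknown_word_rate(text: str, vocab: set[str]) -> tuple[int, float]:
    """Return (total word count, fraction of words not in the vocabulary)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0, 0.0
    unknown = sum(1 for w in words if w not in vocab)
    return len(words), unknown / len(words)

with open("words.txt", encoding="utf-8") as f:
    vocab = {line.strip().lower() for line in f}

with open("submission.txt", encoding="utf-8") as f:
    total, rate = unknown_word_rate(f.read(), vocab)

if total > 1500 and rate < 0.002:
    print(f"Suspiciously clean: {rate:.2%} unknown words across {total} words")
```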

Overall, if the instructor is paying even a minimal amount of attention, I believe it should be fairly easy to detect AI-generated writing, especially if the instructor already has some baseline assessment of each student's competence. Sometimes the writing just feels 'off', even if it's hard to pin down exactly why. Sure, there is survivorship bias (you only see those who get caught), but in college I didn't have the audacity to try such a stunt (this was well before LLMs, when pre-written papers or ghostwriters were used), as I knew there was a decent likelihood it would be noticed, and sure enough, others in my class were caught.