Louise Stivers has always had her homework checked by software.
A 21-year-old political science major from Santa Barbara about to graduate from the University of California, Davis, with plans to attend law school, Stivers grew up in an age when students are expected to submit written assignments through anti-plagiarism tools such as Turnitin. Educators rely on these services to evaluate text and flag passages that appear to be copied from existing sources, and she never ran afoul of one. But these days, school administrations are also on the lookout for assignments completed with generative AI. That’s how Stivers wound up in trouble, even though she hadn’t cheated.
“I was, like, freaking out,” she tells Rolling Stone.
Hours after she uploaded a paper for one of her classes in April (it was a brief summarizing a Supreme Court case), Stivers received an email from her professor indicating that a portion of it had been flagged by Turnitin as AI-written. Already, her case had been forwarded to the university’s Office of Student Support and Judicial Affairs, which handles discipline for academic misconduct. Stivers was baffled and stressed but immediately afterward had to take a quiz for a different class. Then she began gathering evidence that she’d written the brief herself.
“It was definitely very demotivating,” Stivers says of the accusation, which spiraled into a bureaucratic nightmare lasting more than two weeks as she sought to clear her name. She was expected to parse the school’s dense cheating policies and mount a formal defense of her work, with little institutional support and while facing the ordinary pressures of senior year. She calls this a “huge waste of time” that could’ve been “spent doing homework and studying for midterms.” Because of this split focus, she says, her grades began to slide.
Stivers is hardly alone in facing such an ordeal as students, teachers, and educational institutions grapple with the revolutionary power of artificially intelligent language bots that convincingly mimic human writing. Last month, a Texas professor mistakenly used ChatGPT itself to assess whether students had completed an assignment with that software. The chatbot claimed to have written every essay he fed it, so he temporarily withheld an entire class’s final grades.
In fact, Stivers learned she wasn’t even the first UC Davis student to contend with a false allegation of AI cheating. Days before she learned that she would be subjected to an academic integrity review, USA Today published an article about William Quarterman, a senior and history major at the college. A professor had run his exam answers through an AI detection tool called GPTZero, which returned a positive result. The professor gave Quarterman a failing grade and referred him to the same student affairs office that would adjudicate Stivers’ case. Stivers soon learned of her classmate’s plight. “His dad was able to help him,” she says, and the two in turn gave her “a lot of advice, and kind of explained how the school policy works about this, too.”
Even so, the stakes felt intimidatingly high. The initial “very long” email Stivers received explained when she would have a chance to tell her side of the story in a one-on-one Zoom conversation with an administrator, but was short on details about the review process or exactly what she needed to prepare. It also said she could have a lawyer present for the call, which alarmed her. By and large, she felt “in the dark” about what to do, not to mention confused as to the possible consequences of this evaluation.
“I was already very burned out from the past two quarters,” Stivers says. “And when you’re applying to law school, it’s a lot of pressure to keep up your GPA. Yeah, it’s just not fun to have to figure out the school’s complicated academic integrity policies while doing classes.” In fact, it became something like additional, extra-stressful homework.
When she did talk to the faculty moderator assessing her case, Stivers learned that Turnitin’s AI detection tool was actually brand new, and that UC Davis had secured “early access” to the software. Turnitin advertises this product as 98 percent accurate, but in its materials acknowledges “a small risk of false positives.” The company emphasizes that it is not responsible for determining misconduct. Instead, it aims to “provide data for educators to make an informed decision.”
In a statement to Rolling Stone, Annie Chechitelli, chief product officer at Turnitin, encouraged Stivers to get in touch with feedback about the software, saying that “this information is very helpful to us as we continue to refine and develop our detection capabilities.” Chechitelli added, “In all cases, faculty and teachers decide if what they see in the information presented to them warrants further attention.”
In the end, Stivers was able to prove that she wasn’t in violation of the university’s rules, having shared painstaking “step-by-step instructions on how to open Google Docs and review history” with the judicial department. Those time stamps demonstrated that she’d written the paper herself.
However, Stivers points out that the allegation of cheating is something she’ll have to self-report to law schools during the application process. State Bar associations, she says, are known to ask similar questions about academic history, meaning this misunderstanding could shadow her for years. Indeed, U.S. News & World Report advises law school and State Bar applicants to “err on the side of disclosure” and proactively report any “disciplinary procedures at their college” on the assumption that these can turn up in background checks. And, she says, the decision in her favor came down without an apology or acknowledgement of the mistake from her professor or the college itself.
In a statement to Rolling Stone, the UC Davis Office of Student Support and Judicial Affairs said that the Family Educational Rights and Privacy Act precludes them from commenting on individual student cases. However, the department confirmed recent updates to its policies to address the evolving issue of artificial intelligence, and it’s “planning a fall campaign to increase awareness about AI and academic conduct, encourage conversation among students and instructors, and provide a guide for students.” As for the new Turnitin AI detection tool, the office is “continuing to evaluate its utility while not relying on it or any other one method,” relying on “a variety of tools, along with our own analysis of the student’s work,” to reach decisions on misconduct.
“Obviously, people are going to use it, students are going to use it, professors are going to use it,” Stivers says of tools like ChatGPT, which she believes has made educators “paranoid” about AI cheating. “But I think they just need to be more careful with how they approach it.” Stivers says Turnitin’s plagiarism and AI detection clearly diverge in functionality, though professors may treat the outcomes as equally reliable. In her statement, Chechitelli noted that the AI detection tool delivers results with an “indicator score” showing the percentage of non-original text in a document, similar to what educators get from the company’s plagiarism detection tool. However, on the AI side, scores are “based on statistics rather than on a comparison against source documents.”
Vincent Conitzer, director of the Foundations of Cooperative AI Lab at Carnegie Mellon University and head of technical AI engagement at the University of Oxford’s Institute for Ethics in AI, breaks down this crucial difference.
“For plagiarism, there are automated tools to detect it, and there can be gray areas where it’s not entirely clear whether the student really cheated,” he says. “But in that case, it is easy for instructors and other university staff to assess the evidence directly” by matching a student’s language to the source they copied from. In contrast, he says, “if a tool simply claims that some fragment of text is AI-generated, but without any evidence that is interpretable by instructors or university staff, they would have to have a very high degree of confidence in the tool itself to accuse the student.” He questions how much faith to put in figures like Turnitin’s “98 percent” accuracy rate, especially given the gap between a company’s internal testing and real-world use in the classroom.
“And real life can be messier still,” Conitzer says, describing scenarios where students “write the initial draft themselves but then ask ChatGPT to find ways to improve the writing, or conversely ask ChatGPT to write a first draft and then rewrite it themselves, or some combination of the two.” Such inevitabilities, in his opinion, illustrate the need for clear and enforceable academic policies. “Generally, this is likely to continue to be an arms race,” he concludes.
And, as professors and schools get increasingly serious about cracking down on AI cheats (while perhaps overestimating the efficacy of detection software), more students like Stivers will be caught in the crossfire, with their academic lives upended. Not even the fact of her own innocence was enough to put her mind at ease while she awaited a disciplinary ruling.
“I knew I didn’t cheat,” Stivers says. “But at the same time was like, ‘Well, I don’t know 100 percent if they’re actually going to believe me.'”
Update, June 7, 12:50 pm: This story has been updated to include comment from the UC Davis Office of Student Support and Judicial Affairs.