Generative AI (Gen AI) and accompanying Large Language Models (LLMs) have made one reality difficult to ignore: students can produce answers and polished prose in seconds.

With advanced prompts, a student can include details that appear genuine at face value but are thoughtless words submitted to circumvent the critical thinking process. With nothing more than a smartphone camera, students can almost instantly get Gen AI answers during a lecture hall exam.

Universities have responded in different ways. Some have tightened academic integrity policies. Some have introduced disclosure requirements. Others have their heads in the sand.

Faculty are increasingly relying on faulty AI detection software in hopes of distinguishing genuine human writing from the output of ChatGPT, Grok, Gemini, Claude, Copilot, and the like.

Faculty, governance groups, and university leadership must answer three fundamental questions: (1) What are the acceptable uses of Gen AI/LLMs? (2) Is there an effective way to catch students cheating? (3) Does current methodology assign work that forces students to think?

What are the acceptable uses of Gen AI/LLMs?

As an administrator, I would defer this question to my colleagues in the classroom.

However, I encourage consideration of the value of Generative AI and LLMs as tools that can enhance efficiency and support learning, much like calculators and search engines shifted education by reducing routine tasks and allowing greater focus on analysis, problem-solving, and deeper thinking.

As an example, my knowledge of Excel rivals that of my beloved labradoodle, Bella Grace. While she paws at the spreadsheet in hopes of inputting a formula that makes sense, I can write a simple command in any Gen AI, and boom, done. What would have taken me hours a couple of years ago now takes a minute, leaving me time to focus on what goes into the spreadsheet rather than on building it.

Faculty must wrestle with the differences in opinion on the topic. Can I input original work into ChatGPT for a touch-up? No? But Microsoft Word does it automatically, right? Red underlines flag misspellings and blue underlines flag grammar. That’s not controversial, is it? Why is that different? Can I use Grammarly, a source trusted by institutions like Arizona State? No? Why not?

While some answer quickly, we cannot ignore the significant nuances that must be applied by each discipline. Higher education is the place where ideas are debated, and debated, and further debated. Decisions, however, arrive at a pace comparable to Thomas Jefferson’s death. While the topic must be fully explored, initial decisions need to be in place before another academic year passes us by.

Is there an effective way to identify students cheating with Gen AI?

The information available suggests AI detection tools remain far too unreliable for high-stakes academic decisions. Let me say that again for those in the cheap seats: AI detection is as reliable as a Bernie Madoff investment.

Sloan Technology Services at MIT states “AI detection software is far from foolproof. In fact, it has high error rates and can lead instructors to falsely accuse students of misconduct.” OpenAI, the company behind ChatGPT, shuttered its own AI detection software because of its poor performance.

Earlier this year, in what may become a landmark case, Orion Newby sued Adelphi University, and won, after being accused of using AI on an assignment. According to court records, Turnitin’s AI detector marked Newby’s paper as written entirely by AI, while Newby contended that two other detectors marked the paper as written by a human.

While some students generate content wholesale using Gen AI/LLMs, writing now falls along a spectrum from self-authored to AI-assisted, spanning brainstorming, drafting, editing, and revision. Emma Whitford writes, “according to a 2025 Inside Higher Ed survey of more than 1,000 students, 85 percent had used generative AI to complete coursework. More than half said they used it for brainstorming ideas, 44 percent used it to edit or check their work, a quarter used AI to complete assignments or coding work, and 19 percent used it to write free responses or essays.”

In looking at AI detectors specifically, research indicates they’re far from reliable. A 2026 peer-reviewed study in the International Journal for Educational Integrity evaluated commercial detectors. Gen AI detector accuracy ranged from approximately 61% to 69%, with performance declining as pieces grew longer and more specific.

The findings expose a structural limitation: Gen AI detectors estimate writing patterns and cannot yet establish authentic authorship.

While some faculty describe detector results as indicators rather than proof, others place disproportionate weight on the scores. In my experience, a flagged percentage alters faculty judgment before a discussion with the student occurs. This creates an asymmetrical risk. A false negative may allow AI use to go undetected, while a false positive can trigger academic misconduct procedures against a student who completed the work independently.
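To see how lopsided that risk can get, consider a back-of-the-envelope sketch. Every number in it is an assumption for illustration, not a finding: a hypothetical class of 200 students, 19 percent of whom had AI write their essays (borrowing the survey figure above), and a detector that is right 65 percent of the time on AI-written and human-written work alike (roughly the midpoint of the 61% to 69% range above, although the study may not have measured accuracy this way). The function name and parameters are made up for this example.

```python
def flag_breakdown(n_students, ai_share, sensitivity, specificity):
    """Expected detector outcomes for a class (illustrative only).

    n_students  -- class size (hypothetical)
    ai_share    -- fraction of students who submitted AI-written work (assumed)
    sensitivity -- chance the detector flags AI-written work (assumed)
    specificity -- chance the detector clears independent work (assumed)
    Returns (true_flags, false_flags, precision).
    """
    ai_writers = n_students * ai_share
    independent = n_students - ai_writers

    true_flags = ai_writers * sensitivity           # AI work correctly flagged
    false_flags = independent * (1 - specificity)   # honest work wrongly flagged

    total_flags = true_flags + false_flags
    # Precision: of all flagged papers, the share that really were AI-written.
    precision = true_flags / total_flags if total_flags else 0.0
    return true_flags, false_flags, precision


# Hypothetical scenario described in the text above.
true_flags, false_flags, precision = flag_breakdown(200, 0.19, 0.65, 0.65)
print(f"Correct flags: {true_flags:.1f}")
print(f"False accusations: {false_flags:.1f}")
print(f"Share of flags that are correct: {precision:.0%}")
```

Under these assumed numbers, the detector would flag about 25 AI-written papers and about 57 papers written independently, so only about 30 percent of flagged students would have actually used AI. Different assumptions change the totals, but the pattern holds whenever most students do their own work and the detector is this error-prone.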

AI detection systems will continue to improve, but writing is variable, and writing for courses with specific language structures is, at least for now, the Achilles’ heel of Gen AI/LLM detection.

Research also raises equity concerns. The Center for Democracy & Technology warns that AI detection systems may disproportionately affect multilingual and English-language-learning populations. Kristin Woelfel states: “research indicates that so-called AI detectors are disproportionately likely to falsely flag the writing of non-native English speakers as AI-generated, putting them at greater risk for being disciplined for cheating in school. Schools need to be aware of this potential disparity and take steps to ensure it does not result in violating the civil rights of EL students.” 

Several universities are pulling back from AI detection: Vanderbilt, Michigan State, the University of Pittsburgh, and Northwestern are among those noted in an article by Kohrman Jackson & Krantz LLP.

Many other institutions have moved toward limiting AI detector use or emphasizing that detection results should not be treated as dispositive evidence.

Does our current methodology assign work that forces students to think?

Rather than attempting the impossible elimination of AI, universities must focus on transparent policies, and faculty must rethink assignments and assessments.

The academy must wrestle with the possibilities that lie in the Brave New World, version 2.0. What does that look like? Grading the process as opposed to the outcome, for one. For example, suppose the requirement is a 10-page paper on winemaking in the Chianti Classico region of Tuscany. A redesign could include a proposal, an annotated bibliography, a preliminary argument map, a first draft, a revised draft, and, lastly, the final paper. (And perhaps a paid visit to the region for first-hand verification.)

Simply put, AI can produce paragraphs, and sometimes very good ones. AI cannot demonstrate the intellectual development process students engage in when assignments are broken down.

Another example is to ask students to cite classroom discussions, or to document the sources they relied on and explain why. Yet another is to have students present and defend their work to the class or during office hours.

Is this ideal? Maybe, maybe not. Again, it’ll be argued longer than Jefferson’s ultimate demise. But like any other profession, higher education leaders must change with the times. And that change must happen now.

Christopher M. Piscitelli is the Associate Dean of Students at Southern Connecticut State University and Finance Chair of the Hamden Board of Education.