AI Detection for ESL Students: The False Positive Problem
This is something I feel strongly about, and I think it’s one of the most important equity issues in the AI detection conversation.
AI detectors disproportionately flag writing by non-native English speakers as AI-generated. This has been documented in multiple studies, and I’ve seen it repeatedly in my own classroom.
Why does this happen? AI detectors look for patterns associated with AI text, primarily low perplexity (predictable word choices) and consistent sentence structure. ESL writers often exhibit these same patterns for completely different reasons. When English isn’t your first language, you tend to rely on common, high-frequency vocabulary. You use sentence structures you’re most confident in. You may write in a more formulaic way because that’s how you were taught.
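To make "low perplexity" concrete, here is a minimal sketch of how perplexity can be computed. It is not how any particular detector works: commercial tools use large neural language models, while this example assumes a toy word-frequency (unigram) model with add-one smoothing. The reference corpus, the two sample sentences, and the function name `unigram_perplexity` are all hypothetical, chosen only to illustrate the idea that familiar, high-frequency words score as more "predictable."

```python
import math
from collections import Counter


def unigram_perplexity(text, counts, total, vocab_size):
    """Perplexity of `text` under an add-one-smoothed unigram model."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one token")
    log_prob_sum = 0.0
    for tok in tokens:
        # Add-one (Laplace) smoothing gives unseen words a nonzero probability.
        p = (counts.get(tok, 0) + 1) / (total + vocab_size)
        log_prob_sum += math.log(p)
    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob_sum / len(tokens))


# Hypothetical toy corpus standing in for a language model's training data.
reference = (
    "the student wrote the essay and the teacher read the essay "
    "the student is good and the essay is good "
    "the teacher said the essay is very good"
).split()
counts = Counter(reference)
total = len(reference)
vocab_size = len(counts) + 1  # +1 reserves probability mass for unseen words

# A sentence built from common words vs. one built from rarer words.
common = "the essay is good and the student is good"
varied = "her luminous prose meanders through idiosyncratic digressions"

ppl_common = unigram_perplexity(common, counts, total, vocab_size)
ppl_varied = unigram_perplexity(varied, counts, total, vocab_size)

print(f"common vocabulary: {ppl_common:.1f}")
print(f"varied vocabulary: {ppl_varied:.1f}")
```

The sentence built from common words gets the lower perplexity, even though a person wrote both. A detector that treats low perplexity as a sign of AI authorship would therefore lean toward flagging the first sentence, which is the pattern described above for writers who stick to vocabulary they know well.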
The result: legitimate student work gets flagged, and ESL students face a disproportionate burden of suspicion. This is not a minor issue. It can damage student confidence, create adversarial relationships between teachers and students, and reinforce feelings of not belonging.
What can schools do about it?
Raise the threshold for ESL students. If your policy triggers an investigation at a 30% AI detection score, consider raising that to 60% or higher for ESL students, or stop using detection scores as the primary indicator at all.
Rely more heavily on process-based evidence. Writing portfolios that show a student’s development over time are far more reliable than any single detection scan.
Educate teachers about this bias. Many teachers don’t know that AI detectors have this documented weakness. Professional development should cover it explicitly.
Advocate for better tools. Push tool makers to address this bias. If enough educators demand it, companies are far more likely to invest in solutions.
Consider alternatives to AI detection entirely. For ESL students, oral assessments, presentations, and in-class writing may be fairer evaluation methods.
How does your school handle AI detection for ESL students?
5 Replies
lol my own writing got flagged at 38%. i've been teaching for 12 years. these tools need work
I presented findings similar to these at our board's professional development day last month. The administration was genuinely surprised by the false positive rates. We're now revising our academic integrity policy to require corroborating evidence beyond detection scores alone.
Can you share more about how you set this up? I want to try something similar with my classes next term!
This is EXACTLY what I needed. I've been trying to explain to my department head why we can't just rely on Turnitin scores and now I have actual data to back it up. Sharing this with my whole team tomorrow!
I'd be interested to see your data on this. Would you be willing to share at a future PD session? I think our department would benefit from hearing about your approach.