Do AI Detectors Flag ESL Students More? 8 Months of Tracking

16

I teach ESL and ELL students at UBC and I’ve been systematically logging something that I think needs to be part of the policy conversation.

Over the past two terms, I recorded every case where a detection tool flagged a submission as likely AI-generated, then followed up directly with the student. My numbers: ESL students were flagged at roughly 2.3x the rate of domestic English-speaking students, even when the follow-up conversation clearly indicated the work was their own.

The pattern is consistent: formal academic syntax, precise but slightly unusual word choices, consistent sentence structure. These are markers of a strong second-language writer drilling academic English. They’re also markers of AI-generated text. The tools can’t distinguish between them – and why would they, when fluency and consistency are exactly what AI produces?

Has anyone else been tracking this? I want to bring this to our department with actual data, not just anecdote. If others are seeing similar ratios, this becomes a much stronger case for policy-level discussion.
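
For anyone who wants to compare ratios, here is a rough sketch of how a figure like "2.3x" can be computed from a simple tally, with an approximate 95% interval so small samples don't look more certain than they are. The function name and the counts are placeholders made up for illustration, not my actual log, and the interval uses the standard log method for a ratio of two proportions, which is only an approximation when counts are small.

```python
import math

def flag_rate_ratio(flagged_a, total_a, flagged_b, total_b, z=1.96):
    """Return (ratio, ci_low, ci_high) for group A's flag rate relative to group B's.

    flagged_* = submissions flagged by the tool but judged to be the
    student's own work after follow-up; total_* = all submissions checked.
    """
    rate_a = flagged_a / total_a
    rate_b = flagged_b / total_b
    ratio = rate_a / rate_b
    # Standard error of log(ratio), log method for a ratio of two proportions
    se = math.sqrt(1 / flagged_a - 1 / total_a + 1 / flagged_b - 1 / total_b)
    ci_low = math.exp(math.log(ratio) - z * se)
    ci_high = math.exp(math.log(ratio) + z * se)
    return ratio, ci_low, ci_high

# Hypothetical counts, for illustration only
ratio, ci_low, ci_high = flag_rate_ratio(23, 100, 10, 100)
print(f"rate ratio: {ratio:.2f} (approx. 95% CI {ci_low:.2f} to {ci_high:.2f})")
```

If the lower end of the interval sits close to 1, the gap could still be noise, which is one more reason pooling counts across classrooms would make the case stronger.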

4 Replies

7

yes, this is real. I see the same pattern in French immersion. students who've been trained to write formal academic French produce writing that scores high on AI detectors even when it's 100% their own. the formality and consistency markers are identical. I've stopped treating detection scores as evidence - they're a conversation starter at best.

15

Holly, this tracks with something that nearly became a formal complaint at our school last year. An international student from South Korea was flagged on a history essay. Her English was genuinely excellent - four years in Canada, extensive tutoring, drilled academic writing formats. Detection score came back at 81%.

Her parents were furious. Her father mentioned legal options. We backed off entirely because we couldn't actually prove anything, and we shouldn't have flagged her in the first place. Our department stopped using detection as evidence after that. We use it to prompt a conversation, nothing more.

The false positive risk is not theoretical for us anymore. It's institutional reputation, it's parent relations, it's potential legal exposure. Departments need to be very careful.

4

ok but this is exactly why equity HAS to be part of the detection conversation. our school can barely afford a working printer but we're making high-stakes decisions based on algorithms that nobody fully understands. who gets flagged matters. it's not a neutral tool.

3

first-year French immersion teacher here and this happened to me last month. flagged a student I genuinely knew wasn't cheating - she'd written the essay in front of me during a work period, I'd seen her process. detection score said 76% AI. I panicked and almost brought it to admin. glad I didn't. the tool was just wrong.