We’ve all felt the creeping suspicion that something we’re reading was written by a large language model — but it’s surprisingly difficult to prove. For a few months last year, everyone was convinced that certain words like “delve” or “underscore” gave the models away, but the evidence was always slim, and as the models have grown more sophisticated, the telltale words have become harder to spot.
As it turns out, though, the folks at Wikipedia have gotten pretty good at flagging AI-written prose — and the project’s public guide to the signs of AI writing is the best resource I’ve found for checking whether your suspicions are justified. (Credit to the poet Jameson Fitzpatrick, who surfaced the document on X.)
Since 2023, Wikipedia editors have been working to get a handle on AI submissions, an effort they call WikiProject AI Cleanup. With millions of edits coming in every day, there’s plenty of material to draw on, and in classic Wikipedia fashion, the editors have produced a hands-on guide that’s detailed and heavy on examples.
First, the guide confirms what we already knew: automated detection tools are basically useless. Instead, it focuses on habits and turns of phrase that are rare on Wikipedia but common on the internet at large (and thus common in the models’ training data). According to the guide, AI submissions spend a lot of time stressing why a topic matters, usually in generic terms like “a pivotal moment” or “a broader movement.” AI models also tend to pad out minor media mentions to make a subject seem noteworthy — the kind of thing you’d expect from a personal CV, not from an independent source.
The guide points out a particularly interesting quirk: trailing clauses that make vague assertions of importance. Models will append that some fact or detail “highlights the importance” of some element or other, or “reflects the continuing relevance” of some general idea. (Grammar nerds will recognize the participial versions — “highlighting,” “reflecting” — tacked onto the ends of sentences.) It’s a little hard to spot at first, but once you learn to recognize it, you’ll see it everywhere.
There’s also a tendency toward vague marketing language, which is extremely common on the internet. The scenery is always picturesque, the views are breathtaking, and everything is clean and modern. As the editors put it, such prose “sounds more like a transcript of a TV commercial.”
The guide is worth reading in its entirety, and I came away very impressed. Going in, I’d have said LLM prose evolves too quickly to pin down. But the habits highlighted here are deeply embedded in how AI models are trained and developed. They can be disguised, but they’ll be difficult to remove completely. And if the general public gets better at identifying AI prose, that could have all kinds of interesting consequences.
