What I doubt most about this shift of "forget writing code or reviewing it, you shouldn't even look at it" (their tagline was "review demos, not diffs") is that it ignores scope drift. I use agentic tools all day and I can tell you I would absolutely not trust an agent to run for hours without supervision, because over the course of HOURS (even with a fully detailed structured plan, .md files, and loaded preferences) the agent will very likely have drifted substantially from your initial request.
The biggest attestation to this is: When Claude is done working on something for you and you haven't defined the next steps - ask it what you should do next. See if it at all aligns with what you actually wanted to do.
Good report, and a very important thing to measure. I was thinking of doing something similar after Claude kept overriding my .md files to recommend tools I've never used before.
The vercel dominance is one I don't understand. It isn't reflected in vercel's share of the deployment market, nor is it likely to be overwhelmingly prevalent in online discourse or recommendations (possible training data). I'm going to guess it's the bias of most generated projects being JS/TS (particularly Next.js), and the model can't help but recommend the makers of Next.js in that case.
Yeah exactly, it's best to keep track and be aware of common tropes used in AI writing so that you don't end up 5 responses deep and emotionally invested in a conversation before you realise you've been fooled into speaking to a bot.
I built this tool primarily to identify AI writing in articles and posts but it's proven useful for comments/responses too: https://tropes.fyi/vetter
This is interesting because it is largely a set of good writing advice for people in general, and AI likely writes like this because these patterns are common.
Not least because a lot of these things are things that novice writers will have had drummed into them. E.g. clearly signposting a conclusion is not uncommon advice.
Not because it isn't hamfisted, but because they're not yet good enough writers that the link's advice ("Competent writing doesn't need to tell you it's concluding. The reader can feel it") applies, and a hamfisted signpost is better than the conclusion not being clear to the reader at all. And for more formal writing, people will be told to signpost it even more explicitly with headings.
The post says "AI signals its structural moves because it's following a template, not writing organically." But guess what? So do most human writers. Sometimes far more directly and explicitly than an AI.
To be clear, I don't think the advice is bad given to a sufficiently strong model - e.g. Opus is definitely capable of taking on writing rules with some coaxing (and a review pass), but I could imagine my teachers at school presenting this - stripped of the AI references - to get us to write better.
If anything, I suspect AI writes like this because it gets rewarded in RLHF because it reads like good writing to a lot of people on the surface.
EDIT: Funnily enough, https://tropes.fyi/vetter thinks the above is AI assisted. It absolutely is not. No AI has gone near this comment. That says it all about the trouble with these detectors.
These patterns overlap with formal writing advice because AI was trained overwhelmingly on academic papers, journals and professional writing so it inherited this style.
I completely understand - and do not intend to disparage - the use of these tropes. With the vetter and aidr tools I try to focus more on frequency analysis. I've tried to minimise false positives by tuning detection thresholds to match density rather than individual occurrences e.g. "it's not X, it's Y" is fine but 3x in one paragraph and suspicions flare.
But other tropes like lack of specificity and ESPECIALLY AIs tendency to converge to the mean (less risk, less emotion, FALSE vulnerability) are blatantly anti-human imo.
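The density idea above can be sketched in a few lines. This is a minimal illustration, not the actual vetter implementation (whose trope list and thresholds I don't know); the patterns and the threshold of 3 are assumptions for the example:

```python
import re

# Hypothetical trope patterns; a real detector would track many more.
TROPES = {
    "not_x_but_y": re.compile(r"\bnot (just )?\w+[^.;]{0,40}?\bbut\b", re.IGNORECASE),
    "its_not_x_its_y": re.compile(r"\bit'?s not \w+[^.;]{0,40}?\bit'?s\b", re.IGNORECASE),
}

def flag_paragraph(paragraph, threshold=3):
    """Flag on trope *density*, not individual occurrences, so a single
    organic use of a construction doesn't trigger a false positive."""
    hits = sum(len(p.findall(paragraph)) for p in TROPES.values())
    return hits >= threshold

para = ("It's not magic, it's statistics. It's not creativity, it's compression. "
        "It's not a writer, it's a mirror.")
print(flag_paragraph(para))  # three occurrences in one paragraph -> True
```

One "it's not X, it's Y" sails through; three in a paragraph trips the threshold, which matches the frequency-analysis approach described above.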
I'd argue most of them overlap less with academic writing advice than with high school level writing advice. Most people don't transcend that because they have no need to, and it's where most people learn to write essays.
These tropes emerge from the distribution of the LLM itself and from my experimentation it's actually very difficult to get an LLM to change its language. Especially when you consider they've been RLHFed to the max to speak the way they do.
Changing the style is easy: Just feed it a writing sample, and tell it to review its own writing against the style of the writing sample.
That won't entirely weed out these tropes, but it will massively change the style.
Then add a few specific rules and make it review its writing, instead of expecting it to get it right while writing.
To weed out the tropes is largely a question of enforcing good writing through rules.
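That draft-critique-rewrite loop can be sketched roughly as below. `call_model` is a placeholder for whatever LLM client you actually use, and the two rules are just examples:

```python
RULES = [
    "No 'it's not X, it's Y' constructions.",
    "Don't announce the conclusion; just conclude.",
]

def call_model(prompt: str) -> str:
    # Placeholder: wire up your actual LLM client here.
    raise NotImplementedError

def write_with_review(task: str, style_sample: str, model=call_model) -> str:
    """Draft, critique against a style sample and explicit rules, then
    rewrite: a separate review pass rather than one-shot drafting."""
    draft = model("Write the following:\n" + task)
    rules = "\n- ".join(RULES)
    critique = model(
        "Review this draft against the style sample and rules.\n"
        "Rules:\n- " + rules +
        "\n\nStyle sample:\n" + style_sample +
        "\n\nDraft:\n" + draft
    )
    return model("Rewrite the draft applying this critique:\n" + critique +
                 "\n\nDraft:\n" + draft)
```

The point of the structure is exactly what the comment says: the review happens after drafting, against a concrete sample and explicit rules, instead of hoping the model applies them while writing.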
A whole lot of the tropes are present because a lot of people write that way. It may have been amplified by RLHF etc., but in that case it's been amplified because people have judged those responses to be better - after all that is what RLHF is.
tldr: We took our hypertuned coding agent, trained it on millions of internal data-engineering workflows and data, gave it specialized custom-built tools, and it only managed to complete 3 more tasks than Claude Code (out of 43) on a super niche domain-specific benchmark.
Glad we're moving in this direction, I've also got a tool that I use to determine if writing is AI using common tropes and reconstruct the OG prompt from it: https://tropes.fyi/aidr
Engagement is great if you target a specific group. Don't need human content. It's ridiculously easy to start a Facebook page in a niche targeting a specific demographic, connect a site to it, unleash AI generated content, post it on FB and run ads. With enough traction, Facebook will pay you for making more content, while you extract money from your page followers. You're separating easy-to-influence boomers and conspiracy theorists from their money. It's disgusting, but it is ridiculously easy to make heaps of money with whatever content on Facebook.
Yes, a positive from this is those with authenticity and taste will shine. Self-expression will be a form of resistance and we'll see a lot less homogenisation across things like writing, ui/ux, animation, individual websites, blogs.
Who knows maybe the old, scattered, personable, decentralised internet will come back - things like MySpace, geocities, sites like this (a lost art): https://www.cameronsworld.net/
Also taste comes from your ability to steer a model instead of having it steer you. e.g. a model suggests a basic pill button, you push back and curse it for its blandness and use it to design something new and novel.
Why would anyone bother creating or publishing anything new on the internet now that we know that AI companies are just waiting to hoover it up, without compensation, to enrich their models?
Seeing how predatory these companies are in their scraping, and then continuing to publish where they can scrape, is the absolute height of stupidity.
I'd like to see the internet return to those who aren't putting it out there for money, so AI companies (and anyone else) hoovering it up wouldn't bother them. Sharing should be the point.
Why wouldn't it bother you even if you weren't putting it out there for money?
Sharing is great. Having everything you share taken and monetized/weaponized is terrible
I'm looking for ways to build community that is resilient against LLMs, both scraping and also contributing. Unfortunately (or fortunately depending on your point of view) that means it can no longer happen online
I use LLMs in my fiction writing; and before the wolves come out to shred me to pieces: The LLM never gets to see my writing and doesn't do any of the writing for me. I use LLMs in other ways.
One of the first uses I discovered was to have it identify my own blandness. I'll give it a general scenario from my writing and ask it for ten resolutions to that scenario. If my own resolution appears, I realize at best my resolution is bland and at worst cliche.
This is eerily similar to something I do with Hacker News stories that hit the front page. I run the post against a couple of LLMs (Mixtral, GPT-OSS, Qwen3, etc.) with the directive to produce a set of 20 of the most likely top-level replies.
I then wait a few days, and then use a couple of systems (embeddings, deBERTa, etc.) to rank comments by novelty against the LLM-produced replies.
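The novelty-ranking step can be illustrated with a toy version. Here a bag-of-words vector stands in for a real embedding model (the comment mentions embeddings and deBERTa; this sketch only shows the scoring logic, and the example replies are invented):

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words vector: a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def novelty(comment, predicted_replies):
    """Novelty = 1 minus similarity to the closest LLM-predicted reply."""
    return 1.0 - max(cosine(embed(comment), embed(p)) for p in predicted_replies)

predicted = ["this is just a wrapper around gpt", "what about privacy concerns"]
comments = ["what about privacy concerns here",
            "my grandmother used punch cards for this"]
ranked = sorted(comments, key=lambda c: novelty(c, predicted), reverse=True)
```

A comment that closely matches a predicted reply scores near zero novelty; one that no model anticipated floats to the top of the ranking.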
Sorry to hijack this thread to promote but I believe it's for a good and relevant cause: directly identifying and calling out AI writing.
I was literally just working on a directory of the most common tropes/tics/structures that LLMs use in their writing and thought it would be relevant to post here: https://tropes.fyi/
This is a very bad idea unless you have 100% accuracy in identifying AI-generated writing, which is impossible. Otherwise, your tool will more often be used to harass people who use those tropes organically, without AI.
This behavior has already been happening with Pangram Labs which supposedly does have good AI detection.
I agree with the risks. However, the primary goal of the site is educational, not accusatory. I mostly want people to be able to recognise these patterns.
The WIP features measure breadth and density of these tropes, and each trope has frequency thresholds. Also I don't use AI to identify AI writing to avoid accusatory hallucinations.
I do appreciate the feedback though and will take it into consideration.
As much as I'd like to know whether a text was written by a human or not, I'm saddened by the fact that some of these writing patterns have been poisoned by these tools. I enjoy, use, and find many of them to be an elegant way to get a point across. And I refuse to give up the em dash! So if that flags any of my writing—so be it.
Absolutely vibe coded, I'm sure I disclosed it somewhere on the site. As much as I hate using AI for creative endeavours, I have to agree that it excels at nextjs/vercel quick projects like this. I was mostly focused on the curation of the tropes and examples.
Believe me I've had to adjust my writing a lot to avoid these tells, even academics I know are second guessing everything they've ever been taught. It's quite sad but I think it will result in a more personable internet as people try to distinguish themselves from the bots.
> It's quite sad but I think it will result in a more personable internet as people try to distinguish themselves from the bots.
I applaud your optimism, but I think the internet is a lost cause. Humans who value communicating with other humans will need to retreat into niche communities with zero tolerance for bots. Filtering out bot content will likely continue to be impossible, but we'll eventually settle on a good way to determine if someone is human. I just hope we won't have to give up our privacy and anonymity for it.
This matches how LinkedIn feeds and thought-leadership pieces looked even before AI.
It's not just AI generated -- it's human slop. And here's the kicker: people believe it works.
The reality is simple. People try to sound smart so they can sell something. They don't know how to write. Don't care. Don't have time. They use the tool at hand. They copy-paste.
> The biggest attestation to this is: When Claude is done working on something for you and you haven't defined the next steps - ask it what you should do next. See if it at all aligns with what you actually wanted to do.
Now imagine that compounded for hours.