The false positive rate of AI detectors and its effect on freelance writers (authory.com)
218 points by eh-eric on Nov 1, 2023 | 165 comments


There's a key sentence hidden right at the end of this article:

> They were terrified of Google downranking any AI-generated content, and because of the apparently clear results from their AI checker, they couldn’t be convinced otherwise.

It sounds to me like the client might even have believed the evidence that he provided them showing that he didn't use AI to write the work...

... but they're in the SEO game, and they were paranoid about a (so far hypothetical? I don't think Google have announced this yet) AI detection mechanism that Google might use to downrank them.

At which point whether or not he used AI isn't actually the issue - if his writing style is indistinguishable from AI, then in the client's mind the risk from another hypothetical AI detector within Google means that they shouldn't use the work.


This pattern has precedent.

I scan releases with a bunch of anti-virus software. Not to find any virus, of course, but if one of the scanners "detects" a virus, some customer is also going to "detect" the virus, and we have a problem.


Yeah. And if you want them to fix their heuristics they tell you to pound sand unless you are willing to pay them a consulting fee to "work with you". Sounds like a protection racket to me, but for some reason it seems to be legal. Will be "great" if we get the same now with "AI detector" companies.


Have you tried Virustotal? Automates pretty much this task.

I'm a Googler, but I had to try five search queries to find the name. Obviously I haven't used it in ages, but it used to be straightforward (and it listed which AV software was used in the scan).


Is it becoming less of a problem nowadays?

Windows has some good enough built in stuff, right? I can’t imagine using a third party scanner.


I think this is an example of a cargo cult in the original sense: people who didn't understand how and why cargo planes came and went recreating what they saw the people with the cargo planes do in order to make them appear again after those people had left.

A lot of institutional "wisdom" in IT is that you need a virus scanner if you run Windows, and because Defender ships with Windows it is therefore part of Windows and not a virus scanner. This isn't really based on anything tangible or empirical, just the learned belief that Windows needs anti-virus software, that there is good and bad anti-virus software, and that's all you need to understand. Of course AV vendors are very guilty of feeding into this, because their income depends on this lack of knowledge and on being perceived as doing something vitally important.

Realistically the biggest attack vectors these days are (aside from server software vulnerabilities and browser vulnerabilities) someone launching a file they shouldn't have while ignoring every warning message along the way, entering their credentials into a phishing site, or telling people things that they shouldn't know. None of these can really be helped by commercial AV software, which is why a lot of AV software has moved on to MITM'ing SSL connections to intercept dodgy e-mails and bad links, creating its own potential security issues along the way.


Eh, some kind of tool that detects files rapidly getting encrypted on PCs is useful, as encryption doesn't even need a system level takeover to be a pain in the ass.


People still have anti-virus software though. How good Windows Defender is isn't relevant. What matters is the prevalence of various anti-virus software among your target userbase. Pretty sure various common software install tools give options for installing things like McAfee or Avast. Or they come preinstalled on store-bought systems.


I don’t disagree, I was just curious as to whether or not the situation is getting better, and figured it would be interesting to ask someone with firsthand experience.


Microsoft Defender "detects" a "virus" on any .exe or .dll of unknown provenance.

The solution is code signing. Pay whatever you need to pay for a certificate and sign all your releases.


That's the "SmartScreen" function, and it also depends on where the file is marked as having come from (and yes, it largely seems to be a racket to get you to buy code signing). If Defender actively thinks there is a virus through a signature match or heuristics then the behaviour is different, and more difficult to bypass (and yes, this can also have false positives, irritatingly).


That sentence caught me, too. The fear of Google downranking AI-generated content with its left hand, while creating and promoting its own Bard bot-generated content with its right hand. Can an ai-generated snake choke on its own tail? Or is it tails...


Doesn't this mean Google has an AI-detector that is somehow accurately finding these? Otherwise the majority of its index would be falsely flagged.

What do they use?


> Doesn't this mean Google has an AI-detector that is somehow accurately finding these?

No, it means that someone was afraid they would use an AI detector similar to the one that was being used and downrank based on it; nothing in the article suggests Google is in fact using any AI-detector in its ranking algorithm, and I suspect that Google would be quite happy if people used AI (especially Google’s AI) to produce content that otherwise ranked highly on Google algorithms.


Microsoft is probably the best placed company to build a good detection method, having both a giant corpus of text predating LLMs from their Bing crawler, and a giant corpus of AI-generated text thanks to their relationship to OpenAI.

But Google is probably a close second, with an even bigger archive of pre-LLM text (including the world's largest collection of digitized books), and supposedly a good LLM and lots of smart ML engineers.

But no matter what method Google finds, whether it's some complex AI trained on massive amounts of training data or a simple n-gram analysis that ends up being telling, they aren't going to tell us. That would just accelerate the arms race between LLM spam and LLM detection. If they have something that works (big if), the details will stay hidden; the best we are going to get is some kind of press release aimed at dissuading people from using AI on their websites.


It is impossible to create a detector for LLM-generated writing. The detector would have to beat the LLM at its own game, and if it could, that same signal could be used to train a better LLM. Anyone who claims they can do it is sorely mistaken or flat out lying.


By the same logic I guess you could create one for previous generations of LLMs. "Looks like you need to update buddy...busted!" And across multiple industries a decent percentage of users will be using outdated versions...


Your LLM detection method doesn't have to have a better understanding of human language than the LLM you're detecting, it just has to be able to identify patterns of speech typical of a particular LLM. Of course you have to repeat that work for each network and each significant update and finetune, but right now most people just use one of two ChatGPT versions.
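As a rough sketch of what "identifying patterns typical of a particular LLM" can look like in practice, here is a minimal bag-of-n-grams classifier. The corpus file names are hypothetical, and the bet that surface word patterns carry enough signal is an assumption, not a description of any real detector:

    # Minimal sketch: learn the surface habits of one particular model's output.
    # Assumes you already have labeled samples; the file names are placeholders.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    def load_samples(path):
        # One sample per line; adapt to however your corpus is stored.
        with open(path, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]

    human = load_samples("human_samples.txt")    # hypothetical human-written corpus
    llm = load_samples("chatgpt_samples.txt")    # hypothetical corpus from one model

    texts = human + llm
    labels = [0] * len(human) + [1] * len(llm)   # 0 = human, 1 = this particular LLM

    train_x, test_x, train_y, test_y = train_test_split(
        texts, labels, test_size=0.2, random_state=0)

    # Word n-grams capture phrasing habits, not "understanding".
    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),
        LogisticRegression(max_iter=1000),
    )
    clf.fit(train_x, train_y)
    print("held-out accuracy:", clf.score(test_x, test_y))

A classifier like this is tied to the specific model and finetune it was trained against, which is exactly why the work has to be repeated for every significant update.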

And of course you don't tell people how your LLM detector works, don't provide a public API, and preferably leave it ambiguous whether it even exists. That solves most issues with LLMs using your detection method to get better in future versions.

That's counter to the business model of these LLM detection websites, but Google doesn't have a problem with keeping any potentially existing models inhouse and locked down.


Really all of it is bullshit if we don't know the true/false positive/negative rates.

If you detect LLMs 100% of the time correctly, it is irrelevant if you also incorrectly detect human written text as LLM text 80% of the time. "Detect them all and let god sort them out" will eventually fail as a business plan.
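A quick back-of-the-envelope illustration of why the base rate matters (all three numbers below are made up purely for the example):

    # Hypothetical rates: a detector that catches every LLM text but also flags
    # 80% of human text, applied to a pool of submissions that is mostly human.
    true_positive_rate = 1.00    # flags LLM-written text 100% of the time
    false_positive_rate = 0.80   # also flags 80% of human-written text
    llm_share = 0.10             # assume only 10% of submissions are LLM-written

    flagged_llm = llm_share * true_positive_rate
    flagged_human = (1 - llm_share) * false_positive_rate

    precision = flagged_llm / (flagged_llm + flagged_human)
    print(f"Share of flagged texts that are actually LLM-written: {precision:.0%}")
    # -> roughly 12%; nearly 9 out of 10 accusations would hit an innocent writer.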


> but they're in the SEO game, and they were paranoid about a (so far hypothetical? I don't think Google have announced this yet) AI detection mechanism that Google might use to downrank them.

Google is probably just saying that to scare off the SEO spammers from trying to use GPT-4 to automate their jobs. I doubt they have a reliable detector (at least not one that detects based solely on the text content of the page).


> Desperate, Michael even reached out to the company providing the AI test. To no avail. They simply told him he should set up an account with them, buy himself some credits, and test his articles again himself. At the end of the day, nothing changed the fact that there was this all-encompassing metric being thrown at him saying: You are cheating.

Snake Oil Company: "This person's articles were probably -- to almost certainly -- AI-generated."

Crappy Client: "99% is very scientific and damning. Fire that plagiarizing writer."

Writer's Lawyer: "Sounds like libel with clear economic harm."

Writer: "Please, Snake Oil Company, you're ruining my career."

Snake Oil Company: "We know, and we don't care. Pay us some money, and who knows."

Writer's Lawyer: "Libel, with a side of protection racket."


Snake oil and a protection racket.


I wish the article named the publisher so they could be boycotted.


"This absolute horror scenario is what happened to writer Michael Berben (not his real name)."

Absolutely didn't happen. You might occasionally credit a story in the NYT or WSJ using an anonymous source—but when one appears in a blog by a "serial entrepreneur active in the content industry," flogging a subscription product aimed at exactly the sort of marginally-employed writer targeted by the story, I would expect the average HN user to be slightly more critical of the claim.


I feel the same. I'm not sure there are many organizations out there applying AI detectors to writers they've had an existing, extensive relationship with, and then also giving a fuck if it pops up as a hit but they have no other complaints about the writing.

My wife's fairly connected to the writing world and I've not heard of this being a thing at all. Doesn't mean it's not anywhere, but if it were widespread I think I'd have heard of it. Most companies are trying to push more use of AI tools in writing, as far as I can tell, and I don't even have insight into content-farm parts of the market, where I assume it's just all AI all the time now.


My experience is that I have to write some things that often have parts that are, for lack of a better word, somewhat boilerplate, e.g. preamble/background/explain some technical term I used/etc. Necessary but not core technical or otherwise differentiated content.

I've used LLMs for this as a sort of first draft. I edit the output but it's a perfectly serviceable way to get some words down and save me an hour or two.


In the realm of technical documentation, too, originality isn't a virtue in itself. (And I say this as a technical writer who doesn't even use LLMs regularly.)

If you needed to throw in a sentence or two about, e.g., what load balancing is, you're better off lifting a definition from an authoritative source than trying to come up with a "creative" rephrasing for its own sake. Standard terminology is standard because it works.


Funny because I've been hearing this almost fortnightly on reddit and freelance forums for a few months now.

And the big middleman platforms siding with the client.

Some people prefer their anonymity, especially when they are put into the public sphere over disputes which can cost them future work. Notice the client wasn't named either.


The story is pretty plausible though. I hear students being forced to rewrite genuine original work because of plagiarism detectors, and what happens in schools eventually happens in the workforce.


It would be top irony if this story were AI generated.


"AI text detectors" are complete and utter bullshit rife with egregious false positives, so if they are actually used to cut ties with any writer based on such allegations, there is a near-certain chance that many or most such writers are actually innocent.

This might be a bias-confirming fake anecdote, for sure. But the effect is real.


These AI detectors have all the validity of phrenology. You can enter some of your own writing (I did) to easily see for yourself. The text I entered had been online, and still is online, longer than AI writers have been around - but that doesn't seem to feed into their logic.


A good example: Put bits of the US Constitution in some AI detectors and they'll tell you it was AI generated. https://arstechnica.com/information-technology/2023/07/why-a...


Interesting--perhaps the AI detectors are right, and this will be how we discover that we are in fact in a simulated universe!


...or at least a simulated democracy.


The whole idea of an "AI detector" is stupid. If it ever does work, it will only work until someone trains a model against it.


And anyway, text exists to be read by humans. If the buyer needs an “AI detector” to tell if the copy was AI generated, the AI would seem to be good enough.

I think this only exists to facilitate some already-bad heuristics, like the idea that nobody in-house needs to check the contractor’s work-product.


Which text did you enter, and into which AI detector?

I tried it with my blog, and it was correctly identified by Copy Leaks as written by a human. I don't know how representative https://copyleaks.com/blog/ai-detector-continues-top-accurac... is, but certainly all of them are getting above phrenology detection rates.

Even with this guy, the problem wasn't that it misidentified him as AI. It was that "we can't tell" got turned into, "we can't risk Google thinking he might be AI".


""we can't tell" got turned into, "we can't risk"

Oh god, we've developed yet another shitty lie detector.


You only just noticed that?


Ignoring the obvious issue that this whole anonymous story seems suspiciously perfect for selling a related product...

On the one hand... Companies spent the past couple of decades engaging in various SEO hacks to rank high on search results and OpenAI scraped the internet to train a language model. Theoretically, it seems possible that some of the SEO techniques at least partially colored the flavor of LLM-generated text, and an "AI detector" could pick that up. So if you do a great job writing SEO optimized text (wordy, structured, lots of repeated key words, etc.) you are more likely to be flagged.

But really.. "AI Detector" services are snake oil and will lead to the creation of "Anti AI Detector" services that offer protective spells against the original snake oil. See, we eliminate a bunch of jobs with AI but we create whole new disciplines of work that didn't exist before. "AI Generated Content Obfuscation Specialist - III - W2" coming to a job board near you soon.


Legal question: if a writer loses a job due to a false accusation of using AI, would they win a damages lawsuit against the AI detector service if they can prove that AI content detection is knowingly imprecise?

That's why those types of services (including OpenAI's initial AI detector) often have huge legal disclaimers saying not to take it as 100% accurate.


Kinda like asking for the source code to the breathalyzer?

https://arstechnica.com/tech-policy/2007/08/im-not-drunk-off...


I would kill to watch some smartass show up to court with their own ridiculous-looking black box "breathalyzer" and draw the parallel to "why do you magically trust theirs and not ours?"


I feel like the only possible way for this to work would be to get the opposing side to prove that the ridiculous fake breathalyzer doesn't work, only for the box to be opened and reveal that it is in actuality the same model of breathalyzer that was used against the defendant.

Courts (and authority in general) usually don't appreciate the "oh yeah, well from my point of view the jedi are evil" defense.

However, very occasionally, tricking the offense into skewering their own offensive strategy will work ... that is at least if the deciding authority is in a good mood and you can convince them you're the pragmatic choice.


Be the change you want to see in the world. And when you do, please use a cardboard box with "Very Real Breathalyzer" written in Sharpie on the side and maybe a Nixie tube sticking out the top.


And inside of it is their breathalyzer.


You beat me by four minutes.

I feel this is the only way this has a remote chance of actually working. However, I think you also need to make sure that the judge feels in on the joke as opposed to being the butt of the joke.


The answer would be simple: because we pay this other group with taxpayer money.


For those interested in just how corrupt the relation between completely bogus "science" and unregulated "industry" can become, check out the ADE 651 [0]

In the late 2000s a British company, Advanced Tactical Security and Communications (ATSC), started selling dowsing rods, basically old coat-hangers attached to an empty box, as "long range explosive detectors".

From Wikipedia:

    """ The device has been sold to 20 countries in the Middle East
    and Asia, including Iraq and Afghanistan, for as much as US$60,000
    each. The Iraqi government is said to have spent £52 million on
    the devices """"
It's almost certain that serving personnel were killed while relying on the fake devices to detect IEDs. ATSC's founder was convicted of fraud and sentenced to ten years in jail.

It's no surprise that the "AI detection" racket is going to be a field day for every hoaxer and huckster. But companies like Turnitin have long been selling "black-box" snake-oil "plagiarism detectors" that have blighted higher education, ruined millions of students' lives, and created thousands of make-work hours for professors.

Really though, it's our fault that we keep treating technology as magic, failing to exercise radical scepticism and robustly challenge the low-quality products peddled by tech companies.

[0] https://en.wikipedia.org/wiki/ADE_651


I personally hope this will be the course of action we see being taken against these AI detector services. It is well-known to many that these are essentially snake oil salesmen, and it has been damaging in other contexts too, such as in academic settings where students are penalized for similar false accusations.

As LLMs continue to improve, it would seem that it is only going to get more difficult to accurately distinguish between content which is AI generated vs human generated, so barring some kind of AI-detection breakthrough, these services will continue to plague students and writers with horrible false positive rates.


I'm not sure it would matter here. In my understanding, the victim was a freelance writer / contractor, not an employee. It kinda sucks, but I don't think most job protection laws would apply here irrespective of AI involvement. Obviously this depends a lot on country / jurisdiction.


Defamation, not job protection: the lost freelance job is the source of damages, not the basic legal wrong.


No. But you could sue your employer.


I want to see an "AI detector" that just marks everything as being not AI generated. Then, whenever someone is accused of using an AI by some nameless website, they can present their own website that shows they didn't use AI.

Professor: "The website said you used an AI." Student: "Well, here's a website claiming I didn't use an AI." Professor: "I don't know if I believe that website, let's look into the merits of how it works." Student: "Yes, let's do that." Professor: ...


What you want is a sort of AI Witches' Scales?

https://en.wikipedia.org/wiki/Oudewater#Buildings


From the linked article: " After the weighing, they received an official certificate proclaiming them not a witch... Certificates would state that 'the body weight is in proportion to its build'. The reasoning behind this is the old belief that a witch has no soul and therefore weighs significantly less than an ordinary person".

I love this piece of history so much, thank you for sharing it! Adding this one to my bookmarks :-)


Which is obviously misguided. You can't determine from somebody's BMI whether they have a soul. A soul only weighs three quarters of an ounce [1], or about 1/10th of a cup of coffee. Well within even the daily variation of a person's body weight. (/s)

1: https://en.wikipedia.org/wiki/21_grams_experiment


Does my soul look fat? Is it good or bad to have an obese or skinny soul? Do they float? Are they charged, can you trap one in a Faraday cage? Is that what the Ghostbusters were collecting?


Also:

“The Waag is still open as a tourist attraction, and official certificates are available.”

Maybe with modern science we can make someone much lighter than they ought to be. (Put helium in their stomach?) Then they would be forced to issue a certificate that said that they could not guarantee that the person is not a witch :D


Not quite. Oudewater offered fair scales. What this guy's talking about are still rigged scales, just oppositely rigged scales. The only reason nobody was ever found a witch there was because the underlying concepts were BS.


Isn't the underlying concept of AI detectors BS as well? You could say something like "We cannot guarantee beyond reasonable doubt that this content is AI generated". Which one could argue is also always true.


This is something that I suspect will get easier in the next couple years as AI research progresses. I could imagine some sort of reversal tool being made such that "given this output and these weights files, how likely is it that these weights produced this output with a reasonable input?"

I don't know, that might be incredibly difficult forever, or it might be something that someone has a breakthrough in and manages to do.
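One approximation of that "reversal tool" already exists: score how probable the text is under the given weights, the way perplexity-based detectors do. A minimal sketch, with the model choice and any decision threshold being assumptions for illustration (and note that later human editing of the text defeats this):

    # Sketch: average token log-probability of a text under a given model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # stand-in for "these weights files"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    def avg_log_likelihood(text: str) -> float:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # With labels=input_ids the model returns the mean cross-entropy,
            # i.e. the negative average log-probability per token.
            loss = model(ids, labels=ids).loss
        return -loss.item()

    suspect = "Some text whose provenance you want to estimate."
    print("average log-likelihood per token:", avg_log_likelihood(suspect))
    # A text the model finds unusually "easy" (high likelihood, low perplexity)
    # is weak evidence it came from that model, never proof.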

Perhaps it's only useful for particularly egregious examples, and wouldn't be able to detect an AI text that was subsequently edited. That would be fine. When I was a TA in college, we ran students' papers through a plagiarism detector, which would spit out some % number based on what it thought was the amount/likelihood of plagiarism. The number itself was bunk, but across a class of 30 people, if one or two papers had much higher numbers than the other 28, I could click and look at its justification. Sometimes it was a meaningless false positive, other times it was "here are the paragraphs that were copied verbatim from Wikipedia".


There also seems to be some willingness to have the models watermark their content, and LLMs provide very subtle methods to do so. For example for sufficiently long texts you could bias the token selection to achieve a very specific frequency for certain words; and declare any text where the word "the" makes up between 5.1% and 5.7% of all words, and "which" about 1% of all words to likely be generated by that AI.
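The detection side of that scheme is almost trivially simple, which is part of its appeal. A toy sketch, using the comment's own illustrative words and frequency bands (a real watermark would be far more subtle, and the generation side would need to bias token selection accordingly):

    # Toy check for the frequency watermark described above.
    import re

    WATERMARK_BANDS = {
        "the": (0.051, 0.057),    # "the" should make up 5.1%-5.7% of all words
        "which": (0.008, 0.012),  # "which" should sit around 1%
    }

    def looks_watermarked(text: str, min_words: int = 2000) -> bool:
        words = re.findall(r"[a-z']+", text.lower())
        if len(words) < min_words:
            return False  # too short for the frequencies to be meaningful
        for word, (low, high) in WATERMARK_BANDS.items():
            freq = words.count(word) / len(words)
            if not (low <= freq <= high):
                return False
        return True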

One major issue is the feedback loop: There is lots of entropy to hide a signal in regular English (whether you follow the methods of stylometry or steganography), and even an unmodified AI model has a stylometric signature. But if a significant portion of the text we consume ends up being written by a handful of AI models, over time people will adopt many of these features. This is obvious right now, with ChatGPT having a certain writing style and some people unconsciously copying that style, and it will remain true no matter how subtle these things become.


> that might be incredibly difficult forever, or it might be something that someone has a breakthrough in and manages to do.

well, why not train an AI on it?


Sure but...since no one actually is a witch (in the sense they were testing for), a fair scale is also one guaranteed to say that they're not a witch.


You're not wrong.

I'm trying to think if you can indeed do something with a fair test.


Presumably the scales have at some point vindicated an actual witch in error.


Since the AI detectors are snake oil, just run a bunch of the professor's articles through them until you get one flagged as AI generated and present those results for a rapid reassessment.


Students are in a weird case. Currently instructors aren’t experts in AI tools because, well, they didn’t exist 10 years ago when even the young professors were learning how to do things. The best that can be done at the moment is to teach the old way. The current students are the ones who will need to make the AI based workflows. But obviating intro assignments by having the AI do them doesn’t benefit anybody.


If students want to deprive themselves of the opportunity to learn, then that's their prerogative.


Yes, but this only further waters down the already watered-down legitimacy of a degree. I know a lot of people think students should be building things independently of classes, but that kind of misses the point of dedicating a full-time investment to guided learning at the hands of an expert instructor. If instructors graded rigorously and had a magic cheating-detection tool (where cheating is defined overly capaciously, so that students really and unambiguously have to get creative in their work), then the degree might mean something.


The university system is rather outdated at this stage anyway. As an institution of learning there is still a place for universities in terms of research and inquiry, but degrees seem like an outdated concept. Maybe we can break degrees down into certifications that can be pieced together to demonstrate a set of skills. Maybe we can value independent projects more. The academic system is so broken at this point that I don't think much value will be lost if it lost all credibility; let them scramble to create some more value.


If the expert instructor guiding you doesn't know whether you actually understood the material, chances are they can be replaced with a YouTube video.

If universities offer a learning environment that actually provides the students better learning than what they can get at home, the pieces of paper they hand out will continue to mean something. If they don't, they were already on a downward spiral kicked loose by online learning platforms handing out certificates.


Easy: don't count coursework towards the degree; make it all exams.


Universities need to produce competent graduates at a good rate to maintain their reputation.


In an ideal world, "getting an education" and "getting a degree" are the same. In reality, any overlap at all feels like a win. Which is unfortunate, because some of the best learners are terrible students, and vice versa.


Surely this is trivial to refute? The professor just needs to have his own text that he generated using AI. When your website says it's not AI... then it's curtains I guess?


The student just needs to find a pre-2022 article that is mistakenly marked as AI by the professor's detector.

It's harder but probably not that hard.


Ideally, what the student should probably do is get ahold of as much of the professor's own work as possible and feed it through the detector. While pre-AI work might be best for proving that the site can't possibly be reliable, post-AI work is probably better for convincing the professor that the consequences of treating the site as reliable are bad for him.


By the time a professor is this engaged and thinking this logically, the student has already won.


It would be, if the professor's website didn't suffer from false negatives as well. Which it most certainly will.


I made this one a while back

https://isthiswrittenbyai.surge.sh/


Alternative reading: company determines that paid contractor is producing content that they can’t distinguish from a borderline free LLM, and decides to use it instead. Isn’t it likely that AI (not the detector) is the real threat to freelance writers’ livelihoods?


AI detectors neither reliably detect content that was produced by an AI nor content that could readily be produced by one.


And the alternate alternate reading is that the company is afraid of SEO penalties if they can't convince search engines that the content isn't AI and/or potential copyright issues with LLM generated content.

The detector feels a lot like an excuse for whatever they're really concerned about.


Pretty uncharitable way of reading it.


Pretty sure we are going to see a rise in the use of Oral Exams.

The teachers that don't want to use them will continue to churn out students that score high and don't know how to articulate a damn thing. Which may have been the status quo anyway.

Edit: I didn't like my original post


I was one of the students who could nail the exams and assignments I turned in, but then I BOMBED oral exams.

Because of crippling anxiety.

So how do you account for that? Or autism? Or any other sort of neurological disability?


Universities already have systems in place to allow students with disabilities to take alternate forms of assessment. So, perhaps, students who can't deal with oral exams can choose to take written exams instead.

Btw, I was a professor and I saw students on both ends of the spectrum: students who would bomb an oral evaluation because of anxiety but would do well in written exams, and students who would bomb written exams but would totally ace an oral evaluation. Since a professor's job is to create well-rounded individuals, courses should include a mix of different assessment types, so students slowly build up the skills to handle all of them, augmented, of course, by guidance and feedback on how they can build those skills.


The older I get the more convinced I am that a core skill schools should teach—starting from early grade school—is speaking. Reading is probably the single most important skill, period, but—and I write this as someone with a bias toward writing—I think speaking is more important than writing, for most people. Not that learning to write doesn't teach skills relevant to speaking! Not that we should stop teaching writing! But I think getting over that kind of anxiety is exactly the kind of thing that every school ought to teach, and I think if the only way to accomplish that were to sacrifice some writing instruction, it'd be worth it.


I'm on the spectrum.

If I were a professor though, I would use a baseline for every student.

At the beginning of the semester, have an oral exam for a difficult topic that the students are expected to already know. Material from a prerequisite class would be good.

I would always give 100% for that oral exam if the student showed up. And that exam would become that student's baseline; their future oral exams would be judged based on that.

They'll probably be more nervous in front of the professor the first time than the last, so it would actually give them an advantage in later exams.

Probably still problems with that though.


Definitely. Once word of this system gets out, it will be gamed by students who will throw the first exam deliberately. Even without this sabotage it would still feel extremely unfair to students who did well on the first exam and are now being punished for it.


Well, on future exams, they're still going to have to answer questions with some knowledge, and sabotaging wouldn't help so much with that.

But the other problem is harder to fix. I guess I could be much harsher in the first one, and go easier on the real exams later? Maybe that would reduce the number of students who do well on the first one?

Eh, I'm not a professor. Not my problem.


What about students who are currently failing the written exams due to their neurological disabilities? How are you accounting for that? These students would excel in an oral exam.


I'm the opposite - so how would you account for that?


Or just on-site exams? Isn't that the norm anyway?

We may see a rise of on-site homework.


Let's turn around and look at this from the company's point of view, rather than the freelancer's.

The company is looking for text that ranks well on Google SEO. That is what the writer is asked to create. The company has cause to fear that anything that looks even remotely AI-generated will be flagged and penalized by Google. Google does this regularly, is a black box, and destroys companies when it does.

The company has found out that this guy's writing looks enough like AI that detectors aren't sure that he is human. They fear that where one detector is uncertain, Google may assume AI. Therefore the company has reason to believe that this guy is no longer able to do what they pay him for - producing text that will rank well on Google SEO.

Nothing that the freelancer is doing addresses that. The freelancer wants to prove that he's human. The company never doubted that. But their knowledge won't help them if Google's SEO rules change. His job is to provide text that will rank well on SEO. He isn't succeeding.

The freelancer tries to convince the AI detection company that their tool is rating him wrongly. But the value that tool is providing here really isn't "Are you human?" It is "Is this content potentially going to be flagged by Google's search algorithm in the future?"

Yes, the next generation of LLMs will beat the tools. But that's just the next iteration of a cat and mouse game that's been going on ever since content farms figured out how to manipulate PageRank, and Google decided to penalize the behavior. And the rule of that game is that you focus on what you think will happen in THIS iteration.

Yeah, it really sucks for the freelancer. But the company is probably making the right business decision here.


Except Google isn't doing what the company thinks it's doing. AI detectors don't work. Using one as a search ranking signal would make search worse by erroneously down-ranking false positives. Google knows this, so it's reasonable to assume they aren't relying on AI detectors. That makes what the company did stupid, especially if they now have to pay to rewrite the content so that it 'beats' that detector's utterly unreliable score.

It's not that next generation's AIs will beat today's detectors, but that there is no such thing as an AI detector. Just something pretending to be an AI detector that doesn't work.

Your proposal that the writer's 'writing looks enough like AI that detectors aren't sure that he is human' makes no sense. There is no 'looks like AI' that a detector can reliably distinguish from 'looks like a human'. All that's happening is the writer's work matches some completely bogus snake oil metrics made up by detector developers.


They aren't yet. But Google has a history of planning out a change, collecting evidence, then changing the rules and killing everyone who they had caught.

Given that, a bunch of well-SEOed content farms are bound to be killed on suspicion of being ChatGPT-written content farms.


The point in this case is that it doesn’t matter what Google is doing or whether AI detectors work or not. What matters is making the client happy.


Yes, but as someone who has done a lot of freelance writing, I can tell you that 'what makes the client happy' is often orthogonal to 'what's best for the client'. It's challenging to make them see sense when they have a bee in their bonnet about the latest bit of technological or marketing flimflam.

You're often put in the position where your choices are to get fired or do what they want and watch their sales and search rankings tank. Unfortunately, 'I told you so' doesn't seem to cut much ice when that happens.


Why even care? If the article is well written and well researched and it's seemingly impossible for readers to tell the difference, what's the problem with using AI to improve it? Did people care when computers were used instead of typewriters? When spell checkers became popular?


It said in the article that they were afraid that Google would downrank them for having AI-generated content.


In which case it really doesn't matter whether the writing was written by a human author or whether it was AI-generated. The effect would be the same, if an AI detector (mis-)classifies the content as generated.

Does that warrant the author to be fired though?


Thanks for pointing that out, I don't think their premise is correct on that one, though: https://developers.google.com/search/blog/2023/02/google-sea...


Sadly, SEO is 90% superstition.


Mission Accomplished?

https://xkcd.com/810/


Not quite, because we're only at the point where the bots make comments with the veneer of being constructive or insightful.


But they're getting there!


Professional writers might get wrongly flagged by AI detectors because these AIs are trained on well-written articles. So, if an AI and a professional writer both write about something like SpaceX, they could end up sounding very similar. That's just because the AI learned from articles like what the professional would write. If someone like me writes the article, it will probably be marked as human-written. But that's not because it's more "human-like"; it's just because it's not as polished as what professionals write. Under the hood, the AI detector may simply be checking whether the article reads as "pro" or "hobby" writing and spitting out a number.


AI produced writing tends to be very generic -- providing a good overview of a topic, sometimes with some detail, but rarely with any personality, original information, realistic examples, or a surprising perspective.

Contrast this with the SEO focused writing that has been produced in heaps and heaps over the last decade. Good luck. It's actually hard to find any difference!

This could actually be considered good news. Maybe all of these "AI Detectors" are actually "bad writing detectors." If you're not producing anything original, in terms of real-world examples, style, or substance -- then maybe the writing ought to have been generated by AI anyways.


You're saying you want a novelty detector.


Daydream: The names of companies firing people on the basis of "AI detector" results are reported to the IRS...which then uses an "AI detector" to discover whether they've been cheating on their taxes.



Interesting aside: I've been fascinated by how financial fraud can be and has been detected by identifying numerical patterns that are vanishingly unlikely. Kind of like the "fake data is actually pretty easy to detect" observation I've seen in academia.


I've always been amazed at how often simple tests like https://en.wikipedia.org/wiki/Benford%27s_law actually catch this kind of thing...
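The check itself is tiny. A minimal sketch that scores a list of figures against Benford's expected leading-digit distribution (the example amounts and any cutoff you would apply to the score are arbitrary illustrations, not a forensic standard):

    # Compare the leading digits of a dataset against Benford's law.
    import math
    from collections import Counter

    def leading_digit(x: float) -> int:
        x = abs(x)
        while x >= 10:
            x /= 10
        while x < 1:
            x *= 10
        return int(x)

    def benford_chi_squared(amounts) -> float:
        expected = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
        digits = [leading_digit(a) for a in amounts if a != 0]
        n = len(digits)
        counts = Counter(digits)
        return sum(
            (counts.get(d, 0) - expected[d] * n) ** 2 / (expected[d] * n)
            for d in range(1, 10)
        )

    # Naturally occurring figures tend to score low; fabricated, too-uniform
    # amounts tend to score conspicuously high.
    print(benford_chi_squared([1234.5, 187.0, 2950.0, 41.2, 178.9, 1120.0]))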


AI writing tends to be bland and repetitive. So if this encourages people to not write like an AI then that's a win


At least the stuff I've generated (and subsequently rewritten) would probably make my spidey senses tingle (though I wouldn't be certain). On the other hand, if I were primarily comparing it to especially low-rent freelance content, I'd probably have a harder time drawing the distinction.


AI writing is like a 12 year old using Cliff Notes for a last-minute book report.


Exactly how it is used by students now.


There are a lot of people here with strong opinions about whether these detectors can or cannot work, but I implore you to do the experiment I just did. Go to the top 10 detectors you find in a Google search and paste in a sufficiently long sample of your prose. Not a random HN comment - at least 200-400 words of normal, coherent text.

It's a game of cat and mouse in the sense that you can build LLMs specifically optimized for evading the current crop of detectors, but in my testing, they work pretty darn well in the general case. While they might not reliably pick up on all LLM text, and while there's sometimes a couple of words in human-generated writing that cause them to output a low but non-zero probability of LLM content, they do not rate human-generated text as "99% AI". Especially not across multiple writing samples.

The most likely story here, I suspect, is that the person leaned on LLMs for commissioned writing and is now trying to save face. The secrecy of the models works both ways, right? And frankly - how often do you see people in HN, or people who do commissioned writing, admit in private that they're using ChatGPT? It's cropping up all over the place.

Note that I'm not commenting on the ethics, fairness, or transparency of tools like that. I'm just saying they work far better than you might be suspecting.


One flaw in this experiment is that most of us here aren't professional writers, which means that text we produce ourselves is probably less likely to trigger a false positive, simply because we didn't clean up the spelling, grammar, and general writing style to the point that it might look like an LLM wrote it.


I did the same experiment, was pleased to see that all the human-written articles I copied in were correctly identified as human written.

On the other hand, I tried three AI generated texts (>500 words each) and only one was marked as AI generated (it was so obviously AI generated that it would have stood out to me).


Yes, I also would like to find an article pre 2020 where AI detector says 99% AI written, because in my small sample I couldn't find any.


Don't these people realize that 95% accuracy is very low? Of course they don't.


In my experience, anyone who can write at a post high school level is accused of being an AI by everyone who can't.


It's interesting to hear about your experience, and it highlights a common phenomenon in the digital age. As technology advances, it's becoming increasingly challenging to distinguish between human and AI-generated content, especially when it comes to writing. The ability of AI models to generate coherent and well-structured text at a post-high school level has indeed improved significantly in recent years.

While it's flattering to be mistaken for an AI when you write at a high level, it's also a testament to your writing skills. However, it's essential to remember that human creativity, nuance, and emotional depth in writing still set us apart from AI. Even the most advanced AI models lack true understanding, personal experiences, and emotions, which are essential elements in many forms of human expression.

Embracing the evolving role of AI in writing and communication can be empowering, as it can assist in various tasks, from content generation to language translation. However, it's equally important to celebrate the unique qualities that make us human, such as our capacity for creativity, empathy, and storytelling. So, whether you're mistaken for an AI or not, your ability to write at a post-high school level is a remarkable skill worth acknowledging and refining.


I chuckled.


By this point I'd just be recording myself typing it, just in case this ever came up.

Like what other defense is there? "No I didn't." "Prove it." "I can't."


The really funny part is that training an AI, or even just setting up a Monte Carlo-style generator based on your existing typing cadence, would allow you to give a body of text to the algorithm and have it produce a video that looks suspiciously like a person wrote it. Write it yourself, but then only fake a video if you happen to need evidence.

What are the chances that anyone who has an AI text detection tool ALSO has a detector for AI-generated text typed at human-like speed, with edits?
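For what it's worth, the cadence-replay half of that joke is only a few lines. A toy sketch, assuming you had previously recorded your own inter-key delays (the numbers below are made up):

    # Replay a finished text as keystrokes with delays sampled from your own
    # recorded typing cadence. Purely illustrative; the delays are invented.
    import random
    import time

    recorded_delays = [0.08, 0.12, 0.31, 0.05, 0.22, 0.09, 0.45]  # seconds, hypothetical

    def replay(text: str, delays=recorded_delays, pause=1.5) -> None:
        for ch in text:
            time.sleep(random.choice(delays))
            print(ch, end="", flush=True)
            if ch in ".!?" and random.random() < 0.3:
                time.sleep(pause)  # occasional "thinking" pause
        print()

    replay("Write it yourself, and only fake the evidence if you ever need it.")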


Or type in Google Docs (or similar) that has an edit history, showing you didn't copy and paste massive blobs of text.


I find it weird that people here even think it’s possible… if you mix it around, there is no way to tell. Stop trying.


Weird choice to not disclose the writer in question. Would be very interested to see their articles and judge for myself if they were letting AI do a lot of cleanup on what they were writing.

200 articles in 3 years of journalism seems very prolific. Can anyone speak to what's normal for a career journalist?


> 200 articles in 3 years of journalism seems very prolific.

It's about four articles every three weeks. Doesn't seem extreme.

https://dublininquirer.com/2020/02/26/sam-how-many-articles-...

“Twenty years ago, in my first full-time job working for a daily newspaper, there was a period when I was turning out two or three articles a day – and I thought that was an unsustainably high number.

Today, young journalists have told me they’re asked to produce several times that.”


I'm starting to have doubts. Recently Riot Games had a controversy where one of its artists was caught using AI for art; just like with cheating in pro gaming/sports, a lot of people say no but then turn out to really be doing it.


Surely this is defamation? If I write a program to defame someone, is that still defamation?


Nope, not even close (at least in the United States). It varies a little by state, but in general defamation requires making either knowingly false statements, or having a malicious disregard for the truth, combined with intent to harm. An AI detection generator that gets a wrong answer meets neither of those requirements. You could potentially sue your employer or anyone else who took action on the false report, but the detector itself is probably totally immune.


Wouldn't having a demonstrably false accuracy rate, but simultaneously claiming to be 99% accurate in all circumstances constitute making knowingly false statements, and perhaps even malicious disregard for truth?


Public figures need to prove "actual malice" [0]. If you are a private figure, you need only show negligence.

[0] Actual malice itself is a term of art. It does not require malice in the normal sense. Just a reckless disregard for the truth.


The variety of different ways in which AI is starting to ruin modern society is impressive.


In theory, writers whose writing was used as input training data for these LLMs will have a higher likelihood of being misclassified as AI given that the LLM may leverage some unique aspect of their writing style in some outputs.


3 years from now: Was this written by a reputable AI? Yes? Great - let's publish now. No? Sorry - we don't have time to do QA on this - please resubmit with an AI-edited version.


It really sounds like that company wanted an excuse to fire him and negate any contractual obligations to him.

You don't run someone's writing through one of these detectors if you like them.


Are his articles included in a training set?

Are they actually good content or should he try harder to behave like a human?

This sounds like a defamation lawsuit.

This seems like an issue NFTs could take some pressure off of.


> This seems like an issue NFTs could take some pressure off of.

A new game: Sarcasm or Crypto Poe's law?

I legitimately don't know if I'm laughing with or at the above poster - all I know is I'm laughing.


If you're not joking, I would be fascinated to hear how you think NFTs might be relevant to this.


It's a certificate of authenticity. You can lie about whether you used AI or not, but your reputation would be demolished if you were found out. You could have proof-of-humanity parties, like key-signing parties. (Partying 'cause you're human seems like a good enough reason.)

The reputation would be an asset of yours, a post-monetary currency.


Aah OK, so this is more about signing something in a way that proves you produced it and then saying "I swear I didn't use AI for this, and I'm willing to stake my cryptographic reputation on it".


Yeah. It gets you a Wikipedia effect where you know how to go about getting this information and consumers of art have certain set expectations.


I suspect the parent-poster means "put your fresh writing into a blockchain, so that you can offer that as evidence you wrote it--or at least knew about it--before it showed up later somewhere else."
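Mechanically, the commitment half of that idea is small: hash the draft, sign the digest, and publish the signature somewhere timestamped (a blockchain, a timestamping service, or even a notary; the venue is a separate choice). A hedged sketch, with a hypothetical file name and a freshly generated key standing in for a long-lived published one:

    # Hash the draft and sign the digest so you can later prove you held this
    # exact text at signing time. Where you publish the result is up to you.
    import hashlib
    from datetime import datetime, timezone

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    draft = open("article_draft.txt", "rb").read()   # hypothetical draft file
    digest = hashlib.sha256(draft).hexdigest()
    stamp = datetime.now(timezone.utc).isoformat()

    key = Ed25519PrivateKey.generate()               # in practice, a key you already publish
    signature = key.sign(f"{digest}|{stamp}".encode())

    print("sha256:", digest)
    print("signed at:", stamp)
    # Publish digest + timestamp + signature now; reveal the draft later to
    # prove you had it first.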


The AI detector business is the tech industry's moral analog of bicycle theft.

You know you're doing wrong, and doing harm—but you see a pitiful profit in it, so.


I've got this suspicion a lot of organizations are using these AI detector results to fire people only to replace them with AI generated content.


That article sounds like it was AI-generated…


If an AI is potentially putting you out of a job, maybe it's time to rethink your job.


AI-generated text is ever-present in low-stakes writing assignments. How do we detect every AI submission? Picture a high school student taking AP Literature, tasked with writing a 200-word essay on the key ideas in Shakespeare's King Lear. It is 2am.

You want to go to sleep.

Google Bard to the rescue, Prompt: "Write a 200 word essay on the key ideas in Shakespeare's King Lear". The result isn't great. But it might pass an English teacher reading it quickly.

Can these be detected?

Bard: "Key Ideas in Shakespeare's King Lear

William Shakespeare's King Lear is a complex and powerful tragedy that explores a number of key ideas, including the nature of authority, the importance of family relationships, and the fragility of human sanity.

Authority and Order

King Lear is a play about the disintegration of authority and order. Lear's decision to divide his kingdom among his daughters based on their professions of love leads to chaos and bloodshed. Lear himself is stripped of his power and authority, and he descends into madness. The play also explores the corrupting influence of power, as seen in the characters of Goneril, Regan, and Edmund.

Family Relationships

King Lear is also a play about the importance of family relationships. Lear's rejection of his youngest daughter, Cordelia, who is the only one who truly loves him, is a central tragedy in the play. The play also explores the complex relationships between fathers and children, and between siblings.

Fragility of Human Sanity

King Lear is a play about the fragility of human sanity. Lear's descent into madness is one of the most moving and disturbing aspects of the play. The play also explores the relationship between madness and wisdom, as seen in the character of the Fool.

Other Key Ideas

Other key ideas in King Lear include:

Nihilism: The play presents a bleak vision of a world without meaning or justice.

Self-knowledge: Lear's journey to self-knowledge is a central theme in the play.

Old age: The play explores the challenges of old age and the loss of power and authority.

Fooling and madness: The Fool and Edgar's disguise as Poor Tom provide comic relief in the play, but they also serve to highlight the madness and chaos of the world around them.

Conclusion

King Lear is a complex and challenging play that explores a number of key ideas about the human condition. It is a tragedy that has resonated with audiences for centuries.

In addition to the key ideas discussed above, King Lear also explores a number of other themes, such as the nature of love, the importance of loyalty, and the power of forgiveness. The play is a rich and complex tapestry that continues to be studied and debated by scholars and audiences alike."


If your work looks so much like generated content that it’s flagging detectors, then it’s also probably the work doomed to be replaced by generated content soon enough (if not already).

I don’t think current content generators will destroy many professions any time soon, but a lot of high-volume/low-value work is going to get consumed by them. This seems like an example of that work.

Hopefully, this guy didn’t expect to keep writing fluff into retirement and can find new footing with more sophisticated writing or pick up a new career with better opportunities.


> pick up a new career

It sounds so easy put like that. Hmm.. what do I have to do today? Get some mustard at the grocery store, pick up a new career, drop the dog off at the vet.


That something can be plainly put doesn’t imply that it’s easy.

We can have sympathy for the guy and want to support him while also recognizing the practicality that he’s been walking down a dead end road and needs to change course.

Even without LLMs, fluff writing is ripe for outsourcing to second-language writers in cheaper markets, and it gets more efficient as those markets optimize for it. It can be a job for a while, but it's just not a viable career in the long run, and opportunities were already narrowing.


LLM-created writing isn't obviously worse than the typical pennies-a-word stuff churned out for content farms in a lot of cases--especially if the person driving the LLM has some domain knowledge about the topic and can catch any obvious errors.

Similarly, SEO often demands some graphic/any graphic to go with a story/blog/etc. and GenerativeAI is perfectly able to provide that--though lower-end royalty-free stock is pretty cheap in any case.


Yeah lol it sounds ridiculous. But playing devil's advocate: Perhaps a different form of writing would be a more sustainable career.


> If your work looks so much like generated content that it’s flagging detectors, then it’s also probably the work doomed to be replaced by generated content soon enough (if not already).

You assume that the detector actually works as advertised. Without seeing the author's content, we have no way to judge in this case. However, seeing that even OpenAI struggles with this, I would question whether this company has somehow found the holy grail.

In this case if you get wrongly classified you have no way to correct it. This thing might even make your writing worse because now you have to write it in a fashion that does not trigger the detector.

> I don’t think current content generators will destroy many professions any time soon, but a lot of high-volume/low-value work is going to get consumed by them.

In this case you have to compete against the generated AI content and fight against the black-box AI detectors as well. You can improve the quality to fight the first. But how do you fight the second? Do I have to get a subscription to every single AI detector on the planet now?


"better calculators will destroy humanity" seems like a ridiculous statement

but that's what new AI tools that can create good art/music/writing does. They boil down and automate the middle process of calculation.

what was previously "creative" is revealed as calculation from the God/Universe pov and as the range of calculable things grows, the island of im-a-real-human-bc-creativity gets smaller and smaller

defined this way... creativity ends up including some interesting domains like capital allocation and hedge fund management.

these allow for "expressions of freedom" unconstrained by calculation


What a bleak way to think about creative work. AI is just brute forcing it by looking at millions of training examples. Talking about it like it knows some fundamental truth about creativity misunderstands what it does. At best it's an approximation and at worst it removes what was special from the inputs.

> creativity ends up including some interesting domains like capital allocation and hedge fund management.

I hate this website sometimes.


You're projecting a bit I think... the comment is making a case for the dehumanizing aspect of AI when it blurs the line between calculation and creativity



