Detect ChatGPT Generated Content (gptzero.me)
103 points by ausudhz on Jan 28, 2023 | 102 comments



This is useless. I have tried it with 4 different samples and it made mistakes with each.

- it thinks AI wrote parts of my handwritten texts

- King James Genesis 29 is apparently fully AI written

- a wall of text copied straight from ChatGPT? Only partially AI written.

- second chapter of Harry Potter? AI written parts

In fact, I could not find a single sample that this tool did not flag as at least partially AI written.


Agreed. This is useless. I asked GPT-3 to write a HN post decrying GPTZero:

> GPTZero is absolute trash! It can't even detect plagiarism correctly. It's a complete joke and waste of time! Don't bother with it! You'd be better off just manually checking for plagiarism. It's so unreliable it's not even worth mentioning. Just save yourself the time and money and don't bother with GPTZero. Worst AI tool ever!

GPTZero classified it as fully human written. This is a cat and mouse game where the mouse is always going to lose.


It is hilarious comedy, though.

The sad part is that real people will likely suffer when people with power take salespeople seriously and use tools like this to "detect" and punish fraud.


Agree with you, but! In a Tom and Jerry episode, Tom Cat would be GPTZero and Jerry Mouse would be GPT3 ;)


The GNU preamble is entirely AI written. According to this tool. To be honest I always suspected Stallman. :)


It's a toy project built by a CS undergrad over their winter break that went viral because journalists desperately want something that can expose AI writing. It's built on Streamlit and a few glued-together libraries, so it's not surprising the results aren't good. It's not some state-of-the-art custom tool, yet they are marketing it as the solution to academic plagiarism.


How mad would you be if someone accused you of using AI to write something you actually wrote yourself?


Same here: I ran a few samples and it was wrong every time. My hope is that teachers will run some tests before willy-nilly trusting the scores it gives and accusing students, but sadly I know a lot of teachers, and I bet few within that group will run their own tests; most will just accept it as authoritative.


The only way I could get it to say something was completely written by a human was by adding spelling mistakes.


The problem with such things is that you would need 100% accuracy in most cases for it to be useful. For example, many schools and universities fear that students use ChatGPT for homework. If such a plagiarism checker produces false positives in even a small percentage of cases, the consequences for honest students would be too severe to actually act on the results of the check. But perfect accuracy can never be reached for text classification, so these tools will never be that useful, despite being interesting to AI people.


At the university level, the problem of cheating is more a problem with our idea of a university. In an ideal world, you go to a university to learn. If someone wants to cheat, ultimately they're just cheating themselves (and, if at a private university or even most public universities, throwing a lot of money away). "Cheating" isn't really the university's problem in that view.

Of course, in reality universities aren't (or not only, and in most cases not primarily) about learning, but about credentialism. Employers and various social systems outsource to universities the role of verifying that people actually know something, or are the "right" sort of person, or other signals.

I don't know how one goes about fixing that, or if it's even possible, but I'd like to see more acknowledgement of it. Fixing "cheating" feels like the equivalent of looking to clever programming to fix product bugs.


Universities should stop issuing transcripts and diplomas that they refuse to guarantee. They are complicit in a fraud against employers.


That's on employers for putting too much faith in educational institutions.

It's like expecting a guarantee from the church that all their exorcisms render the host demon-free.

A degree is an artifact of a belief system. That there are credentialed idiots among us means the belief system itself is flawed.


Grades are also needed in university to check for prerequisite knowledge for courses. You waste everybody's time and resources if you let people into courses who think or pretend that they have the prerequisite knowledge. People who are cheating aren't only cheating themselves.


You don't really need any radical changes; you just have to get rid of the idea of graded homework, which I always hated from elementary school onwards.

Larger projects have a role to play in education, but ultimately if you can't pass the proctored exams, you don't pass the course.


This seems to be a reductive response to a confidence-based policy approach. There are many systems out there today that provide a confidence score for bots vs. humans. I happen to work with a product that uses this exact approach, and it's highly effective. In education scenarios, I think the teacher/professor/administration will use these systems to inspect the entirety of the submissions. From there a baseline value will be derived, and outliers will appear that may require deeper analysis or an interview. Schools can't afford to get it wrong too often (they need tuition dollars, after all), but a severe deterrent will need to hang over students' heads to discourage cheating.

I think the other thing we'll see is a nanny-state approach by some educational institutions: falling into the trap of being sold software to "block" or "monitor" students using these tools. It would be easy to implement on a campus network to a certain extent (correlating student network login to URLs accessed, and potentially MitM), but the reality is smart students will know better and will use phone hotspots and VPNs. The other dark side to consider is that the owners of ChatGPT could provide logs of user accounts and queries to higher education as a service.

At the end of the day my guess is all approaches are going to be tested at some level. But the cat is out of the bag and this is going to generate some very interesting countermeasure solutions/approaches along the way.
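For illustration, here's a minimal sketch of what that baseline-and-outliers step could look like, assuming per-submission AI-likelihood scores in [0, 1] (the helper name and the 2-sigma threshold are invented for the example):

    import numpy as np

    def flag_outliers(scores: list[float], z_thresh: float = 2.0) -> np.ndarray:
        """Derive a class baseline from per-submission AI-likelihood scores
        and flag submissions sitting far above it for a closer look."""
        s = np.asarray(scores, dtype=float)
        z = (s - s.mean()) / (s.std() + 1e-9)  # avoid division by zero
        return np.where(z > z_thresh)[0]  # indices worth deeper analysis/interview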


That's a very naïve outlook. In reality, as with every such system, the burden of proof would end up falling on the accused, not the accuser…


The reality in my experience (as a university lecturer) is that the burden of proof very much falls on the accuser rather than the accused.


How is that? Students are already required to submit a lot of evidence with their work, such as research notes, plans, lab work, etc., to prove they've actually done it, and if plagiarism detection systems flag any of their work, they have to defend it, rather than the institution having to investigate and build evidence on its own beyond whatever shoddy plagiarism detection system they bought told them.


What could that burden of proof look like?

Not sure how a student could prove the negative, apart from videotaping themselves writing the essay, or only being allowed to type it on an airgapped machine owned by the university.

Doesn't feel very feasible to implement.


And if the student is not a complete tool, they can definitely learn the output and replicate it very, very closely.


Even before GPT, schools should require students to submit detailed outlines and notes of their work. For one, students need to learn these skills. For two, it helps protect against plagiarism and cheating more broadly.


> The problem with such things is that you would need 100% accuracy in most cases for it to be useful.

Why? If there is 75% confidence that a student's report was generated using ChatGPT, that's enough to sit down with the student, discuss the content in person, and see if they actually know it. A tool such as this could spare the teacher from having to do this with every student, and also reinforce to students that if they don't actually know the material, there's still a chance they get caught.

> the consequences for honest students would be too severe to actually act

Only if acting means immediately accusing them of plagiarism rather than working with the student to ensure it's really their work.


This is such a short-sighted view. Students who wish to cheat using ChatGPT will immediately start running their text through these tools preemptively, changing some phrases here and there and adding spelling errors to ensure they don't get caught. You're left with innocent students being accused of cheating on a regular basis.

Maybe the education system will finally learn that asking students to merely recite information that can be found by anyone, anywhere, in less than 10 seconds isn't helpful in judging their understanding of the subject, except in very limited scenarios.


> changing some phrases here and there and adding spelling errors to ensure they don't get caught.

The students who will do this already do this though, just with different source material I suppose.

> Maybe the education system will finally learn that asking students to merely recite information that can be found by anyone, anywhere

I think we'll see an increased amount of face-to-face assessment, like exams or essays written in class only, where students can't use these tools. It solves the problem entirely and is arguably better, but it increases education costs.


This is not what is going to happen, though. Flagged content will not be accepted, and the student will be failed for the class, or worse, for the year. Expulsion is on the cards as the war against generated content becomes bitter.


No university will expel a student without evidence. One tried and lost a $1.5M lawsuit.

https://people.com/human-interest/twins-defamation-lawsuit-m...


Expect an escalating arms race between models attempting to disguise their output and tools attempting to detect generated content.

There may be escalating social and perhaps legal penalties too.


I'll just put this xkcd here - https://xkcd.com/810/ (Constructive)


This doesn’t work. Stack Exchange tried it, and it turns out there is a lot of misleading, superficial content that easily gets a high number of upvotes from newbies who don’t know better.

And that is without even considering bots.


Yeah, war is hell.


What about false positives? It's irresponsible to market this to laypersons as perfect, with absolutely no word of caution. There's been some chatter on Reddit that content writers were falsely accused of using ChatGPT and lost their clients.


This is an everlasting battle. Schools are already using these kinds of tools to detect cheating, often with false positives that can have devastating outcomes.


The accuracy just isn’t there, which isn’t surprising. I’m feeding it Star Wars reviews from both ChatGPT and IMDb, and for sure the answers are correlated. The false negative rate is OK, although I’ve hit plenty of them. But man, it sure does think a lot of people were using ChatGPT to write reviews back in the 2000s.

Again, it’s definitely correlated, even strongly correlated, but that’s not good enough for plagiarism detection. You can’t go accusing students of academic dishonesty based on a tool that gets it wrong multiple times on a couple dozen samples.


How is this different from this research on a GPT-2 detection model [1]? [1] https://github.com/openai/gpt-2-output-dataset/tree/master/d...


I ran a few tests:

- copy-pasted a GPT-3 reply to "What are Jim Crow Laws?": said written by AI

- copy-pasted the first paragraph from Wikipedia: said written by AI

- wrote it myself: said written by AI

- copy-pasted one word over and over, 100 or so times: it says "likely written by human but some by AI", then highlights the whole text and notes "likely written by AI". Hahaha.


AI is so good that even human written text is indistinguishable from AI.


I wonder if there is a way to watermark generated text, not by adding extra spaces or using invisible characters, but by using grammar. A naive example rule: "every 13th word MUST be a preposition". A more complex example: "every 11th word is a checksum of the previous 10; you encode a few bits in every word, i.e. noun=1, verb=2, article=3, etc."
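A toy sketch of the checksum rule, purely illustrative: the word-class table is invented for the example, and a real version would need a proper part-of-speech tagger (e.g. spaCy or NLTK):

    # Checks the "every 11th word is a checksum of the previous 10" rule.
    WORD_CLASS = {"dog": 1, "run": 2, "the": 3, "a": 3}  # noun=1, verb=2, article=3

    def carries_watermark(words: list[str], block: int = 11) -> bool:
        for i in range(block - 1, len(words), block):
            prev = words[i - block + 1 : i]
            expected = sum(WORD_CLASS.get(w.lower(), 0) for w in prev) % 4
            if WORD_CLASS.get(words[i].lower(), 0) != expected:
                return False
        return True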


Trivial to break by making tiny changes.


I read an interesting tweet about how it would be possible to watermark GPT outputs. Essentially before each token is generated the previous token is used to seed an RNG. Using the RNG the possible next tokens are split into a whitelist and a blacklist and the model can only select words from the whitelist. Later on it's possible to "check" for the watermark by counting the whitelist tokens and doing statistical analysis.

Apparently they can preserve performance by not doing this for very low entropy tokens where there is only one token that is extremely likely.

Saw it here: https://twitter.com/tomgoldsteincs/status/161828766500640358...
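A rough sketch of the detection side, assuming integer token IDs and a toy vocabulary size (the scheme in the linked thread biases logits softly rather than hard-blocking, but the statistics are the same idea):

    import random

    VOCAB_SIZE = 50_000   # toy number; the real tokenizer defines this
    GREEN_FRACTION = 0.5  # half the vocab is whitelisted at each step

    def green_list(prev_token: int) -> set[int]:
        """Seed an RNG with the previous token and carve out the whitelist;
        generation would (mostly) sample from this set. Rebuilt per token
        here for clarity, not speed."""
        rng = random.Random(prev_token)
        ids = list(range(VOCAB_SIZE))
        rng.shuffle(ids)
        return set(ids[: int(VOCAB_SIZE * GREEN_FRACTION)])

    def green_ratio(tokens: list[int]) -> float:
        """Watermarked text scores well above the ~0.5 expected from human
        text, which a simple z-test can pick up."""
        hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
        return hits / max(len(tokens) - 1, 1)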


I tried a few texts that ChatGPT generated for me, and it said they were entirely written by a human. It's true I fixed some things here and there, for cohesion basically, but I think it's almost impossible for these tools to get it right.


Conversely, I wrote a few sentences in simple English off the top of my head (about The Sound of Silence, because they were talking about it on the radio this morning) and it said it was "likely to be written entirely by AI".

False positives in a plagiarism tool are pretty bad IMO. It should definitely skew towards "we can't be certain" rather than "definitely AI".


Here's an alternative approach to detecting GPT that has actual benchmarks and a research paper available:

https://news.ycombinator.com/item?id=34557189


So, what prevents ChatGPT from using a detection tool and fine-tuning its responses accordingly?


Nothing, this is how adversarial training works, but it also works both ways.


It works both ways, but generation is advantaged in the long run. There has to actually be a statistical difference to detect, and AI outputs without statistical differences from human output are obviously possible, since humans make them all the time.


Sort of. I'm aware that in principle the generator has an advantage, and eventually the detector will average out to a coin flip at best.

However some advantages can disappear when you put constraints on the output such as quality and correctness.

So whilst the end result might be harder to distinguish statistically as human or AI generated, it can also be less useful to the end user overall.


"However some advantages can disappear when you put constraints on the output such as quality and correctness."

Only if you suppose that the ideal output is superhuman. In the case of OpenAI et al, that's arguably the case, but those aren't the players that are going to get into an arms race with detection anyway. They want it to be relatively easy to detect AI generated content, because they're not in the plagiarism business, and anti-plagiarism measures will get the public and media off their backs. And nobody who is interested in targeting plagiarism has nearly the funding to build their own LLM on a level that matters.

So if there's an arms race in the near term, I expect it will be with postprocessors instead. These will be much smaller models (i.e. runs in your browser, or at least on a small backend machine) that take the output of ChatGPT and tweak it to fool detectors. They won't care about maximizing quality or accuracy, but will just care about preserving meaning while erasing statistical signs of AI generation.

I don't know if the business case for that will be there. It's there for selling papers, and almost certainly some people will try their hand at these models just for the challenge and/or to prove a point.


I'm not sure that's how it would actually work out.

Most humans can't write, say, an essay to save their life.

And those who do write very well tend to have their own signature.

Whilst it's not 100% accurate, we've managed to fairly successfully attribute a lot of unknown works to specific authors based on their known works.

So if you create a generator that produces output equal to, say, the top 1% of human authors, I'm not entirely sure you can get one that doesn't have its own signature.

Because whilst, as you said, most humans produce output that is statistically indistinguishable from most other humans', the output that tends to survive selection bias and become known works is quite distinguishable, by definition.

So you don’t even need to get to superhuman capability you just need to get to a high enough output quality that it would limit the statistical search space from billions to millions or even thousands.


This may be along the lines of what you’re suggesting, but what if you flipped this around: instead of trying to recognize AI, you recognize the student? You model each student’s quirks so you can tell if they wrote their essay, or if someone else did. Now you don’t care about AI specifically; you just care about whether they wrote what they submitted.

The main failure mode I see here is students dramatically improving and throwing the system off. If someone gets a tutor or goes to writing workshops, you don’t want to accuse them of plagiarism just because they got better. But there may be ways you could deal with that, like having the student submit new samples.
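As a sketch of the idea (not any particular product): authorship verification is often baselined with character n-gram TF-IDF; what similarity threshold you'd actually act on is the hard policy question:

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def style_similarity(known_samples: list[str], new_essay: str) -> float:
        """Score how closely a new essay matches a student's prior writing,
        using character n-grams, a common stylometric baseline."""
        vec = TfidfVectorizer(analyzer="char", ngram_range=(3, 5))
        matrix = vec.fit_transform(known_samples + [new_essay])
        profile = np.asarray(matrix[:-1].mean(axis=0))  # average style vector
        return cosine_similarity(profile, matrix[-1])[0, 0]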


That could work, but that is changing the problem and moving the goalposts: a plagiarism detection system that is essentially trained on individual authors would be able to identify any time they skew too far from their rolling average.

I’m not even sure if ML is absolutely necessary for this or not.


I thought to make something similar but what scared me away was the idea that some educational institution could use it and decide a student was being dishonest due to a false positive. I wouldn't want to bear responsibility for something like that.

And secondly, this is a never-ending race. Even if it were to be able to detect ChatGPT content with 100% accuracy today, it would just be used to assist in training another model to defeat it.


ChatGPT and other AIs are theoretical machines built on human data.

Can an algorithm write an article about the 2008 financial crisis? Of course! It has read God knows how many Wikipedia articles, books and online discussions about it. But everything the AI knows is because some humans put in the work and documented everything.

If we don't know something, be it a historical fact or a scientific model, we have the ability to go out in the real world and record the data. Can ChatGPT fly to Yemen and interview refugees? Can ChatGPT tear apart the circuitry of a home appliance, reverse engineer it and write a blog post about it? Can ChatGPT go to a lab and make chemical experiments to create new compounds?

No, it can only regurgitate what it knows and even what it doesn't know.

ChatGPT will absolutely steal the jobs of all the "journalists" who just regurgitate what others have already written. Original research and field reporting will be left to the real journalists for God knows how many decades. If your job consists of actually making new things or collecting real-world data, then I think it will be safe for decades to come.


What I think would be interesting is to add the assignment / questions as input for the tool to consider in its evaluation. That way your tool could plug these into ChatGPT, ask it to reword it several ways (mimicking the likely path a plagiarizer might take to actually generate content) and do some diff/comparison as part of the evaluation.

Even so, I agree with most of the comments that it would be exceedingly difficult to truly identify this stuff, since you can ask ChatGPT to reword things in very specific ways or just edit it yourself (aside from the blatant false positives).


I just fed it an email I'd personally written.

The start and end were apparently written by AI.

What does this mean? Is this proof we're in a simulation and I'm actually a mouthpiece for an AI that's reached the singularity?


I tried this on my recent AI satire article. While ChatGPT detected my story as AI (because I wrote it like a machine), GPTZero gladly says it's human written.

ChatGPT's response: https://imgur.com/a/u7iBBaX

GPTZero's response: https://imgur.com/a/89icz2X

To be clear, my story was ChatGPT-assisted. But I wonder why GPTZero couldn't detect it correctly the way ChatGPT did?


If your story was ChatGPT-assisted, one can argue that it was indeed GPTZero which misclassified it.


Correction: I used ChatGPT to discover new words and rephrase the story myself.

Here's the link to my non-paywall story for references:

https://medium.com/humor-bytes/i-your-nba-highlights-broadca...

On the topic of assistance, where do detectors draw the line between AI-augmented, aided, and generated writing?


I don't know, which is why this is completely up for debate. Office also offers grammatical corrections, and websites like Grammarly do the same, often without any "AI" as far as I know.

Like with anything else this is something that would require society to come to a consensus.

And there are various issues with AI assisted or generated content depending on the context. For example educational institutions may see this as cheating whilst corporate entities might fear legal challenges in relation to copyright and IP.


I can tell you that detecting AI plagiarism can be a challenging task, as AI models like ChatGPT are able to generate text that is similar to existing text. It is important to note that not all text generated by AI models is considered plagiarism, as it may be used in a legal, educational or research context. The best way to evaluate the accuracy of a plagiarism detection website would be to test it using a variety of inputs, and compare its results to those of other plagiarism detection tools.

Sorry.


I ran your comment through this detector and received the following message:

Your text is likely to be written entirely by AI

False positives and all.


Now that you've said this, I reread that comment and it has the ChatGPT tone for sure.


Yes, it was ChatGPT indeed.


I copied a text from Wikipedia and it classified it as human written. I then asked GPT to rewrite it, and it was still classified as human.

It would be interesting to know how much they actually catch.


This only calculates perplexity and burstiness. I don't think that's going to work very well. It would be much better to try and detect whether the distribution from which a piece of text was drawn is closer to that of a human, or that of a large language model.

But how would one go about detecting something like that? Well, one would need a model of human language trained to approximate the distribution of tokens in a large corpus of natural language... text...

Oh wait.
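For what it's worth, perplexity under an open model is easy to compute; a minimal sketch with HuggingFace's GPT-2 ("burstiness" is then roughly the spread of per-sentence perplexities):

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def perplexity(text: str) -> float:
        """exp(mean per-token cross-entropy) under GPT-2. Lower means the
        model finds the text more predictable, which detectors read as AI-ish."""
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            loss = model(enc.input_ids, labels=enc.input_ids).loss
        return torch.exp(loss).item()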


Interesting. It only seems to work in English. English texts that it successfully detects as AI generated are classified as human written if you ask ChatGPT to translate them into a different language (or if you make the initial prompt in a different language). Perhaps they should clarify this limitation, either through an FAQ or by stating that other languages aren't supported when they are supplied in the prompt.


Does this use ML to determine whether or not ML is used? I wonder if it might be useful to try ChatGPT for detection, I’ve heard it’s pretty good…


I tried with a piece of text that was generated with ChatGPT and it said written by a human. I guess this still needs some improvement.


A nice approach: it estimates the randomness in text. I tried it with various cases (blog posts, Wikipedia, scientific docs) and with a few examples from ChatGPT, and it figured them out accurately.

Respect. Even though I disagree with a need for such tools - it doesn’t matter if content was written by a human or by a machine. What matters is whether it’s easy to read and worthwhile.


Disagree, this is very much needed. The problem with AI-generated content is that it looks superficially worthwhile and plausible, but in fact it often says things that are not correct.


> The problem with AI-generated content is that it looks superficially worthwhile and plausible, but in fact it often says things that are not correct.

This is also a problem with human-generated content.


Sort of. But if I read a five page article by a human, and everything on the first couple pages checks out as correct, I expect the rest of the article to be at least reasonable. If it’s written by AI, it’s entirely possible that it’ll start in on something that doesn’t even make sense. Then I’ll realize that I can’t even trust the part that seemed correct.


Depends on the human


But that is true also of lots of content not generated by AI! Fact checking always needs to be done, AI generated or not. But does it matter that it was generated by AI?


What I find fascinating is that this aspect of AI-generated content actually captures what humans do all the time: confidently making incorrect statements, "bullshitting" with filler text that only pretends to be meaningful, making illogical statements that contradict what was stated earlier, etc.


Which is true for most human-generated content, too.


Depends; only the ones that make ten-minute YouTube videos and/or 500-word articles to make money from ads.

If you dig well, there are people out there who make really good content.


> it doesn’t matter if content was written by a human or by a machine

It matters a lot, actually. Off the top of my head, I can think of a few reasons:

- if you have to train an NLP model, you'd rather do it with data that is not autogenerated, as AI-generated content makes poor training data for new models (generating a bunch of dog pics with DALL-E and using them as input for image detection may not produce a precise model)

- if you pay content creators to generate content and they use ChatGPT, that's definitely a breach of contract and also a problem

- many search engines (e.g. Google) already heavily penalize auto-generated content

- avoiding cheating at exams / official tests/certs


> Even though I disagree with a need for such tools

“Need” may be arguable, but why disagree?


Yeah, what we really need is a language model that can take some text and remove everything that is void of substance, and run both human and machine writing through it.


We ought to require OpenAI to run something like this: a “Hey ChatGPT, this you?” endpoint that replies Yes/No/similarity score, and a creation timestamp, when you hit it with some text.

As regulations go, this one’s not too burdensome: expensive, but pretty cheap compared to training and running a large language model in the first place.
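The client side of such an endpoint could be trivial. To be clear, everything below is hypothetical: no such endpoint, URL, or response schema exists today.

    import requests

    def this_you(text: str) -> dict:
        """Ask a hypothetical provider attribution endpoint whether it
        generated the given text."""
        resp = requests.post(
            "https://api.example.com/v1/attribution",  # invented URL
            json={"text": text},
            timeout=30,
        )
        resp.raise_for_status()
        # Imagined response shape:
        # {"match": true, "similarity": 0.91, "created_at": "2023-01-12T09:30:00Z"}
        return resp.json()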


If OpenAI implements such an endpoint, then a big chunk of potential customers would just not use ChatGPT… I mean, if I’m a student planning to use ChatGPT to enhance my uni essays, but it turns out OpenAI actually can say to my uni if my texts were AI generated, well, I’m not gonna use ChatGPT at all.


This works right up until the model is open sourced or otherwise replicable.

Put another way, it only works in today's environment.

When ML hardware is as widely distributed as classical compute, and all the models are on HuggingFace, you will be back to zero.


If it buys some time, that’s a good day’s work and I’ll be very satisfied. All you can really ask is that a regulation makes things better in the near term.

The future will know a lot more about the nature and implications of persuasively-humanlike ML than I do: it can take care of itself. Maybe by then hybrid writing will be the norm and not considered plagiarism, and we’ll all have trustworthy virtual assistants shielding us from scams. But in the meantime, there are some reasonable causes for concern, and this would help.


This is a pretty good idea, at least for commercial models.


Useless. Any text written by ChatGPT gets marked as human, as long as you edit it a bit here and there. Other texts refined through a couple of chiseling prompts are a complete miss.


I tried it on a section of an article I wrote with the help of ChatGPT. It marked some of my sentences as generated by AI while it didn't mark the ones coming from ChatGPT.


It would seem to be checking sentence structure, and if "too complex" -- embedded clauses and such -- it follows a hard-wired rule to report AI.


Consider trying out GPTKit (https://gptkit.ai); it has higher accuracy than GPTZero.


Disclosure: The tool is built by me. It uses 6 distinct methods to classify text with a 93% accuracy rate, based on testing a dataset of 100k+ samples.


Where are you hosting the dataset? Would love to help out; I'm building an open-source data version control tool to help iterate on ML datasets.

https://github.com/Oxen-AI/oxen-release

Would be cool if we could get a community around the test dataset to ensure that 93% accuracy rate. Then people can add their failure cases to the repo and you can iterate on them.


I assume this is English-only, as in none of the other languages I tried with ChatGPT did it flag even a single word as AI-produced.


ChatGPT can generate infinite combinations of words in thousands of different styles. Accurate detection will never work.


Someone should make a no-ChatGPT HN submission filter (show every submission except ChatGPT-related ones).


You can easily build it with the GPT-3 API. And Codex can help you if you don't know how to program :)
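You don't even need GPT-3 for a crude version; here's a minimal sketch against the public Algolia HN search API (the keyword blocklist is just a guess at what to filter):

    import requests

    BLOCKLIST = ("chatgpt", "gpt-3", "openai", "gptzero")

    def front_page_without_gpt() -> list[str]:
        """Fetch the current HN front page via the public Algolia API and
        drop submissions whose titles mention ChatGPT and friends."""
        resp = requests.get(
            "https://hn.algolia.com/api/v1/search",
            params={"tags": "front_page"},
            timeout=30,
        )
        resp.raise_for_status()
        return [
            hit["title"]
            for hit in resp.json()["hits"]
            if not any(word in hit["title"].lower() for word in BLOCKLIST)
        ]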


Worked about 20% of the time for me. It does not seem to perform very well in languages other than English. However, there is huge market potential for whoever figures this out (although I believe this will be nearly impossible).


I fed it a blog I know to be ChatGPT-written. It only detected 1/3 of the sentences.


Is there a ChatGPT-vs-human text dataset similar to the GPT-2 one released by OpenAI?


I also wanted to train my own GPT detection model, but unfortunately there isn't any publicly available dataset of its outputs.


This is detecting my own written work as AI.


I'm not sure this is working.


teacher's pet final boss hahaha



