Hacker News
New AI classifier for indicating AI-written text (openai.com)
403 points by davidbarker on Jan 31, 2023 | 337 comments

It can't possibly work reliably. It's going to be very challenging for honest kids because almost everyone is going to be cheating.

The reality is that learning to think and write will be harder because of the ubiquity of text generation AI. This may be the last generation of kids where most are good at doing it on their own.

On the other hand, at least a few will be able to use this as an instant feedback mechanism or personal tutor, so the potential for some carefully supervised students to learn faster is there.

And it should increase the quality of writing overall if people start taking advantage of these tools. It's going to fairly quickly become somewhat like using a calculator.

Actually it probably means that informal text will really stand out more.

I am giving it the ability to do simple tasks given commands like "!!create filename file-content", etc.

It's actually now very important for kids to adapt quickly and learn how to take advantage of these tools if they are going to be able to find jobs or just adapt in general even if they don't have jobs. It actually is starting to look like everyone is either an entrepreneur or just unemployed.

Learning about all the ways to use these tools and the ones coming up in the next few years could be quite critical for children's education.

There are always going to be luddites of course. But it's looking like ChatGPT etc. are going to be the least of our problems. It is not hard to imagine that within twenty years or so, anyone without a high bandwidth connection to an advanced AI will be essentially irrelevant because their effective IQ will be less than half of those who are plugged in.

Schools are going to have to be reverse format: watch lectures at home, do homework in class.

> Schools are going to have to be reverse format: watch lectures at home, do homework in class

This model (“flipped classroom”) has been recommended by many for quite a long time (at least since internet video delivery for the purpose became practical, but ISTR the first suggestions actually predate even that) and the reasons just seem to keep growing.

Thanks, I was trying to remember what this was called!


I'm a huge supporter of this model. I used to think I was terrible at math. Once Khan Academy and other online resources were available, it all started to click for me. Turns out my problem was that I could never keep up with the lectures. Being able to pause and replay a lecture and then ask for help when I got stuck in my homework would have been amazing.

Many of my college classes are (and even high school classes! though I was coming out of HS in the tail end of covid) like this.

High school: Watch ~1hr lecture every night for next day's class, in which we did worksheets and the teacher would work through problems, but we were generally expected to have learned the material (teacher would not re-teach it).

College: Last semester I had a Data Structures class with a similar structure, but with 2-3 lectures a week. Same idea: we worked on practice sheets in class and had far more time to ask questions instead of having the entire concept taught to us. I much preferred this because I already had a very solid understanding of some of the concepts covered, so not wasting class time on them was great.

That approach to education is quite common. There’s been more of a push over the past decade or so under the “flipped classroom” moniker, but it’s the same principle as reading at home, then working with that knowledge in a group where different ways of thinking may be discussed.

I currently have a class where the professor does a hybrid of that. Instead of a four-and-a-quarter-hour class, he makes a one-hour video the day before with the rote part of the lecture and tries to address anything that came up among students since the previous week's class. Then we have a three-and-a-quarter-hour class which is mostly work, with a minimal lecture to warm everybody up. While the medium is largely technical, it is a studio art class, so it was a bit different from most classes anyway.

The nearby university already does this.

It... It sometimes works. It depends a LOT on the instructor.

Should also note: it only really works when all your lecturers share some information about workloads...

No, schools (and employers, and scientists, and everyone else) are going to have to come to terms with the fact that they will rapidly become irrelevant, and cannot possibly hope to continue operating in anything resembling their traditional role.

Your position is that humans should stop bothering with learning?

The primary purpose of schools is not "learning" but exercising power through gatekeeping and accreditation. For most disciplines, the Internet already contains learning resources that make the average school's curriculum look like utter garbage by comparison. We don't use schools for learning, we use them to decide who gets to do what in life.

Your positions are too confidently severe. For starters, self-directed education isn't effective for most young people. Some form of school needs to endure, remote or not. If you're going to make more dystopian leaps with AI as an infinite conceptual gap-filler, just remember that people need constructive socialization.

Why do you believe that is the purpose of schools?

Why does so much learning and research take place if these are not primary purposes of schools?

Why are schools so inefficient at teaching if it's their primary purpose? (E.g. students who've had no schooling can generally do one year of schooling before university, and then in university perform at the same level as people who were schooled for over ten years.)

Source on that?

And yet somehow kids go to school and learn stuff. How is that possible?

There are approximately zero economic indicators showing this.

"Economic indicators" are entirely irrelevant for that discussion. This isn't going to be a gradual change either. At some point, GPT-n is going to be available with super-human capabilities in all tasks that matter, and then it's simply over, from one day to the next. Nobody will continue paying people to do things that AI can do better, faster, and more reliably, at lower cost.

One day eventually we will have general human equivalent AI, at least for the vast majority of work tasks. Sure, but that's as much a premise for a science fiction story as a prediction about the future.

We are absolutely nowhere near even close to beginning to know how to even start building such a thing. Chat bots, language models and image generators are fun tools that look amazing to people who don't understand how they work, but they're extremely rudimentary compared to real intelligence.

I'll make a counter-prediction. All the low-hanging fruit in language model development has been picked. Like all technologies there's a steep part of the S-curve of development and that's where we are now, but you can't extrapolate that to infinity. We'll soon hit the top of the curve and it will level off, and the inherent limitations of these systems will become a severe obstacle to further major advances. They will become powerful, useful tools that may even be transformative in some activities, but they won't turn out to be a significant step towards general AI. An important step maybe, but not a tipping point.

No, that's not how jobs work. Comparative advantage means it's worth paying people to do a job even if you're better at it than they are, because you have more important things to do.

Hiring AIs to do something is extremely expensive. You're basically setting a warehouse of GPUs on fire.

Anyway, if it was true total factor productivity would be exploding, but it's actually kinda underperforming. (And automation almost always causes increased employment.)

Yeah but have you tried dealing with people? Just like horses replaced cars, for lots of reasons, an http endpoint, powered by AI, that could do the work of a person, would replace so many jobs. Or ATMs for example.

> Just like horses replaced cars, for lots of reasons

Other way round, right?

Humans (labor) are different from horses (capital) because 1. they actively participate in work, i.e., they don't just literally do what you tell them; 2. they actually signed up to work, whereas horses don't care; and 3. if you give them money, they'll also become your customers. Though I don't know if that last one is a major factor for employers, even if there is that Henry Ford anecdote.

ATMs are a good example here because there are more bank tellers now than before ATMs were invented. (see Jevons' paradox)

Lol oops yeah I wrote that backwards. Too late to edit now, ah well.

There are some jobs where labor needs to care. Most tech jobs, for example. But there are lots of jobs, especially temp ones, that are about throwing as many bodies as you can afford at a problem, and don't ask questions or try to do it smarter. So 1 is actually a detriment in those kinds of jobs.

To point 3, Henry Ford aside, if businesses really wanted employees to be able to afford their goods, they'd stop offshoring jobs!

ATMs put bank tellers out of work. There do happen to be more bank tellers now than before because there are more bank customers needing more bank services, but my bank only needs X tellers at a time vs. X+1 or 2 or 3, and they don't hire any for 24-hour service. It's a bit hard to see, because the number of tellers is higher now than before, but the question is how many more tellers would there be without said machines?

Maybe in training, but presumably you buy the trained AI service or whatever and run it on CPU. Even AWS machines with GPUs top out at roughly $25 an hour. That's the low end for a desk worker.

Admittedly my last comment is a bit wrong. Turns out ChatGPT is running on GPUs.

It is, but especially high memory ones, and presumably it's loaded on a bunch of them since it supports simultaneous queries.

True, but surely the cost of multiple machines is amortized by simultaneous queries?

You’re predicting something that has never happened before by invoking magic AI that can do any task, without specifying the real-world limitations that might come up.

> The reality is that learning to think and write will be harder because of the ubiquity of text generation AI. This may be the last generation of kids where most are good at doing it on their own.

I think the form will change, but the substance is going to be like it always has.

Those who give a damn, who want to and are able to improve, will be able to do it 100x by getting access to a new source of diversified ideas and viewpoints, variations on their own, and information, and by efficiently and more or less reliably delegating low-value stuff at a low marginal cost, freeing up bandwidth and increasing impact.

And the grifters who’re always looking for shortcuts and constantly try to game whatever system they are in, without any desire whatsoever to learn anything or grow in any way… will keep doing just that. And the only value they’ll be able to bring will be… access to a tool anyone (but, yes, not everyone) can access.

And that matters, because in the end many of the problems worth solving aren’t technical problems, but people problems.

Writing this, I realise I may have completely missed your point and gone way off topic here.

> it should increase the quality of writing overall if people start taking advantage of these tools

Perversely, it might also dramatically decrease reading, if there's no incentive for anyone to need to properly understand anything.

A pretty dire scenario :(

Maybe it'll force people to state their points clearly and avoid bullshitting, if no-one is willing to wade through unclear paragraphs. That might have a positive effect in the long run.

It's probably going to devalue memorizing facts and concepts in the same way the internet already has, although to a further extent. This could be really exciting as everyone now has a subject matter expert on hand to ask any question they could think of.

> It's probably going to devalue memorizing facts and concepts in the same way the internet already has [..]

OK, thought experiment: try to think of the brightest people you know/ever knew. Not the most successful, but the ones where after every interaction you were left thinking "wow, person X is really, really smart". I can think of a handful of people from over the decades, who worked in wildly different fields.

All of them were really good at recalling facts and at quickly grasping concepts. None of them ever needed anything explaining (or even saying!) twice.

Note: this isn't about people who excel at rote learning (like you might do for a school test), just there are people out there who are simply great at absorbing information.

ChatGPT isn't that, and my prediction is that it will never be that. People who are great at writing prompts for ChatGPT aren't going to be that, either.

>People who are great at writing prompts for ChatGPT aren't going to be that, either.

I don't see why this would be true. If you have evidence or a strong argument for why this would be true, I'd like to hear it.

As an anecdote, I work with young people (x<25) who don't seem to be worse at recalling facts than the older generation. The bright/geeky ones grew up reading Wikipedia and, in fact, have a wider grasp of facts/stories than the older generation (in my experience).

Subject matter experts are quick to point out the limitations and flaws of ChatGPT in their domain of expertise.

I expect in a few years we will have versions that are both more accurate, and can cite their answers / do real research.

I think that's unlikely. The architecture these things currently use doesn't allow for the possibility of that. We'd need to engineer systems with a very different architecture in order to do it. So yes maybe eventually, sure, but we're a long way from that.

You can search a knowledgebase using embeddings today and it works pretty well.
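For the curious, here is a minimal sketch of what embedding-based search looks like. The three-dimensional vectors and document names are invented stand-ins; a real system would get high-dimensional vectors from an embedding model, but the ranking logic is the same.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings; a real system would call an embedding model instead.
knowledgebase = {
    "ATMs and bank teller employment": [0.9, 0.1, 0.0],
    "Flipped classroom model":         [0.1, 0.8, 0.2],
    "GAN training dynamics":           [0.0, 0.2, 0.9],
}

def search(query_vec, kb, top_k=1):
    """Return the top_k documents whose embeddings are closest to the query."""
    ranked = sorted(kb.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

print(search([0.85, 0.15, 0.05], knowledgebase))  # closest to the ATM entry
```

The answer-generation step would then condition the language model on the retrieved documents, which is how "cite your sources" systems are typically wired up.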

Sorry, it is unclear if this comment is AI generated or not, I can't give you full credit for it.

(No seriously. "The classifier considers the text to be unclear if it is AI-generated.")

I just tried the op comment and got

> The classifier considers the text to be very unlikely AI-generated.

> The reality is that learning to think and write will be harder because of the ubiquity of text generation AI

is this just a baseless assertion or do you have something to back it up?

what I'm seeing is that a lot of kids will have a 1:1 coach for a lot of topics.

I know now it's not perfect, but this thing may be a few years away from having a likeable personality and fact checking what it says.

/This may be the last generation of kids where most are good at doing it on their own./

Ironically, students almost all suck at writing...

It's going to falsely label any recent innovation in language as generated, because the training set will skew human-written as old and GPT generated as new language.

I've defeated it already using basic prompt engineering.

Is prompt engineering what kids are calling typical hacking behavior these days?

This seems like a sort of unwinnable arms race. Can't the people who work on generative text models use this classifier as a feedback mechanism so that their output doesn't trigger it? I'm not an AI expert, but I believe this is even the core mechanism behind Generative Adversarial Networks.

Detectors can be a black box "pay $5 per detection" type service.

That way, you can't fire thousands of texts at it to retrain your generative net.

Plagiarism detectors in schools and universities work the same. In fact, some plagiarism detection companies now offer the same software to students to allow them to pay some money to pre-scan their classwork to see if it will be detected...

$5 is way too high of a price to use regularly. In any case, if it's only available to education institutions, teachers and grad students are poor enough to sell access to it to people on the dark web for the right price.

Make a model to detect cheating. Market it as "a custom built and unique model to detect cheating; able to catch cheating that other models miss!" It's all 100% true. Market and profit.

There's also always going to be more capital going towards building better generators than better detectors.

But detection is an easier problem, fundamentally. In fact, part of the novelty of ChatGPT is that it cannot be detected quite as easily.

Language Models produce a high probability sequence of words given history (or an approximation of it). This is the only paradigm that we know works for language synthesis.

What the creators of this page did is turn that on its head: they use exactly that reasoning to identify candidate passages as computer-generated, precisely because they have access to those probabilities. So it's not a viable approach to improving the language model directly.

With ChatGPT, however, we have two models working: a language model and a ranking model. The ranking model is trained to order the results of the language model to look better to humans. The suggested approach could be used to help fit the model by ranking lower-probability sequences higher, but this comes at the cost of increased computation time from generating many more sequences, and risks constructing incoherent output.
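A toy illustration of the detection idea described above: with access to the model's token probabilities, a detector can score how "typical" a passage is of the model. The bigram table below is a hand-built stand-in for a real language model's next-token distribution, and all numbers are invented for illustration.

```python
import math

# Hypothetical bigram probabilities standing in for a real language
# model's next-token distribution.
bigram_prob = {
    ("the", "cat"): 0.20, ("cat", "sat"): 0.30, ("sat", "down"): 0.25,
    ("the", "feline"): 0.01, ("feline", "reposed"): 0.002, ("reposed", "below"): 0.001,
}

def avg_log_prob(tokens, model, floor=1e-6):
    """Mean log-probability per token under the model. A higher score means
    the text is 'typical' of the model, a hint it may be machine-generated."""
    pairs = zip(tokens, tokens[1:])
    logps = [math.log(model.get(p, floor)) for p in pairs]
    return sum(logps) / len(logps)

typical = ["the", "cat", "sat", "down"]          # high-probability phrasing
unusual = ["the", "feline", "reposed", "below"]  # low-probability phrasing
print(avg_log_prob(typical, bigram_prob) > avg_log_prob(unusual, bigram_prob))  # True
```

Real detectors along these lines (perplexity-based scoring) work the same way, just with a full neural language model supplying the probabilities.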

> This is the only paradigm that we know works for language synthesis.

No, it's the easiest paradigm we know works for language synthesis. The other way to synthesize language is to understand what you're saying. This is "old-school" AI (we wouldn't even call it AI now), done with if statements, expert systems, and queries of a robust, structured data model. The bullshitting capabilities of neural networks have skyrocketed so far as to dwarf the "expert system" approach, but it's still there, slowly getting better, and still the right choice for many situations.

What I'm excited about is combining the capabilities of both. Right now there's a huge gap between the two.

Yup, an arms race indeed. With the companies involved selling to both sides, as in any good conflict... :|

You're right, that's the core mechanism of GANs. The current state-of-the-art models aren't using a GAN structure, but it's plausible that GAN-style models will achieve state-of-the-art numbers in the future.
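A deliberately simplified toy of that feedback loop (not a real GAN: "texts" are just numbers and the "detector" is a threshold), just to show how query access to a detector lets a generator tune itself until it evades detection:

```python
# Toy adversarial loop: if the generator can query the detector, it can
# adjust its output until the detector stops flagging it. Everything here
# is invented for illustration.

def detector(x):
    """Stand-in classifier: flags any value above 0.5 as 'AI-generated'."""
    return x > 0.5

def generator(step=0.1, start=1.0):
    """Nudge the candidate output until the detector no longer flags it."""
    candidate = start
    while detector(candidate):
        candidate -= step  # feedback from the detector drives the update
    return candidate

evasive = generator()
print(detector(evasive))  # False: the generator has learned to evade
```

In a real GAN the "nudge" is a gradient step and both models train jointly, but the payment-per-query detector services mentioned upthread exist precisely to make this loop expensive to run.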

I foresee a dystopian education outcome:

1. Classifiers like this are used to flag possible AI-generated text

2. Non-technical users (teachers) treat this like a 100% certainty

3. Students pay the price.

Especially with a true positive rate of only 26% and a false positive rate of 9%, this seems next to useless.
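To see why those rates are so weak, plug them into Bayes' rule. The 20% base rate of AI-written essays below is an arbitrary assumption for illustration; only the 26%/9% figures come from the announcement.

```python
# Bayes' rule on the classifier's published rates (26% true positive,
# 9% false positive), with an assumed 20% base rate of AI-written essays.
tpr, fpr, base = 0.26, 0.09, 0.20

p_flag = tpr * base + fpr * (1 - base)  # P(essay gets flagged)
p_ai_given_flag = tpr * base / p_flag   # P(AI-written | flagged)
print(round(p_ai_given_flag, 2))  # 0.42
```

So under that assumption, a flagged essay is still more likely to be human-written than AI-written, which is a dangerous property for a tool teachers may treat as certain.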

> 2. Non-technical users (teachers) treat this like a 100% certainty

This is the part that needs to be addressed the most. Teachers can't offload their critical reasoning to the computer. They should ask their students to write things in class and get a feeling for what those individual students are capable of. Then those that turn in essays written at 10x their normal writing level will be obvious, without the use of any automated cheat detectors.

I was once accused of cheating by a computer; my friend and I both turned in assignments that used do-while loops, which the computer thought was so statistically unlikely that we surely must have worked together on the assignment. But the explanation was straightforward: I had been evangelizing the aesthetic virtue of do-while loops to anybody who would listen to me, and my friend had been persuaded. Thankfully the professor understood this once he compared the two submissions himself and realized we didn't even use the do-while loop in the same part of the program. There was almost no similarity between the two submissions besides the statistically unlikely but completely innocuous use of do-while loops. It's a good thing my professor used common sense instead of blindly trusting the computer.

I think you're misunderstanding the primary purpose of essays.

Teachers don't have the time to do deep critical reasoning about each student's essay. An essay is only partially an evaluation tool.

The primary purpose of an essay is that the act of writing an essay teaches the student critical reasoning and structured thought. Essays would be an effective tool even if they weren't graded at all. Just writing them is most of the value. A big part of the reason they're graded at all is just to force students to actually write them.

The main problem with AI generated essays isn't that teachers will lose out on the ability to evaluate their students. It's that students won't do the work and learn the skills they get from doing the work itself.

It's like building a robot to do push ups for you. Not only does the teacher no longer know how many push ups you can do, you're no longer exercising your muscles.

>> The primary purpose of an essay is that the act of writing an essay teaches the student critical reasoning and structured thought. Essays would be an effective tool even if they weren't graded at all. Just writing them is most of the value. A big part of the reason they're graded at all is just to force students to actually write them.

That's our problem, I think. Education keeps failing to convince students of the need to be educated.

I think that students know they need to be educated, but they also know that grading/academic success, in the form of good grades and going to prestigious universities, matters more than actual knowledge in the real world. And the funny thing is that if you teach critical reasoning to someone, there's a good chance they will use that skill to realize that the grade of the essay matters more than the actual process of writing it.

I think companies face a similar problem when they try to introduce metrics to evaluate performance, either of individual employees or of whole parts of the company, and people start focusing on gaming those metrics instead of doing what's actually beneficial to the company. One reason for that is probably that it's really hard to evaluate what actually benefits the company, and what part you played in it.

Back to students: maybe writing that essay instead of asking GPT-3 is more beneficial in the long run, but on the other hand you're also learning to use a new technology that will keep getting better; then again, maybe you're not correctly learning the "value of hard work", etc. Evaluating what's good for you is very hard; focusing on a good grade is easier and has noticeable positive results. I think getting educated is very important, but I also think no one can know for certain whether learning to use AI is actually worse than doing stuff "yourself".

All in all, it's a very hard problem. It's trying to see the consequences of our own actions in very complex systems. And different people work differently. For example, when I use ChatGPT or Copilot, I end up spending more time overall working, and producing way more stuff even without counting what the AI "produced", because the back and forth between me and the AI is a more natural way of working for me. In the same vein, it's easier for me to write or even think by acting out a conversation. Maybe for some people it's the exact opposite and they need to be alone with their thoughts to be more productive.

Delaying gratification is hard for all of us. We're just primates doing the best we can with our limited wetware.

Seems like it would be fairly trivial to make a document writer that measures whether a human is doing the typing, such that the text is much more likely to have been written by a human sitting and thinking at a keyboard. We do this in ad fraud detection all the time, at scale, with much less willing participants.
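One sketch of what that might look like. The `(timestamp, characters_added)` event format is hypothetical; a real editor would log richer telemetry, but the core signal (text arriving in large instantaneous bursts versus keystroke by keystroke) is the same.

```python
# Minimal typing-provenance check: given (timestamp, chars_added) events
# from a hypothetical editor log, measure how much of the text arrived in
# big bursts (likely pasted) rather than typed.

def pasted_fraction(events, burst_chars=50):
    """Fraction of total characters that arrived in bursts of >= burst_chars."""
    total = sum(chars for _, chars in events)
    burst = sum(chars for _, chars in events if chars >= burst_chars)
    return burst / total if total else 0.0

typed  = [(t, 1) for t in range(300)]  # 300 single keystrokes over time
pasted = [(0.0, 5), (1.0, 900)]        # tiny edit, then one huge paste

print(pasted_fraction(typed))   # 0.0
print(pasted_fraction(pasted))  # ~0.99
```

A production system would also look at inter-keystroke timing distributions, as the ad-fraud comparison suggests, since paste events can be scripted to trickle in.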

The value of a degree is very clear.

The value of an education is much less clear.

I'm saying the students are probably right.

> It's like building a robot to do push ups for you. Not only does the teacher no longer know how many push ups you can do, you're no longer exercising your muscles.

While I already knew what you have described, I love this analogy, it's really spot on.

For this exact reason, I feel like education systems and curriculum providers (teachers are just point of contact from a requirements perspective) should develop much more complex essay prompts and invite students to use AI tools in crafting their responses.

Then it’s less about the predetermined structure (5 paragraphs) and limited set of acceptable reasoning (whatever is on the rubric), and more about using creative and critical thinking to form novel and interesting perspectives.

I feel like this is what a lot of universities and companies currently claim they want from HS and college grads.

This is what I'm doing as an instructor at some local colleges. A lot of the students are completely unaware of these tools, and I really want to make sure they have some sense of how things are changing (inasmuch as any of us can tell...)

So I invite them to use chatGPT or whatever they like to help generate ideas, think things out, or learn more. The caveat is that they have to submit their chat transcript along with the final product; they have to show their work.

I don't teach any high-stakes courses, so this won't work for everyone. But educators are deluded if they think anyone is served by pretending that (A) this doesn't/shouldn't exist, and that (B) this and its successors are going away.

All of this stuff is going to change so much. It might be a bigger deal than the Internet. Time will tell.

I like this technique. You could also take a ChatGPT essay and have the students rewrite it or analyze for style.

Or have a session on how to write the prompts to generate the good stuff. In the hands of a skilled liberal artist, the models produce amazing results.

Yes, the tool is powerful, but it still requires skills, knowledge, and an aesthetic voice.

A student can't go from zero to "much more complex essay prompts", though. Education has to go step by step. The truth is that humans start at a lower writing skill than ChatGPT. Before getting better than it, they need to first reach its level.

And then, there is the problem that those complex prompts might also become automatable when GPT-4 or GPT-5 is released.

>Teachers don't have the time to do deep critical reasoning about each student's essay.

Projection much? Who are you speaking for? What countries, what states?

It's difficult to dive in that deep into someone's essay in any case. That's the challenge, not the lacking quality of one's education system.

I read every student essay I grade twice. Small classes, admittedly, but this has always been my practice.

> ask their students to write things in class and get a feeling for what those individual students are capable of. Then those that turn in essays written at 10x their normal writing level will be obvious

I think that's a flawed approach. Plenty of people simply don't perform or think well under imposed time-limited situations. I believe I can write close to 10x better with 10x the time. To be clear, I don't mean writing more, or a longer essay, given more time. Personally, the hardest part of writing is distilling your thoughts down to the most succinct, cogent and engaging text.

> Plenty of people simply don't perform or think well under imposed time-limited situations

From first-hand experience, the difference between poor stress-related performance and a total lack of knowledge is night and day.

I have personally witnessed students who could not speak or understand the simplest English, and were unable to come up with two coherent sentences in a classroom situation, but turned in graduate level essays. The difference is blindingly obvious.

> I have personally witnessed students who could not speak or understand the simplest English, and were unable to come up with two coherent sentences in a classroom situation, but turned in graduate level essays. The difference is blindingly obvious.

Maybe someone helped them with their homework?

Unless their in-class performance increases as well, isn't that help "probably cheating"? (That's the "moral benchmark" I'd use, at least; if your collaboration resulted in you genuinely learning the material, it's probably not cheating.)

The point is for the teacher to get a sense of the student's style and capabilities. Even if your home essay is 10x better and 10x more concise than your in-class work, a good teacher who knows you, unlike an inference model, will be able to extrapolate and spot commonalities. A good teacher (who isn't overworked) will also talk to students and get a sense of their style and capabilities that way; this allows them to extrapolate even better than a computer could ever hope to.

Sure, but what about all the students with mediocre and/or overworked teachers? If our plan assumes the best-case scenario, we're going to have problems.

Honestly, if we can't have nice things and we keep skimping on education, I'd rather we just accept the fact that some students will cheat than introduce another subpar technical solution to a societal problem.

> blindly trusting the computer.

Professors blindly trust the computer not out of laziness, but to protect themselves from accusations of unfairness...

"The work was detected as plagiarism, but the professor overrode it for the pretty girl in class, but not for me"

Seems like something like this should only be used as a first-level filter. If the writing doesn't pass, it warrants more investigation. If no proof of plagiarism is found, then there's nothing else to do and the professor must pass the student.

with a 26% true positive rate that seems flawed.

I asked chatgpt to write an essay as if it were written by a mediocre 10th grader. It did a reasonably good job. It threw in a little bit of slang and wasn’t particularly formal.

Edit. I sometimes tell my students “if you’re going to cheat, don’t give yourself a perfect score, especially if you’ve failed the first exam. It fires off alarm bells.”

But the students who struggle usually can’t calibrate a non-suspicious performance.

I guess the same applies here.

You've touched upon a central issue that is not often addressed in these conversations. People who have difficulty comprehending and composing essays also struggle to work with repeated prompts in AI systems like ChatGPT to reach a solution. I've found in practice that when showing someone how prompting works, their understanding either clicks instantly, or they fail to grasp it at all. There appears to be very little in between.

seems like this is the future... 1. first day of class, write a N word essay and sign a release permitting this to be used to detect cheating. The essay topic is chosen at random.

2. digitize & feed to learning model, which detects that YOU are cheating.

upside: this also helps detect students who are getting help (e.g. parents)

downside: arms race as students feed their cheat-essays (memorize their essays?) into AI-detection models that are similarly trained.

The funniest implication here is that the student's writing skill isn't expected to improve.

I was just asking my partner who’s a writer if it would even be fair to train a model based on a student at Nth grade if the whole point is to measure growth. Would there be enough “stylistic tokens” developed in a young person’s writing style?

Personally, I feel mildly embarrassed when reading my essays from years prior. And I probably still count as a 'young person'.

That said, there's no need to consider changes in years when stylistic choices can change from one day to another depending on one's mood, recent thoughts, relationship with the teacher, etc.

That's why I've always been a little confused about how some (philologists?) treat certain ancient texts as not being written by some authors due to the text's style, as if ancient people could not significantly deviate from their usual style.

> first day of class, write a N word essay

Initially I thought you meant having the student write an essay about slurs, as the AI will refuse to output anything like that. Then I realized you meant "N" as in "Number of words".

Still, that first idea might actually work; make the students write about hotwiring cars or something that's controversial enough for the AI to ban it but not controversial enough that anybody will actually care.

> upside: this also helps detect students who are getting help (e.g. parents)

Downside: it also likely detects, without differentiation, students whose writing style undergoes a major jump because of learning, which is, you know, the actual thing you are trying to promote.

> first day of class, write a N word essay and sign a release permitting this to be used to detect cheating

Why once? Most students need writing skills more than they need half of the high-school curriculum.

There are also some countries that don't fetishize cheating this much so perhaps they will just continue not caring.

An arms race isn't really an issue: one way or another, you've managed to make your students work.

Programming is fortunately one of those subjects where there's something objectively close to a correct/optimal solution. A trivial example is that there aren't very many sane ways to write a "Hello world" program, but this seems to hold for more complex tasks too. In fact, in my experience, the ones who cheat and get it wrong are the most obvious.

Unfortunately, the software industry also has plenty of literal tools who are far too trusting of what the computer says (or authority in general, but that's another rant...)

I once got called up because my work was flagged as 100% copied. I had uploaded it, made a mistake, so I deleted it and uploaded a new file. The second file was flagged as copied. I was able to explain it by pointing at the screen that claimed I had plagiarized my own name.

So the computer’s evaluation model assumed that each student’s learning is independent? That seems like a ludicrous assumption to put in a model like this, unless the model authors have never been in a class setting (which I doubt).

You are asking teachers to be good at their job. But is teaching a merit-based profession?

So, status quo then? This is already the case for educational software that's used to detect plagiarism. People get wrongly flagged, and then you'll have to plead your case.

But the times software like this finds actual problems vastly outnumber the times it doesn't, and when your choice is between "passing kids/undergrads who cheat the system" and "the occasional arbitration", you go with the latter. Schools don't pay teachers anywhere near enough to not use these tools.

Currently the false positive rate is far lower. E.g. if I get 500-ish submissions over a school year, then a 1% false positive rate would mean I'd falsely accuse 5 innocent students annually, which isn't acceptable at all - and a 9% FP rate is so high that it's not even worth investigating; do you know of any grader who has the spare time to begin formal proceedings/extra reviews/investigations for 9% of their homework?

For plagiarism suspicions at least the verification is simple and quick (just take a look at the identified likely source, you can get a reasonable impression in minutes) - I can't even imagine what work would be required to properly verify ones flagged by this classifier..

I really wish they'd have provided their false positive rate over several lengths of document, rather than an overall estimate. Because if it dives after say, 1,500 words, that's a relevant piece of information for its use.

I'm pessimistic, given they chose not to do so.

> I can't even imagine what work would be required to properly verify ones flagged by this classifier.


At the same time the classifier is improving, the generative models are improving. It’s a classic arms race and this equilibrium is not likely to shift much either way. We are talking about models that approximate human behavior with a high degree of accuracy, I think the goal would be to make them indistinguishable in any meaningful way.

Can you elaborate?

I don't think that this is something that can change through tech advances for the classifiers - in all cases the classifier is just flagging for investigation, it's not sufficient for any action. For plagiarism, appropriate evidence comes from a person comparing the submission with the possible source of plagiarism. For this one, the proper evidence would require getting confirmation that the student actually generated that data - e.g. identifying the exact tool and prompt that was used, or logs from the students' computer showing that this was done, or logs from the text generation service provider. All of those are quite tricky to get and perhaps even not possible.

Given the published true and false positive rates, it's clear that the true positives do not "vastly outnumber" false positives.

> This is already the case for educational software that's used to detect plagiarism. People get wrongly flagged, and then you'll have to plead your case.

How often is that the case, though? It's been a while since I've had to worry about it, but I thought plagiarism detection generally worked on the principle of looking for the majority of the content being a literal match with existing material, with only a few small edits - which, unlike using some "AI-ish" turns of phrase that a bot wrongly flags as AI-written 9% of the time (and correctly flags with a not-much-better success rate), is pretty hard to do accidentally.

A long time ago, when I was a student, I would run my papers through Turnitin before submitting. The tool would sometimes mark my (completely original) work as high as mid-20s percent similarity.

As a result, I have taken out quotes and citations to appease it and not have to deal with the hassle.

I expect modern day students will resort to similar measures.

IIRC the marker got the same visualization you used when taking out quotes and citations - the one highlighting that the similar bits were, in fact, quotes and citations!

Maybe high school is a different matter, but I'm pretty sure even the most technophobic academic knows that jargon, terse definitions, and the odd citation overlapping with stuff other people have written will make a similarity score of at least 10% pretty much inevitable - especially when the purpose of the exercise is to show you understand the core material well enough to cite, paraphrase, and compare it, not to generate novel academic insight or show you understood the field so well you didn't need to refer back to the source material. The people they were actually after were the ones who downloaded something off an essay bank, removed a couple of paragraphs, rewrote the intro to match the given title, and ended up with 80%+ similarity.

Is there a longer-form paper on this yet? TPR (P(T|AI)) and FPR (P(T|H)) are useful, but what I really want is the probability that a piece flagged as AI-generated is indeed AI-generated, i.e. P(AI|T). Per Bayes rule I'm missing P(AI), the portion of the challenger set that was produced by AI.

If we assume the challenger set is evenly split 50-50, that means

    P(AI|T) = P(T|AI)P(AI)/P(T) = (0.26)(0.5)/((0.26)(0.5) + (0.09)(0.5)) = 0.26/0.35 ~ 74%
So roughly a 3-in-4 chance of the flagged text actually being AI-generated under that even split - and the precision falls off quickly as the AI share of the pool shrinks.

They say the web-app uses a confidence threshold to keep the FPR low, so maybe these numbers get a bit better, but it's still very far from being usable as a detector anywhere it matters.
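
For anyone who wants to plug in their own prior, the Bayes calculation is a few lines of Python (the function name and the swept priors are my own choices):

```python
# Posterior probability that a flagged text is really AI-written, via
# Bayes' rule. Rates from the announcement: TPR = 0.26, FPR = 0.09.
# The prior P(AI) is use-case specific, so sweep a few values.
def p_ai_given_flag(prior, tpr=0.26, fpr=0.09):
    p_flag = tpr * prior + fpr * (1 - prior)  # P(T), total flag probability
    return tpr * prior / p_flag               # P(AI | T)

for prior in (0.05, 0.25, 0.50):
    print(f"P(AI) = {prior:.2f}  ->  P(AI|flagged) = {p_ai_given_flag(prior):.2f}")
```

At a 50-50 split the posterior comes out around 0.74, but at a 5% prior (a class where few students cheat) it drops to about 0.13 - which is why the base rate matters so much.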

>Per Bayes rule I'm missing P(AI), the portion of the challenger set that was produced by AI

This will obviously depend on your circumstances.

Precision is impossible to calculate without knowing P(AI), which is use-case specific.

Source: Spent 10 years trying to explain this to government people who insisted that someone tell them Precision based purely on the classifier accuracy without considering usage.

We can’t release the essay writing language model. Lazy children will use it to write their essays for them!

We can’t release the ai-generated text detection model. Lazy teachers will use it to falsely accuse children of cheating!

The problem here appears to be lazy people.

Can we train an AI to detect lazy people? I promise not to lazily rely on it without thinking.

Hilariously, this has already happened with music composition. Especially drumming.

Since the advent of drum machines, a lot of younger players have started playing with the sort of precision that drum machines enable. eg: The complete absence of swing, and clean high-tempo blasts/rides.

So you'd get accusations of drummers not being able to play their own songs, because traditional drummers think such technically complex and 'soulless' performances couldn't possibly be human. Only to then be proven wrong, when it turns out that younger players can in fact do it.

The machine conditions man.

I can't remember the keyword to look it up, but there's a problem of statistics you run into with stuff like terrorism detection algorithms.

If we have 300M people in the US and only 1k terrorists, then you need specificity better than about 99.9997% before you start getting more true positives than false positives. If you use this in a classroom where no one is actually using AI, you'll get only false positives; and in a class where usage is average, you'll still get more false positives than true ones, which makes the test do more harm than good unless it's just a reason to look into things further - and the teacher is presumably already reading the text, so if that doesn't help, then this surely won't.
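
A minimal sketch of that break-even arithmetic, using the numbers from the comment (variable names are my own):

```python
# Break-even false positive rate for a rare-positive screening problem:
# with N people and k true positives, even a detector that catches every
# one of them produces more false than true alarms once its FPR exceeds
# k / (N - k).
population = 300_000_000
true_positives = 1_000

break_even_fpr = true_positives / (population - true_positives)
print(f"FPR must stay below {break_even_fpr:.2e}")         # ~3.3e-06
print(f"i.e. specificity above {1 - break_even_fpr:.4%}")  # ~99.9997%
```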

I wonder if I should help my kids setup a server + webcam + screen capture tool so they can document 100% of their essay writing experience. That way if they ever get hit with a false positive they can just respond with hundreds of hours of video evidence that shows them as the unique author of every essay they've ever written.

You will certainly have a lot of training video for creating an "essay writing video generator" ML product.

You could always teach them how to use git and have them commit frequently. Seems like it would be less intrusive than a webcam.

Source control would certainly help establish a history of incrementally performed school work - at least when viewed by a highly technical examiner, and when periodically pushed somewhere a trusted 3rd party can confirm it wasn't all generated the night after a supposed false positive.

However, hundreds of hours of video is compelling to non-technical audiences and even more importantly is a preponderance of evidence that's going to be particularly damning if played in front of a PTA meeting.

With a git history it's going to come down to who can spin the better story. The video is the story and everyone recognizes it, so I expect fewer people would bother even challenging its authenticity.

I guess that's fair. I just personally don't think the additional gain is worth taking away your child's privacy.

It's only taking away their privacy if they're falsely accused.

And properly used you might not even have to relinquish privacy if falsely accused. A quick montage video demo and a promise to show the full hundreds of hours of video of "irrefutable" proof to embarrass the school district at the next PTA meeting might be sufficient to get the appropriate response.

You could still cheat quite easily and inexpensively with an earpiece, as long as you know how to write down what you hear.

It's about building a narrative. Yeah, you could still cheat, but who would go through the effort of generating hundreds of hours of fake videos proving yourself innocent. For that amount of effort you might as well have done the work yourself.

Of course there are some people who put insane amounts of effort into not doing "real" work. However, anyone trying to prove that your child is in that position is going to find themselves in an uphill battle.

Which is the ultimate goal here. Make people realize that falsely accusing my children using dubious technology is going to be a lot more work than just giving up and leaving them alone.

This is already an issue. I'm a student in college right now, and even technical professors operate with full confidence in systems like Turnitin, which try their hand at plagiarism detection (often with much higher false negative/false positive rates). The problem was even more prevalent in high school, where teachers would treat it as a 100% certainty. Thus, I think OpenAI making an at least slightly better classification algorithm won't make the state of affairs any worse.

The cheating students who know how to use the classifier will be the big winners.

I think there is a more dystopian near future:

1. There will be commercial products to tune per-student writing models.

2. Those models will be used to evaluate progress and contribute directly to scores, grades, and rankings. They may also serve to detect collaboration.

3. The models persist indefinitely and will be sold to industry for all sorts of purposes, like hiring.

4. They will certainly be sold to the state for law enforcement and identity cataloging.

It's almost as if you need to give exams in person and watch the students if you don't want them to cheat. This is fundamentally no different than cheating by writing notes on your hand in an exam or paying someone to write a take-home essay for you. It's cheaper than the latter, but that just means the lazy curriculum finally needs to be updated.

> false positive rate of 9%

Yeah, that is useless. You couldn't punish based on that alone and students will quickly figure out to never confess.

I urge anyone with time to write to tech journalists explaining why this is so bad. Given previous coverage of GPTZero they don’t seem to be asking the right questions.

Sorry for the tangent, but a surprising number of the general public don't know the meaning of percent[1]. So even if a teacher is told those percentages, many wouldn't know what to conclude.

[1] Me, giving young adults who worked for me a commission rate, then asking: if your commission rate is 15% and you sell $100 of goods, what is your payment? Many failed to provide an answer.

I dare hope for a less dystopian outcome:

- teachers will assign fewer mind-numbing essay homework assignments and focus more on oral interviews.

That heavily favors a particular learning style, which isn't necessarily a desirable outcome.

You can't make everyone happy.

Generally speaking, education (when done correctly) tries to avoid "...and devil take the hindmost" as a guiding philosophy.

Mass education is like mass transit. It gets the majority of the population somewhere. Not everyone gets to take the ferrari. Someone will always be left out and we shouldn't let perfect be the enemy of good enough.

...but if we are forced to choose, it's better to spend our effort on the ones who don't own a Ferrari.

That's exactly the point - mass education. Busses. Not ferraris - even for people who might need that level of support. It isn't economical or a net benefit for society to make sure everyone's needs are met to the best possible point.

I guess students will get recorded writing their homework, say on a tablet.

Then of course the AI can whisper to the student what to write. So perhaps homework will have to be done at school - a school that checks its students with a metal detector when they enter. (Some schools already use them to check for guns?)

On a side note, I'm very shocked at how lax everything is in professional chess tournaments. It feels like there are many ways to cheat, and they don't try to do anything against cheating. They should use metal detectors (to detect computers inside a stomach or tooth), host everything inside a bunker (so no radio), without an audience (who can do various tricks), and in a secured environment (all cameras checked to be sure they are legit).

Those chess tours look like a cheating galore to me, although I don't play chess.

Hopefully they just flag relevant sections. Essay/Plagiarism checkers already exist, although in my experience professors were reasonable.

For example I had a paragraph or two get flagged as being very similar to another paper - but both papers were about a fairly niche topic (involving therapy animals) and we had both used the relevant quotes from the study conclusions from one of only a few decent sources at the time - so of course they were going to be very similar.

Given that most essays are about roughly the same set of topics, and there are literally hundreds of thousands of students writing these - I wonder how many variations are even possible for humans to write as I would expect us to converge on similar essays?

Plagiarism is easier to verify, because you can directly compare with the plagiarized source material

Absolutely. I think it may have to end up more as a statistics thing with behaviour. For example:

"Tom had a single paragraph flag as possibly generated" vs "Every single paper Tom writes has paragraphs flag"

Basically we might have to move to detecting statistical outliers as cheating. Now whether the tools/teachers will understand/actually do that - we can only hope....

That's a good point: the effectiveness at detecting AI generation is probably going to depend strongly on the length of the text.

>false positive rate of 9%

bringing the Roman decimation to the classroom based on AI, this is the future

Also, there will exist

Prompt => AIGen (White Hat) => Obfuscate(Black Hat) => Final Text

I think the much more proximate threat is that fear of ChatGPT kills a lot of progress that's been made in making exam material more accessible (take home tests, etc.) to a broader audience of students.

This is worse than useless if you take the base rate fallacy into account.

I'm imagining a future that involves programs monitoring students' writing in a proctored setting to establish some sort of individual fingerprint, then using that to match against future writing assignments - again persistently monitored for authenticity. Clippy is going to pop up in the corner to warn you when you've been behaving too artificially. Whatever that means.

A more likely outcome is that teachers will pay the price [1].

[1] https://www.timeshighereducation.com/opinion/ai-will-replace...

(turn off JS to skip the signup wall)

This isn't that dystopian. The dystopian outcome is when there's a classifier that rates the quality of the text and that this classifier becomes indistinguishable from the AI-generated classifier because AI generated text is beginning to be superior to human generated text.

Ah but as with AI generally before now, 'you can't stop progress.' It'll end up being used and falling into an arms race of better AI vs better detection, all the while losing the point of why it is there at all in the first place.

Exactly. IMHO it is irresponsible to release such a classifier with a title that touts the desired feature and doesn't spell out its limitations at all. At least precede the title with "experimental" or something.

Or we realize that essays aren't that important and technical skills will become more highly valued. Either way, ChatGPT can't do your exams for you so the truth will come out anyway.

Writing is very important for understanding a topic and long-term recall. I still remember topics from papers I did 15 years ago because I spent 10s of hours researching and writing and forming ideas about each topic.

Instead of being overzealous about catching cheaters, teachers should learn to express the importance of writing and why it is done. Convince the students that they should do it to be a smarter person, not just to get a grade, and they will care more about doing it honestly.

Writing is itself a technical skill

With AI taking over technical skills, it seems clear to me that they will be valued less. Instead, the soft skills will be the valued ones.

Any solution here is just an arms race. The better AI's get at generating text, the more impossible the job of identifying if an AI was responsible for writing a given text sample.

You could even set up a GAN to make the AI better at not being detected as something written by an AI. I don't see a good general solution to this, but I also see it as a non-issue - if students have better tools, they should be able to use them, just like a calculator on a test. That's allowed because you still need to understand the concepts to put it to use.

4. Parents sue schools.

5. Admins eliminate all writing requirements.

Let me soothe your fear: This isn't a novel cheating technology, it's a technology that will make humans obsolete. Neither teachers nor students are going to matter in the future. Most or all of the population is going to be enslaved for all practical purposes, either to an all-powerful super-elite, or to AI itself. Any worries about how mundane things like education are going to be impacted by petty cat-and-mouse games are going to become irrelevant, because education itself is going to be irrelevant, along with everything else that once defined our world.

> true positive rate of only 26% and a false positive rate of 9%

That's uninformative enough that I'm surprised they launched this publicly at all.

Maybe they should start asking questions that AI can’t answer, instead of having students regurgitate what they’ve memorised.

In the same way deepfake video should not be allowed as evidence, thereby ensuring no video is allowed… we can apply that to text as well.

We’re entering an uncanny valley before a period of “reset” with self taught (to stay on subject here) people re-learning for the sake of learning.

In 30 years we will be in an educational renaissance of people learning “like the old masters did in the 1900’s.”

Nah. In 30 years it will be as useless to learn most subjects as it is right now to learn crocheting and knitting, or times tables, or how to use an abacus.

People are wayyyy too optimistic, just like in the 1900s they thought people would have flying cars but not the Internet, or how Star Trek’s android Data is so limited and lame.

Bots will be doing most of the work AND have the best lines to say, AND make the best arguments in court etc.

You don’t even need to look to AI for that. The best algorithms are simply uploaded to all the bots and they are able to do 800 things, in superhuman ways, and have access to the internet for whatever extra info they need.

When they swarm, they’ll easily outcompete any group of humans. For example they can enter this HN thread and overwhelm it with arguments.

No, the old masters were needed. Studying will not be. The Eloi and Morlocks is closer to what we can expect.

Apparently knitwear is forecast to have a CAGR of 12% for the rest of the decade, with hand-knitted garments commanding the highest prices. It's definitely not the worst cottage industry one can choose.

As someone who’s known how to crochet and knit since he was 6… I disagree.

Yup - you can beat them using simple things like mixing up words to throw off the word distribution. GPT-Minus1 is an example.


Solution: just write your texts with a bit less confidence than gpt3 would.

Funny how everyone praised GPTZero that has even worse rates but starts being skeptical when it's OpenAI, the new bad guy.

"Everyone" didn't. In fact, the 5 top comments in that thread[1] all called it useless or pointed out serious flaws.

[1] https://news.ycombinator.com/item?id=34556681

This is extremely concerning.

The co-authors on this include Professor Scott Aaronson. Reading his blog Shtetl-Optimized and his [sad/unfortunate/debate-able/correct?/factual?/biased?] views on adverse/collateral harm to Palestinian civilians makes me question whether this model would fully consider collateral damage and harm to innocent civilians, whomever that subgroup might be. What if his model works well, except for some minority groups' languages which might reflect OpenAI speak? Does it matter if the model is 99.9% accurate if the 0.1% is always one particular minority group that has a specific dialect or phrasing style? Who monitors it? Who guards these guards?

I found a great way to fool these detectors: piping output through multiple generative models.

1. Generate text by prompting ChatGPT.

2. Rewrite / copyedit with Wordtune [1], InstaText [2] or Jasper [3].

This fools GPTZero [4] consistently.

Of course, soon these emotive, genre, or communication-style specialisations will be promptable by a single model too. Detectors will be integrated as adversarial agents in training. There is no stopping generative text tooling; better to adopt it and integrate it fully into education and work.

1. https://www.wordtune.com/

2. https://instatext.io/

3. https://www.jasper.ai/

4. https://gptzero.me/

Now they get to monetize Chat GPT and this new classifier. Starting fires and providing the extinguishers, charging for both of them.

All while pretending to be morally responsible in order to do it.

No way. If I were a student trying to use ChatGPT in order to improve my writing, I would definitely not pay for it if I know my teachers are using their AI Classifier. I mean, what's the point? I don't think OpenAI will be able to reach that (big) chunk of potential customers that want to use ChatGPT to write essays, social media comments, etc. if OpenAI at the same time sells their classifier. It's just nuts.

> trying to use ChatGPT in order to improve my writing

Grammarly also helps improve your writing. It can be used as a guide, rather than as direct output, just as the teacher's red pen does after you turn in the assignment. So can ChatGPT.

I think the interesting risk is that it starts flagging Grammarly output, and people whose education was influenced by AI "tutors".

It would be like punishing the student for heeding the red marks from their teacher.

But you could just pay for both ChatGPT and the AI classifier, and keep re-iterating with new prompts until the AI classifier outputs a false negative right?

Edit: Thinking about it, they'd probably have to eventually restrict the AI classifier so that it would only be available to schools / institutions in this scenario.

You make a well reasoned argument here. At the same time, respectfully, you may be too intelligent to be the target audience for the student service. Can you see a college version of yourself paying $500 to write a college essay for yourself today?

If you're in a STEM-y major, now is the time to pick up an essay heavy humanities degree. If you're in an essay heavy humanities degree, now is the time to pick up a few more.

Think of it like this: How much is your degree costing you/your family?

On average, it's ~$150k.

How much would an extra degree cost you? How about 80% of an extra degree? How about 20%? How about all those books and course materials? Those are in the $1000s already, per degree. (And yes, we've all heard of torrents.)

What I'm saying is that chatGPT can easily be seen as 'just another college cost'. And when it's 'for education', the justification for those costs gets a lot more flexible. I can see students shelling out ~$10,000 for something like chatGPT that is specific to their major, will pass these classifiers, and gets you just ~25% of the way to your major (however that is defined). The cost 'for the masses' could easily be in the ~$1000s for a per-class subscription.

With ~20M college students in the US, assuming even a 10% uptake rate, you're in the billions of dollars of nearly pure profit (the overhead would be negligible).

The money potential of something like chatGPT is just too damn high. Too high for essays to ever go out of style, as the lobbying effect of companies like this will force colleges to keep the essays they are making money off of. Oh, and they'll sell the classifier to the colleges too. Arming both sides!

They did say big tech was starting to take over the role of government.

What does this have to do with the government?

I can see the point the parent comment is trying to make. The applications of this classifier include potentially arbitrating in decisions relating to things like education (ie assessment of grades) which is a matter traditionally associated with the public sector.

I was saying sometimes government is both the cause and solution to some problems.

Climate change, for instance. It's investing in ways to combat it but also enabling oil companies to thrive.

Unfortunately, in a lot of ways, it already has.

There was a merchant who said - Buy my sword! It will pierce through any shield !!

So the gullible people bought the swords and soon the merchant ran out of swords to sell.

So the merchant said - Buy my shield! They can defend against any sword !!

Once again the gullible people rushed to buy the shields.

But one curious onlooker asked - what happens when your sword meets your shield?

A compound noun meaning sword-shield, 矛盾 mujun, is the word for contradiction in Japanese, based on this Chinese folktale.

ChatGPT doesn't make any promises to beat AI text classifiers. If you asked it to it'd probably tell you that's unethical.

The merchant in this analogy is OpenAI not ChatGPT. ChatGPT is the sword.

Sure, but I think in this case they're promising the shield will always win. It's other people that might develop AIs that aren't detectable this way.

(Actually, OpenAI has said they're trying other ways of making GPT output even more detectable, like watermarking it through specific word choices.)

It depends who's holding them, I guess.

The existence of this tool might actually do more damage if people are using it with any level of confidence to check text content as important as exams. I understand why they felt the need to release something, but I think it would be better if this didn't exist.

My guess is that it's very easily gamed. Something ChatGPT is very good at is producing text in different styles, so if you're a student and you run your text through an AI detector, you can always ask ChatGPT to rewrite it in a style more likely to pass detection.

Finally, I wouldn't be surprised if this detector is mostly just detecting grammatical and spelling mistakes. It's obvious I'm a human given how awful I am at writing, but I wouldn't be surprised if a good writer who uses very good grammar, has good sentence structure, and whose writing looks a little bit too "perfect" might end up triggering the detector more often.

Just filter your text through Quillbot to get around "AI Detection".


Demonstration: https://youtu.be/gp64fukhBaU?t=197

The arms race continues...

I find using WordTune rather than Quillbot produces a more readable output, while still defeating AI detectors, though Quillbot is still good for other purposes.

> Our classifier is not fully reliable. In our evaluations on a “challenge set” of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as “likely AI-written,” while incorrectly labeling human-written text as AI-written 9% of the time (false positives).

That is an interesting mathematical description of "not fully reliable".

A 26% true positive rate and a 9% false positive rate is just terrible. I don't see how this can be usable.

It can't be used usefully for anything. The only time it's better than flipping a coin is when there's known to be a majority of human texts in your corpus, but even under those conditions it will fail to flag the majority of AI texts.

In a set of 100 texts, with 20 being AI, the most likely outcome would be 5 AI texts correctly flagged, along with 7 falsely accused human texts. For like, 22 incorrect answers.

For 100 texts where 90 are AI, it would be better to just flip a coin. A coin flip would give you around half correct, and this system would apparently give you around 68 wrong answers (three quarters of the 90 AI ones wrong, then one of the human ones wrong).
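
Those two scenarios are easy to sanity-check with a short script (function and argument names are mine):

```python
# Expected number of wrong calls on a batch of texts, using the announced
# rates (TPR = 0.26, FPR = 0.09).
def expected_errors(n_ai, n_human, tpr=0.26, fpr=0.09):
    missed_ai = n_ai * (1 - tpr)         # AI texts that slip through
    false_accusations = n_human * fpr    # human texts wrongly flagged
    return missed_ai + false_accusations

print(expected_errors(20, 80))   # ~22 wrong out of 100
print(expected_errors(90, 10))   # ~68 wrong; a coin flip expects ~50
```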


> In our evaluations on a “challenge set” of English texts

I wonder if they mean "challenge" in the sense that these are some of the hardest-to-discern passages. Meaning that with average human writing / average type of text, the % is better. I'm unsure.

You are correct. These are handwritten and curated challenge examples. Certain terms in ML have very specific meanings that casual readers might miss.

I'd rather try to empower students to use ChatGPT as a tool or incorporate it into class work than worry about cheating. This is a pretty unique time for teachers to step up and give their students a nice edge in life by teaching them how to become early adopters for these kinds of things.

The purpose of writing an essay is to teach students how to think. Being able to prompt is a subset of being able to think. If you only teach them to prompt, you have taken away any edge they might have had. It's like those schools that think that getting more iPads will make the kids smarter.

Maybe thought need not be in the form of an essay. The same way a good essay need not be in beautiful cursive.

Perhaps when we have sufficiently capable OSS models, but as it stands GPT is a paid service and not a public good.

I used ChatGPT earlier today to rewrite a number of paragraphs of my own writing. It rewrote them completely. I just pasted those into this detection tool, and for both it responded "The classifier considers the text to be unlikely AI-generated."

So it seems it cannot detect AI-rewritten/augmented text, even text that ChatGPT itself generates.

Well, OpenAI admits it is wrong most of the time, so your results are consistent with what is expected.

I don't see why teachers don't use this as an opportunity to accelerate curriculum. Every student now has a cheap personal instructor. Why not raise the bar on difficulty and quality expectations for assignments?

Isn't this a poor business move from OpenAI? I mean, if they make it possible to distinguish (100% reliably, in the future) between AI-written text and human-written text, then a big chunk of OpenAI's potential customers will not use ChatGPT and similar tools because "they are gonna be caught" (e.g., students, writers, social media writers, etc.)

My first reaction-thought to seeing this is: implement a GAN with GPT inference on the generator side and this classifier (or DetectGPT or GPTZero, or whatever) they’ve developed as the detector. I would think this would very quickly a) achieve state of the art whatever-the-fool-a-human-reader-test-measurement-is results and b) render the classifier, and any subsequent AI-text-detecting classifier, useless.

I could be way off base with that idea, but it seemed a good enough one to ponder, though not so great that I was motivated to do anything more than post the thought.

I think the detection performance is bad enough that it might just degrade chat

Well that’s both funny and a real point. I did try to make the idea resistant to such points of fact with the “or whatever” clause in my post, but your concise reply made me chuckle.

This wouldn’t work in practice because you don’t have access to GPTs activations

I miss the 90's and the early 00's. Take me away from this AI hell.

Musicians Wage War Against Evil Robots - https://www.smithsonianmag.com/history/musicians-wage-war-ag...

From the March, 1931 issue of Modern Mechanix magazine:

> The time is coming fast when the only living thing around a motion picture house will be the person who sells you your ticket. Everything else will be mechanical. Canned drama, canned music, canned vaudeville. We think the public will tire of mechanical music and will want the real thing. We are not against scientific development of any kind, but it must not come at the expense of art. We are not opposing industrial progress. We are not even opposing mechanical music except where it is used as a profiteering instrument for artistic debasement.

> Take me away from this AI hell

People used to say that about electricity too, and cars, and planes, and computers. This is just the next step in the chain.

So your message is: bend over?

There are only two choices:

1. Try to stop the world from changing.

2. Adapt to the changes (which requires changing the world). E.g., the dangers of electricity led to electrical codes and licensing for electricians.

Doesn't this get us into a sort of perpetual motion machine with the back and forth being

1) generate a paragraph of my essay

2) feed it into this classifier

3a) if AI -> make it sound more human

3b) if human -> $$$ Profit?

Obviously it could be more fine-tuned than this, and in general it's good to know, but I just love watching this game play out of... errr, how do we manage the fact that humans are relatively less and less creative compared to their counterparts.

The thing is, point 1 costs money (I imagine at some point ChatGPT will cost money), but point 2 also will cost money. So OpenAI will charge you double to generate AI-written text that is undetectable. Poor move. I could happily pay a lot for ChatGPT, but if they also commercialize a (more accurate) classifier, then I won't use ChatGPT at all.

What I would love to see in GPT-3 is some sort of confidence score it could return, as in how sure the model is that what it returned is accurate and not gibberish. Could this classifier help with that? I am working on a requirement where we are using ElasticSearch to map a query to an article in a knowledge base, and then the plan is to send it to GPT-3 to help summarize the article.

Since the ElasticSearch integration is still a WIP, I made a POC to scrape the knowledge base (with mixed results; lots of the content is poorly organized, so the scraped content that would act as the prompt for the GPT-3 model wasn't all that good either) and then feed it to GPT-3, but it couldn't always give the most accurate answers. The answers were sometimes spot on, or quite good, but other times not so much. I would say about 30% of the time it made sense. So if there were a way for me to tell whether an answer was sensible or not, we could return an error response when GPT-3's response didn't make sense.

The reason we are doing it is that the client has a huge knowledge base, and manually mapping each question to an answer would be difficult for them.

OpenAI's text completion API has an option to return the "log probability" (or something like it) of each token. That might apply. You can also turn down the temperature parameter, which reduces hallucinations to some degree.
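A sketch of how those per-token log probabilities could be turned into a go/no-go signal. The threshold here is a made-up value you would have to tune on your own data, and the function simply takes a list of logprobs rather than assuming any particular API response shape:

```python
import math

def confidence_from_logprobs(token_logprobs, threshold=-2.5):
    """Average per-token log probability; closer to 0 means more confident."""
    avg = sum(token_logprobs) / len(token_logprobs)
    perplexity = math.exp(-avg)   # equivalent view: lower perplexity = better
    return avg, perplexity, avg > threshold

# Confident answer: the model found each token very likely
print(confidence_from_logprobs([-0.1, -0.3, -0.2, -0.05]))

# Shaky answer: many low-probability tokens, possibly gibberish
print(confidence_from_logprobs([-4.2, -3.8, -5.1, -2.9]))
```

You could return the knowledge-base error response whenever the third element comes back False.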

Horrible idea: you can't eliminate the false positives, and these are going to impact innocent students or be used to reinforce teacher biases.

I wrote some text about the subjectivity of communication and the nature of natural language, and I kept it very neutral, formal and verbose. And it said "this text is likely AI".

So, as was honestly predictable, people who rely on this tool being accurate will inflict a lot of pain on unsuspecting individuals who simply write like GPT writes.

Given the weak accuracy - which is of course understandable given the difficulty of the task - this mostly seems like a fig leaf that lets them pretend to do something about the potential problems of AI generated text becoming more and more pervasive.

Probably one shouldn't fault them for trying, but the cat is out of the bag I think.

Wouldn't better classifiers (discriminators) necessarily lead to better generators that can trick them?

Related option that has benchmarks and research paper; appears they intend to release code & datasets too.

DetectGPT: Zero-Shot Machine-Generated Text Detection

- https://news.ycombinator.com/item?id=34557189

How good is this really?

I input an article written directly by ChatGPT, and it came back as "The classifier considers the text to be unclear if it is AI-generated." This article was not edited, not put through any paraphrasers, or anything. Interesting.

Furthermore, these efforts are quite futile. One can just go to one of numerous paraphrasers such as quillbot.com, run the text through there, and then, for added obfuscation, run it through an entirely different paraphraser (Microsoft Word now has this capability natively, in the beta channels at least, btw).

Yeah, for someone who has intentions of bypassing this, there will always be a way. It's a good effort, for sure. But, I don't see this doing much in terms of truly distinguishing AI vs non AI generated outputs.

26% good

1. This is an arms race. You can build a generative AI that avoids generating text caught by the classifier.

2. Maybe teachers will assign rare or even fictional topics that cannot be found in the AI training corpus. Maybe a teacher could use an AI to generate essay prompts that are hard for other AIs to write essays for.

3. Is this a problem long term? If an AI can generate an essay that's indistinguishable from a human-generated one, then why do we need to learn how to write essays? Maybe we should just learn how to write good prompts.

See also: "Should calculators be banned in school?", "Do students need to learn cursive?", "Why should I learn Greek instead of just reading a translation of Homer?"

I've used ChatGPT to generate some code for me, and almost every time it was a learning experience. I saved a lot of time searching, and it just gave me what I was after. Observing how someone, or something like an AI, solves a problem is a fast way to learn. I don't see a problem with this. Teachers can always just use in-person tests to check whether a student has mastered the concepts. Math teachers got over students using calculators for homework, and can check understanding just fine on tests. It used to be that students would solve homework problems by candlelight, with an abacus and lookup tables. Yet no one wants to go back to that just because it made homework harder.

The irony here is that this tool can be used by the AI in the future for self-training, to become more and more like a human.

Heck, you can use it as a manual adversarial output filter as it is right now.

Simple technical solution that requires no AI: sign your content and stake your reputation (i.e. your body of signed content) when you publish something new.

The issue is that we need a reliable way of determining who said what, where "who" may be a person or an AI. The distinction is actually not that important. But we do want to prevent people impersonating each other. Or AIs impersonating people. Or AIs pretending to be real persons. Especially people with reputations, or AIs without one. The issue is that reputation is very easy to fake because we don't sign our work.

There's no AI that can reliably tell any of that apart without producing false positives or false negatives. Both are bad and erode trust.

What does signing achieve?

- it's simple. We've had digital signatures for ages. Any digital content can be hashed and that hash can be signed. No new tech needed for this. We just need to start doing this. HN comments, blog articles, emails, instagram photos, whatever. All of that is unsigned currently.

- If we can associate content with a public key, the reputation of that key is the body of work signed with that key. A new key has no reputation.

- we can associate ownership of keys with people, companies, or particular AI models.

- we can build trust on top of this. You might not know a particular journalist but if multiple of your friends seem to appreciate what they have to say, you might pay more attention.

- we can filter out anything disreputable easily.
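A minimal sketch of the hash-then-sign flow described above. Note that HMAC from the Python standard library is used here purely as a stand-in; a real deployment would use an asymmetric scheme like Ed25519, so anyone holding the author's public key can verify without knowing the signing key:

```python
import hashlib
import hmac

def sign(content: bytes, key: bytes) -> str:
    digest = hashlib.sha256(content).digest()                  # hash the content...
    return hmac.new(key, digest, hashlib.sha256).hexdigest()   # ...then sign the hash

def verify(content: bytes, key: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(content, key), signature)

key = b"author-private-key"          # hypothetical key material
post = b"My blog article, verbatim."
sig = sign(post, key)

print(verify(post, key, sig))                  # True: genuine content
print(verify(b"Tampered article.", key, sig))  # False: content was altered
```

The "body of work signed with that key" is then just the set of (content, signature) pairs anyone can check.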

Text-generating models are so much more than spell checkers, but allow me to make a comparison.

Before spell checkers, having good spelling was relatively difficult, and it allowed employers/teachers/etc. to separate the "good" people from the "bad". After spell checkers became ubiquitous (unless you have to handwrite!), good spelling became the norm and the floor rose much higher. A document with bad spelling means the person either doesn't know of spell checkers (a sign that the person isn't computer-savvy enough?) or doesn't care enough.

Did spell checkers reduce the language capacity of students? Maybe they did start putting less attention into spelling, but I would argue it's ok because now we have more assistance. If you really want to test their spelling, you can always do a handwritten, in class quiz.

ChatGPT and the like may be different because they write the whole thing, right? Well, kind of. You still need to give the prompt to express what you want to say. You still need to correct stuff and exercise judgment to see whether the style in this paragraph is OK, or whether the facts stated in that other paragraph are OK. Besides, you still need to learn how to write because of the in-class quiz next week.

26% seems awfully low for a tool of this importance. Granted they are upfront about it, but still, it doesn't seem immediately useful to release it to the public.

The LLM watermark seems like a better approach.


This one is very cool. The steps are:

- Derive a seed from LLM output token t0

- Use the seed to split the vocabulary into "red" and "green" lists

- When producing the next token t1, sample only from the "green" list

Now, let's say you read a comment online and you want to see whether it was written by a robot. It's 20 tokens long. For each token, you reconstruct the red/green lists. If the author uses "red" words with roughly 50% probability, you can safely assume they are human. But if they use only "green" words, you can begin to assume they're a bot very quickly.

For simplicity's sake, if you mark half of the tokens as "red" for each new token, correctly writing 20 tokens in a row that are on the "green" list is like flipping a coin and getting heads 20 times in row -- vanishingly unlikely. This allows you to very robustly watermark even short passages. And if the human makes adversarial edits, they still have to fight that probability distribution; 19 heads and 1 tails is still vanishingly unlikely.
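A toy version of that scheme, with a made-up 1,000-token vocabulary and a simplified seeding rule (the real proposal hashes the previous token to partition the model's actual vocabulary; this sketch just seeds a PRNG with it):

```python
import random

VOCAB = [f"tok{i}" for i in range(1000)]

def green_list(prev_token):
    """Deterministically partition the vocab based on the previous token."""
    rng = random.Random(prev_token)   # seed derived from the previous token
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: len(shuffled) // 2])   # half the vocab is "green"

def generate(n, seed_token="tok0"):
    """Watermarked 'model': only ever samples from the green list."""
    rng, out, prev = random.Random(42), [], seed_token
    for _ in range(n):
        tok = rng.choice(sorted(green_list(prev)))
        out.append(tok)
        prev = tok
    return out

def detect(tokens, seed_token="tok0"):
    """Fraction of tokens on the green list: ~0.5 for humans, 1.0 for the bot."""
    prev, hits = seed_token, 0
    for tok in tokens:
        hits += tok in green_list(prev)
        prev = tok
    return hits / len(tokens)

bot_text = generate(20)
rng = random.Random(7)
human_text = [rng.choice(VOCAB) for _ in range(20)]   # picks red and green alike

print(detect(bot_text))    # 1.0 -- every token is green
print(detect(human_text))  # close to 0.5 -- chance level
```

A human hitting 20 green tokens by chance has probability (1/2)^20, about one in a million, which is the "20 heads in a row" argument above.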

OpenAI should release a classifier that detects their own AI-generated text. They could do this easily by just using steganography to hide some information in all text that they generate, and then build the classifier to look for it.

Sure, it's less useful than a classifier that can detect any AI generated text, but it would be a nice tool for contexts where AI generated text can be abused (like the classroom) in the short term.

There is work on hidden signatures in generated text, invisible to humans. Only way to move forward.

The problem with this will be that the method for detecting the signature would reveal how to hide the signature, though, right?

Obviously not an issue if everyone uses a single API for it, but if this ends up like Stable Diffusion, where anyone can run it locally, then I don't think it's possible, no?

I'd think people would migrate to just re-typing whatever was generated and change some wording along the way to prevent detection.

Or another AI could just do that.

Scott Aaronson talks about something like that being done at OpenAI in this post


Or they could just save/hash results and get rid of the classifier altogether.

Yea, they could provide a fingerprinting algorithm and a database of every fingerprint they've generated. However, it wouldn't help you identify false positives.
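A sketch of that save/hash idea: the provider records a digest of every completion it serves, and lookups match exact text only, which is also why it says nothing about false positives and breaks under any edit:

```python
import hashlib

generated_fingerprints = set()   # the provider's database of served completions

def record_generation(text: str) -> None:
    """Store a digest of every completion the service returns."""
    generated_fingerprints.add(hashlib.sha256(text.encode()).hexdigest())

def was_generated(text: str) -> bool:
    """Exact-match lookup: any edit to the text changes the hash entirely."""
    return hashlib.sha256(text.encode()).hexdigest() in generated_fingerprints

record_generation("The mitochondria is the powerhouse of the cell.")
print(was_generated("The mitochondria is the powerhouse of the cell."))  # True
print(was_generated("The mitochondria is the powerhouse of a cell."))    # False
```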

Totally useless given that it's really inaccurate, and actively dangerous: in the case of false positives, people will be accused of not producing stuff they actually produced.


In my {semi-tongue-in-cheek} opinion - Thus begins the origin of Arnie's Skynet.

The {semi-cynical} part of my corporate soul screams 'oooh, what a great way to boot-strap your own ML/AI and have marketing trumpet it as "So good that it was trained on OpenAI data and Human™ Error Labelling!"'.

The Futurist (Luddite???) in me shudders at the thought of two very powerful computer systems (models) working to outcompete each other in a way that turns out to be 'rather unfortunate', a.k.a. 'Oh shit! We should have thought about how we (the human race) can somehow tell machine output from human output'. But that is a discussion I will leave to the lawyers and ethicists to thrash out a solution/definition that outputs a simple binary Y/N with Five-Nines certainty.

But meh - A) the above is a rather random comment, and B) time will tell, and hopefully this and other similar efforts remain 100% Libre, as in 'free to all individuals forever and non-revocable'.

My younger brother and I both have fairly severe dyslexia. He's been applying to school and has been using ChatGPT to help him correct spelling and grammar mistakes rather than going to a person for help. It has been fairly incredible for him.

I wonder if this tool would start flagging his work even though he is only using it as a fancy spell checker.

My questions would be:

a) What kind of dyslexia are you suffering from?

b) Isn't this some kind of business opportunity then? Help dyslexic people correct their "mistakes".

a) No idea. I read slowly, cannot spell, get letters confused, and frequently don't realize small words are missing from sentences. It is in very large part why I went into math.

b) Not if everything I write is going to now be run through these detection algorithms and flagged.

I think the way students will get around this is similar to how they currently get around plagiarism checks (edit the content significantly until it is sufficiently original). ChatGPT isn't great for this sort of editing/iteration (hence why this sort of AI classifier might be somewhat effective in the short run), but AI text editors built for iterating on/editing text, such as https://orchard.ink/, would 100% beat these sorts of classifiers with a few human edits.

In general, figuring out how to create AI tools to help students improve their writing without giving them access to just straight plagiarism-esque capabilities is a super interesting problem with lots of implications. Disclaimer: I am working on this exact problem at Orchard.

9% false positives? That’s a troubling level of falsies.

The implications of using this tool are fun to think about though.

If it had a very low level of false positives but wasn't very good at identifying AI text, it would be very useful.

But false positive rates above very, very low levels will undermine any tool in this category.

Yeah, it's useless currently and will become more useless quickly, because people will scramble AI-generated text, mix in human edits, and people who use AI generators a lot will mimic their writing style. In short, the SNR will be abysmal outside of controlled environments.

I'm pretty sure the smart people at OpenAI know this. I think this is a PR move signaling that they are "doing something", looking concerned, yet insisting that everything is under control. In reality, nobody can predict the societal rift this will cause, so this corporate-playbook messaging is dishonest in spirit and muddies the waters. That's bad, both long-term for OpenAI's trust, and because muddy waters make it harder to have fruitful discussions about safeguards in commercial deployments of this tech.

That said, they're incorrectly getting blamed for controlling the use of this tech; they're no more than a prolific and representative champion of it. But the cat is out of the bag, and they absolutely cannot stop this train, so they shouldn't be blamed for not trying.

Maybe I just got lucky, but I managed to fool the classifier on the first try using only the GPT3 playground. The prompt I used was:

  I am a highly advanced writing AI. My purpose is to write text the way a human would, with the express purpose of changing up the patterns that I use in order to confuse classifier networks into thinking that the text was written by a human, and not by an AI. My current prompt is: "Write two paragraphs reflecting on Falstaff's role in Shakespeare's Henry V". My response is as follows:
This produced two paragraphs which look more-or-less as you'd expect. According to the classifier, the generated text is "very unlikely" to be AI-generated. Interesting...

Everyone thinks this is a shield, but what if I tell you it could be a sword.

Imagine: those who want nothing to do with ChatGPT now have to subscribe to a service to tell whether something is AI-generated (e.g., social media paying to combat spam bots, etc.)

But I doubt it's ever going to be reliable. Since ChatGPT's goal is to be as close to natural human language as possible (grammatically and factually), if a certain paragraph is detected as AI-written, it's a perfectly written paragraph more than anything else.

Unless they invent some subset of English language that only AI knows.

Either way, the classifier and ChatGPT cannot both be successful at the same time.

I was thinking that there have been swings of what is valued (or trusted) in education and testing (or voting, promoting) to prove that someone has the goods.

At one time it was live oration skill, and then people thought, "maybe that disfavors people who are introverted or whose talent comes from thinking and writing".

Then, at another time, it was thought, "well, you have to test, because sometimes working under time pressure, without being able to go away and think about something for as long as you like, produces something valuable".

Yet another time: "let people who don't test well do homework to prove their value through effort", but now who knows whether they actually were the ones doing the work?

I wonder what this development will produce?
