Hacker News
Asking ChatGPT to write my security-sensitive code for me (mjg59.dreamwidth.org)
43 points by HieronymusBosch on Jan 7, 2023 | hide | past | favorite | 81 comments


I had a similar experience. ChatGPT said "call greatSoundingFunction() from SomeLibrary". Unfortunately, neither the function nor the library actually existed.

On the other hand, I did recently find ChatGPT very useful when writing a string manipulation function in C++. I had to use some (to me) weird Windows APIs. ChatGPT wrote most of what ended up in my production code.


> ChatGPT wrote most of what ended up in my production code.

Some random person that's not going to get credited for their work wrote most of what ended up in your production code.


If you haven’t seen it, I’d highly suggest looking at some of the recent research on in-context learning and reasoning in LLMs.

It is becoming increasingly clear that there is a phase-change in behaviour when these models get large enough, such that they can solve new tasks outside of the training distribution.

See work by Hattie Zhou or Laura Ruiz for example.

It is clear to those following the research or using these models that they are not just copy-pasting… (Even if you can cherry-pick examples where the LLM recalls frequently occurring dataset items like fast square root or whatever.)


Your comment speaks the truth, yet misinformation like the comment you are replying to is what gets upvoted, on Hacker News of all places. Embarrassing.


Which parts of the posts are misinformation?


So GPT is simultaneously taking credit for other people’s work and a threat to open source software but also incompetent and not a threat to anyone’s job?


I think it's more like, "when it gets code correct, that code is probably stolen from someone."


Maybe it’s because I’m not a CS person by trade, but the whole idea of stolen code is so stupid. There are only a couple of ways to write a function to do X; just because some asshole did it first and put it on GitHub, I’m stealing if I write the same function? Fuck off


Luckily the court system has agreed with your sentiments since the 19th century so I wouldn’t worry too much!


Never used it, but this isn't exactly impossible: it's like having an essay due and copy-pasting chunks from the internet in a way that doesn't make sense as a whole and contains errors.


Only it is nothing like this.

What’s strange is how little effort it would take for you to use these tools. What I don’t find strange is that you feel confident having an opinion about this regardless of your lack of experience.


Really, someone out there wrote a spec for an API endpoint that updates a row in a Postgres database following our esoteric functional approach using our custom Either types and right down to the esoteric TS code formatting we used in that source file?

What a coincidence!


That’s not how these models work though.


It depends; it has happened that neural networks spit out exact strings they had as input. I assume that if the training data for some particular thing is sparse, it will just spit it out verbatim, but if I am wrong and OpenAI, Microsoft, or others have a mathematical proof that this will never happen, I want to see it.


Yes, we have seen a case of an old, common, and popular square root function that was returned verbatim by GitHub Copilot. Perhaps a case of overfitting, since the code was present many, many times in the training dataset and even has a Wikipedia page.

I haven’t seen more obvious examples since GitHub implemented a feature to prevent this from happening. I’ve probably missed some tweets, but I assume it’s rare.
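(For context, the function in question is the Quake III fast inverse square root. A Python transliteration, using struct in place of the original C pointer trick, looks roughly like this:)

```python
import struct

def fast_inv_sqrt(x: float) -> float:
    """Quake III 'fast inverse square root', transliterated to Python.
    Approximates 1/sqrt(x) for positive floats."""
    # Reinterpret the float's bits as a 32-bit unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # The famous "magic constant" initial guess.
    i = (0x5F3759DF - (i >> 1)) & 0xFFFFFFFF
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration to refine the estimate.
    return y * (1.5 - 0.5 * x * y * y)
```

With the single Newton step, the result lands within a fraction of a percent of the true value, which is part of why verbatim regurgitation of this snippet is so recognizable.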


Carmack’s fast inverse square root is not subject to copyright in the first place.


True. I think the comments and the coding style were from a famous implementation from the '90s, which could be a copyright issue. The algorithm itself can be copied.


In the case of ChatGPT, you can be almost sure that if a non-trivial amount of code works in production then it's an unlicensed paraphrase of some human written code. It treats code (and everything else) like prose, hence the atrocious results in the original story.

That does not mean other coding-optimized AI models won't be able to do much more: symbolic reasoning about the data, maintaining consistent variable and function names, understanding libraries, respecting language constraints and invariants, etc.


There’s no such thing as an “unlicensed paraphrase” of small snippets of utilitarian software, at least in US copyright law.

You might have moral qualms with this but you won’t find much of any support from the court system.


It's clearly a derived work, since the apparatus that produced the work (the AI model) can only function when supplied with copyrighted works - in fact it cannot exist without copious amounts of such works. If there is even a passing resemblance between a source work and the output, such as, for example, similar code structure and different variable names, then you are legally fucked.

I'm sure courts will clear this mess up really soon, and I'm betting money the rulings won't subscribe to the "it's mine now" mantra of the AI crowd.


Is there any planned court case about the copyright fair-use question?


Do you actually want to bet money on this? I’m willing to play!


I think ChatGPT switches to OpenAI Codex when it writes code. It’s the same model that GitHub Copilot uses.


I have no idea who laid down the road I use every day. Should their name be imprinted on it or something?



I often wonder how different our technology tree would be without random but critical accidents like this.

“In 1901, Edgar Purnell Hooley was walking in Denby, Derbyshire, when he noticed a smooth stretch of road close to an ironworks. He was informed that a barrel of tar had fallen onto the road and someone poured waste slag from the nearby furnaces to cover up the mess. Hooley noticed this unintentional resurfacing had solidified the road, and there was no rutting and no dust.”


I get where you’re coming from, but this has not been a convincing argument to the people who think that GPT is “stealing”.

I think the only thing that will chip away at this sentiment at this point is when the US federal court system rules in favor of GPT/et al, which seems very likely.


At least they got paid for it.


Should've imprinted their names on your car to honor the history :D


If the law in your jurisdiction says so, yes.


That's not how it works. You need to bait it really hard and push it into a corner for it to reproduce someone else's code, at which point you might just as well copy-paste the code directly from the repo you're trying to lure it into reproducing.


That’s not how ChatGPT works…


I had the exact same experience trying to get ChatGPT to tell me if there's a vscode setting for the max length of tooltip text.

It responded that there is and pointed out the max tokenization length setting with an explanation on why to use it and what value to use. The problem is that this setting is absolutely unrelated to the functionality I asked for.


A bit like common responses on StackOverflow :)


No, StackOverflow is quite a reliable source of correct and relevant answers to programming questions, with effective mechanisms for the community to identify and fix wrong or unhelpful answers.

It won't remain that way if those mechanisms are DoSed with a firehose of free plausible garbage. The flippant attitude that what we currently have is no better than that firehose, and the implication that it isn't worth worrying about or attempting to protect, is starting to grate.


People make mistakes on Stack Overflow. In my experience, an incorrect (or suboptimal) answer gets marked as accepted fairly often.


All articles like this that end with “Ha! My job is safe!” are drawing the completely wrong conclusion.

It’s amazing to me that all those incredibly smart engineers just can’t see the potential of what’s coming within the set of limitations it currently has. Amazes me even more when those same engineers don’t know how to best talk to GPT.

It’s like watching them type into the Google search box “Good morning google, please I have a question, if you have a moment, could you tell me where I lost my keys because I can’t find them”. And when Google is like “wtf?”, they claim “you can’t call it search if it can’t find my keys”.

Very arbitrary, blind to limitations, and dismissive of the incredible potential it has as-is. But while they’re off randomly complaining, some of us are building startups…


To me what stands out is not what it gets wrong but how much it gets right.

We use a not super common proprietary system at work which uses a custom query language. I can ask it "In system X, could you give me a query that finds XYZ". And it not only knows what system I'm talking about, it actually gives me a working query that would take me a couple hours to figure out and that's after I've had a few days' worth of training on it. Not to mention a huge IT background. Imagine picking a non-IT person off the street and getting them up to the point of being able to do this. They'd need weeks of training in basic IT concepts to even understand what you're talking about. I find it hard to overstate how amazing it is that a generic automated system can do this.

And ChatGPT is not a one-trick pony, IT isn't the only field it knows about. Imagine being an expert on all sub-fields of IT, law, marketing, medical, etc and being able to combine all that knowledge from those fields. A human will never be able to do that. The potential is huge but like any source of information you have to verify.

But what really is scary is extrapolating this to the next step forward. And the one after that. Because from what we had before to ChatGPT has been a massive step.

I think the next step would be for it to learn and re-evaluate its model every time it gets corrected by someone (like in the linked article where it apologizes for the incorrect info). Right now it's a static model only, so the next time someone asks the original question it will give the wrong answer again. Once it can learn from this it'll improve a lot.

The big hazard there is of course people manipulating it with false information. I don't have all the answers and I'm not an AI researcher but I'm very much amazed at the progress so far. And a bit scared. If this tech keeps evolving at this speed, job stability is only a tiny drop in a tidal wave of change. But there's no point in even trying to stop it.


"Imagine being an expert on all sub-fields of IT, law, marketing, medical, etc and being able to combine all that knowledge from those fields. A human will never be able to do that. The potential is huge but like any source of information you have to verify."

ChatGPT is not an expert in any of these fields; it just has enormous detailed knowledge, but no understanding. A real expert actually understands their field, gives answers that make sense, and clearly states if they do not know something. ChatGPT always gives authoritative answers, and sometimes they are right. It is an advanced tool, but no competent human is about to be replaced by it anytime soon.


I would beg to differ. It can combine information from various sources and make it into a working script or program. I've seen it do that various times with questions much too obscure to have been found 1:1 on Google. This is not a talking search engine; it's really something much more advanced.


One could equally say that mainly the mediocre people are excited, while the smart people actually see the limitations. For people doing non-trivial things, the output of ChatGPT seems currently rather far away from being of any deep value. To replace some mundane tasks, yes, but mundane tasks are anyway just for the mediocre.


> but mundane tasks are anyway just for the mediocre.

So when you’re working on something you just let other people take care of the inevitable mundane tasks that arise in daily knowledge work? I’m guessing you’ve never emptied the garbage can at the office either as that would be unworthy of your eminent intellect.


"The mediocre people" usually are the ones that think there's "mediocre" and "smart" buckets for people.


I think it's a joke. Though, the truth is, having AI write security sensitive code may actually genuinely be "AGI hard". AI breakthroughs are impressive, but I don't think we really know how to prevent the "confident bullshit" problem yet, and it's not clear it will be solved in this current cycle of progress. In many cases this might be an OK problem to deal with manually, but for security code it's probably not worth trying to. So the author might not have a threat to their job security, anyway.


I think writing/understanding security specification/policy is probably AGI-hard. The notion of security is (at least in parts) pretty deeply anchored in human experience and desires.


It understands you in any language and speaks any language really well. And of course, it converses. Like, no big deal.

I remember the AI bots of a decade ago. Nothing compares.

Skynet is coming.


What languages did you try? I speak Russian fluently and ChatGPT isn't very good at it.

It frequently produces nonsensical answers, uses wrong inflections all the time, writes songs and poems that don't rhyme, and its blocks can easily be circumvented to praise Hitler or do anything else.

It also writes text at a noticeably slower rate (at least 2× compared to English, maybe more).


English and French which I both speak/write fluently.

I assumed, wrongly, that it must have been equally good in other languages.

The poems in French didn't rhyme. I haven't tried poems in English.


I've asked it to solve certain coding tasks in both English and Spanish and it seems about equally competent in both.


It's good at Spanish, I did not notice any grammatical mistakes or nonsensical answers caused by the language.


I think it's just as likely to be the other way around. AI has been going through these hype cycles for 30+ years now, and every time it's "this is amazing and will take people's jobs". But what always happens is not much change in the short term, and rather subtle changes in the long term. People underestimate how important the edge cases these models aren't good at are.

I also think ChatGPT is amazing, but historically it's not any more amazing than many of the previous AI-related developments were for their time, none of which have quite lived up to the initial promise after the hype died. Just look at some of the robots from the 80s [0] that were expected to be in every household. And when computers conquered chess, of course we were just a few years away from AGI. In 20 years, people will look at ChatGPT and make fun of how cute it was, and we'll still be just a few years away from AGI.

People aren't blind, they are just realistic. Yes, current AI models will slowly be integrated into services and bring changes behind the scenes, but it's not going to be the big explosion of change a lot of people expect it to be.

[0] https://www.youtube.com/watch?v=jkOctWWsj-A


I find ChatGPT extremely helpful.

I had to do a piece of work in a domain barely known to me, so I didn't even know what to look for to achieve my goal. So I started with some generic questions, e.g. "how to do X using Y". It listed some steps and, most importantly, the terms used in that domain, so now I had something to do research with. Then I was asking deeper and more specific questions while using Google to cross-reference with more trustworthy sources. It helped me so much that I had a proof of concept working in a week.

If not for ChatGPT I'd probably keep postponing this forever.

I see ChatGPT as a hammer - a useful tool, but is a hammer going to replace a carpenter? I doubt it.


I engage with ChatGPT daily now on a number of topics. In general, I've been trying to search on Google and talk to ChatGPT about the same topic; of course I don't just put three words in, I try to use natural language with ChatGPT. (I had a stint in NLP systems for 5 years around 1997, so that might bias me a bit in how I engage.)

What has impressed me is some stuff like "Can you summarize this for me?" and "How would you parse the datetime out of this log entry in python3: {raw text}", "How could I make the following mysql query more readable?" etc.

At this stage, it's like when Stack Overflow came out. And yes, some SO stuff is crap, but once in a while, you get something that saves you 3 hours of your life. For example, recently, ChatGPT has saved me hours of poking around on topics I wanted to solve quickly without thinking so I could get to the high-value work that would get me closer to my goal.

That said, I am amused at how it can bald-facedly assert even mathematically incorrect things, and at how it gets lost if a thread gets a bit too long.

Something is happening here. I think it'll be a while before these things write code from reading a paragraph from a product manager or a less technical user's "use case".

What's interesting is to watch this pendulum swing back and forth, from expert systems coded by hand to neural nets and now these large language models. If the pendulum keeps swinging, it might land on your head one day if you don't pay attention.

All this said, I enjoy my interactions with ChatGPT more than most SO posts, so I continue to use it, and luckily I have 40 years of coding experience to help me identify where it's a bit off.

I have taken to pasting questions from my mentees into ChatGPT and sharing the result and suggesting they try ChatGPT to learn python3 in addition to SO and other tools. It seems to help them, I worry it will confuse them with a bald-faced lie, but I'm here to help when it does!


Shouldn't we just be viewing ChatGPT as a lens through which to view data? A more useful interface for searching and making sense of data.

If someone has reported this info on Stack Overflow, it will be reported here.

Garbage in, garbage out ..


No, we shouldn't. ChatGPT isn't a search engine, it's a language model.

The AI isn't getting this made-up information from anywhere on the internet; it's creating it itself because that's what it's made to do: generate good sounding sentences that "make sense" for some user input.


No. You're incorrect.

The data model is combining information that HAS been found on the internet.

The language model allows an interface that can be presented and controlled by using conversational natural language.


No, what you are writing is not correct. It can make up "information" on the fly. Its goal is to sound plausible, not to tell the truth.

Second, the "language model" is a text completion engine. What can be controlled using conversational natural language is ChatGPT, which is a conversational engine developed out of a language model.


I think you're grossly overestimating what is happening.

Why exactly have you placed information between quote marks?


I don't think I am: I just read the transcript of a quite funny ChatGPT conversation, where it was asked to explain, in Hungarian, about a Hungarian poet (Petőfi Sándor). It listed some true facts, then a list of his "most famous poems", including a bunch that don't exist. Then the user asked about a particular one titled "The Panther". It responded by explaining the poem: how its main theme is the fight for freedom (Petőfi was a big figure in the Hungarian revolution of 1848), and how the panther itself symbolizes freedom and self-sacrifice. It even quoted a few lines from the poem (it didn't rhyme, but otherwise was very plausible as something written by Petőfi).

See, "The Panther" is not a poem of Petőfi's. The quote was not from any other work of his. It's not from the Rilke poem (or any of its translations I know of) either. It's completely made up, but also made to sound very plausible, to the point where if I didn't know better or look it up, I could be convinced. The only thing suspicious is the lack of rhyming.

I put information in quotes because I don't consider made up stuff information.


How are you coming to the conclusion that it's invented information, rather than — reworded / rephrased — incorrect information that an actual human has provided online?


I can't exclude the possibility. But kind of in the same sense that I can't exclude the possibility that it's actually true. Sure, might be that I'm simply not aware of the poem, etc. I find this extremely unlikely.

I also find it extremely unlikely that someone on the internet invented this tale about "the panther" and chatgpt just rephrased it or quoted it. The internet is full of actual true lists of Petőfi's famous poems, but it isn't full of people inventing fake poems of his.

On a very theoretical level, sure, it's a rephrasing and combination of pieces of stuff ChatGPT has actually been trained on, because it has seen Hungarian text, it has learned stuff about Petőfi, it has seen every single word that it's using. But after a certain point, combining known words and text structures with bits of semantic knowledge in unexpected ways becomes a newly invented thing, instead of just quotes.

But you can see many examples of ChatGPT inventing things like URLs, Python libraries, etc. It's perfectly capable of bullshitting believably.


> I can't exclude the possibility. But kind of in the same sense that I can't exclude the possibility that it's actually true. Sure, might be that I'm simply not aware of the poem, etc. I find this extremely unlikely.

What you're describing is a hunch.


Except it is able to synthesize information. Which is more than just regurgitating content found elsewhere.

You can ask it to synthesize the main thoughts of a philosopher, and it will produce an original set of sentences that nobody has ever written anywhere else.


The sentences are original, the content isn't.


Are humans able to produce truly novel content, ever? I’d say in a given field it maybe occurs once in a decade, and we call those people geniuses. It’s a bit too much to ask of an AI, don’t you think?


> Are humans able to produce truly novel content, ever?

Yes.

This is the essence of human creativity.

> It’s a bit too much to ask of an AI, don’t you think?

I'm actually asserting that AI isn't able to do this, so I'd definitely agree with that statement.


As proposed by another HN user, a score system should be added to indicate the confidence the model has about its answers.

When I was a high school student, our physics teacher set up a test system where the student had to choose a confidence score, from 1 to 5, for every answer in an examination.

According to that confidence score, you were awarded or docked a specific number of points for every answer in the test.

For example at confidence '1' you were awarded 1 point if your answer was right and 0 if your answer was wrong.

At confidence '2', you were awarded 2 points if your answer was right and -1 if your answer was wrong.

And so on.

The consequence of this system is that you could score above 100% on the test even without answering every question, or while getting some questions wrong; below 0% even with one right answer; and only 60% if you answered every question correctly but played it safe all along, choosing a confidence score of 1 for each question.
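The scheme can be sketched in a few lines (the function name and input format are my own):

```python
def grade(answers):
    """Score a test where each answer carries a self-reported confidence 1-5.

    answers: list of (correct: bool, confidence: int) pairs.
    A right answer earns `confidence` points; a wrong one loses
    `confidence - 1` points, so confidence 1 is risk-free.
    """
    total = 0
    for correct, confidence in answers:
        total += confidence if correct else -(confidence - 1)
    return total
```

Note how a single wrong answer at confidence 5 wipes out a correct one at the same confidence, which is exactly the property that would penalize ChatGPT-style confident guessing.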


I had a similar bad experience. ChatGPT was telling me a function was short-circuiting; I read the docs and it was not mentioned. I asked again and it was still telling me that it would short-circuit if I used the example, so I demanded it create an example including print statements to prove it was correct. It created the example and also showed me output that supposedly proved it was right, but when I actually ran the example, the output was exactly what I was expecting.

So I confronted it one last time and it apologized. It probably needs some work to be able to answer "I do not know" from the start.
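(For anyone wanting to run the same kind of check: in Python the experiment boils down to something like this, where the prints reveal whether the second operand is ever evaluated.)

```python
def noisy(label, value):
    # Prints when evaluated, so short-circuiting is directly observable.
    print(f"evaluating {label}")
    return value

# `and` short-circuits: with a falsy left operand,
# the right operand is never evaluated.
result = noisy("left", False) and noisy("right", True)
# prints only "evaluating left"; result is False
```

The point is that you can settle the question empirically in ten seconds, rather than trusting either the model's claim or its fabricated "output".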


You can tell him he’s wrong and he always apologises even if you are wrong in saying he is wrong


I had an even weirder experience. I asked him to calculate the UK annual pension allowance given a specific adjusted income, and he correctly cited the current rule, explained it, and wrote the correct formula to calculate it. But somehow he managed to mess up the simple calculation, giving a wrong result even though the formula and the numbers he used were correct...
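For reference, the taper rule in question can be sketched like this (the thresholds are the widely published 2022/23 figures, used purely for illustration; the function name is my own):

```python
def tapered_annual_allowance(adjusted_income):
    """UK pension annual allowance with taper (illustrative 2022/23 figures)."""
    STANDARD_ALLOWANCE = 40_000
    TAPER_THRESHOLD = 240_000
    MINIMUM_ALLOWANCE = 4_000
    if adjusted_income <= TAPER_THRESHOLD:
        return STANDARD_ALLOWANCE
    # Allowance is reduced by £1 for every £2 of income over the threshold,
    # down to a floor.
    reduction = (adjusted_income - TAPER_THRESHOLD) // 2
    return max(STANDARD_ALLOWANCE - reduction, MINIMUM_ALLOWANCE)
```

Three lines of arithmetic, which is exactly the step the model fumbled while getting the prose explanation right.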


It is not designed to do math. It only generates a likely text response. If the exact equation (with exact numbers) did not often appear in its training data, then you are very likely to receive a nonsensical answer. It can do some math simply because those patterns have shown up quite often in its training set. The same goes for anything you ask. It will often imagine things like fake YouTube video links with real-sounding titles, fake function names, etc. I’m sure it will soon be hooked up to something like Wolfram Alpha and we’ll all be terrified at how well it performs at anything math-related, since it could likely classify and hand off any of the math stuff to Alpha and then utilize/summarize the results from there in its final response.


Well. Yes? It’s not a computational engine. All it does is figure out what the probabilities of the next tokens are and select the most probable ones.
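A toy sketch of that loop, with a hard-coded stand-in for the model's next-token distribution (the vocabulary and probabilities are entirely made up):

```python
# Toy greedy decoding loop: at each step, pick the highest-probability
# next token. The "model" here is a hard-coded lookup, purely illustrative.
FAKE_MODEL = {
    (): {"The": 0.6, "A": 0.4},
    ("The",): {"cat": 0.7, "dog": 0.3},
    ("The", "cat"): {"sat": 0.8, "<end>": 0.2},
    ("The", "cat", "sat"): {"<end>": 1.0},
}

def generate():
    tokens = ()
    while True:
        dist = FAKE_MODEL[tokens]
        next_token = max(dist, key=dist.get)  # greedy: most probable token
        if next_token == "<end>":
            return " ".join(tokens)
        tokens += (next_token,)
```

Nothing in this loop checks arithmetic or facts; "5 + 5 = 11" comes out whenever those tokens happen to score highest.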


Him?


Yes that’s what he said


He?


It is not very surprising that this kind of obscure knowledge is not handled nicely, since the data points are very sparse. The issue here is not ChatGPT's general usefulness, but that it usually doesn't say "Sorry, I'm not sure", since it currently lacks the ability to evaluate the confidence of its own answers against real-world data.

If we think of the model as an efficient-but-lossy compression of the real world, then it currently lacks a good way to do error detection. If we want more general intelligence, it should have a way to measure consistency between the input (or its understanding of the real world) and potential model output, along with a confidence interval. This is what we do every day as humans to make decisions. I guess we probably need a number of further breakthroughs to overcome this weakness. Error correction would be the next step, and a much harder problem.
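One crude proxy for such a confidence measure is the average per-token log-probability the model assigned to its own output; low values suggest the model was guessing at each step. A toy illustration (the probabilities here are invented):

```python
import math

def mean_logprob(token_probs):
    """Average log-probability of a generated sequence. Higher means the
    model found its own output more likely; token_probs are the
    model-assigned probabilities of each emitted token (invented here)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

confident_answer = [0.9, 0.95, 0.8, 0.9]  # model was sure at each step
hedged_answer = [0.4, 0.3, 0.5, 0.2]      # many near-coin-flip tokens
```

This only measures fluency-confidence, not factual correctness, which is exactly why it's a proxy and not a solution to the "confident bullshit" problem.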


I mean, what did the author expect? A really esoteric question that few people, even if they are coders, would understand or even parse, and it managed to give a response that made him go "Woo! That sounds perfect." That alone would have been complete science fiction just a couple of years ago.


The really crazy part is that if it were trained on more domain-specific example data then I think it would give pretty great answers to many of the niche questions in those domains. I fully expect that most businesses / industries will have tailored training datasets in the next few years to enable chatbots that assist their employees to do their work. It is at least a partial answer to the high turnover problem that business now face.

I have probed it at varying levels of depth in my domain and found it to answer at least as well as an entry-level engineer. This could easily be improved with extended training data. The problem is that when it begins to get things very wrong (or leave out important details), it would be difficult for an inexperienced user to tell, since ChatGPT responds so confidently to every query.

It will be interesting to see how this all plays out.


> I mean, what did the author expect?

Presumably for it to reply with its standard "I'm an LLM from OpenAI and don't have access to the internet" instead of "Just use this thing that doesn't exist", followed by "Here's more info on this thing I've totally not made up", finishing with "Here are the references that I've just made up to explain the other thing I've just made up", before finally saying "You got me, I made that stuff up".

Using it as-is to support humans is probably going to work well, using it as-is to replace humans is going to lead to some issues.


Imagine reading your comment just a couple of years ago. Wild how the expectations have skyrocketed. Sad that people fall back to being casually dismissive like it's nothing special without appreciating the massive leap in progress.


I guess part of programming is imagining the API you want before you go find out what's there and this is doing that.



