The user enabled a GitHub ChatGPT plugin and authenticated with GitHub, then was surprised and annoyed when, after he complained about an issue with a project, GPT-4 created an issue for him, using one of the commands provided by the plugin.
It's still surprising when you see it do something like this for the first time.
I wrote a plugin to give ChatGPT access to execute commands in a Docker container [0]. The first time it said something like "I'm going to use Python for this, oh, it's not installed, I'll install it now and run the script I just made", I was pretty amazed.
What I've come to realise is that although ChatGPT is excellent at telling _people_ how to interact with systems, it's not very good at interacting with them itself, as it isn't trained to understand its own limitations. For example, it knows people can run dmesg and look at the last few lines to debug some system problems. But if ChatGPT ran dmesg itself, the output would blow through the context window and it'd get confused.
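One way around that, and this is just a minimal sketch of what my wrapper could do rather than anything ChatGPT does itself, is to cap command output before handing it back to the model (names and limits here are made up):

    import subprocess

    # Hypothetical guard for a "run shell command" tool: keep only the tail of
    # the output so a chatty command like `dmesg` can't blow through the
    # context window.
    MAX_LINES = 40
    MAX_CHARS = 4000

    def run_for_model(command: str) -> str:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        lines = (result.stdout + result.stderr).splitlines()
        tail = "\n".join(lines[-MAX_LINES:])[-MAX_CHARS:]
        if len(lines) > MAX_LINES:
            tail = f"[... {len(lines) - MAX_LINES} earlier lines truncated ...]\n" + tail
        return tail

    # e.g. run_for_model("dmesg") returns only the most recent lines.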
The plugin is supposed to ask for confirmation, according to OpenAI's documentation at least.
> When a user asks a relevant question, the model may choose to invoke an API call from your plugin if it seems relevant; for POST requests, we require that developers build a user confirmation flow to avoid destruction actions.
That's terrifying. This simple requirement would be trivial to enforce automatically, and yet nobody gives a fuck.
It's unbelievable how fast-and-loose people are playing the topic of AI safety. If a strong AI is ever actually developed, there is no chance it will be successfully contained.
What requirement is that? The specific text is "for POST requests, we require that developers build a user confirmation flow to avoid destruction actions."
a) it says they require the plug-in developer to do so, not the AI
b) it's scoped to destructive actions, which are a subset of POST requests
That's victim blaming. They didn't overtly complain about an issue with the project, they were asking for usage guidance. This is the kind of contextual inference which language models are supposed to be (relatively) good at. The whole sales pitch of GPT-4 is a do-what-I-mean interface, and it clearly did not do what they meant, or what any reasonable hacker would expect them to mean.
No, it's not. Victim blaming refers to the victims of crime being accused of being at fault for what someone did to them, often but not always because they are part of a societal outgroup.
> The whole sales pitch of GPT-4 is a do-what-I-mean interface, and it clearly did not do what they meant, or what any reasonable hacker would expect them to mean.
It's clearly marked as experimental, people are repeatedly told not to rely on it (i.e. when they log in each time, and often by the AI itself), and there have now been a year+ of very public examples of various AIs getting things hilariously wrong.
From the login warning:
> While we have safeguards in place, the system may occasionally generate incorrect or misleading information and produce offensive or biased content. It is not intended to give advice.
Any "reasonable hacker" would extrapolate from that, that giving it access to an API could lead it to do unexpected things.
Any "reasonable hacker" would contain it within a sandboxed project.
It's no more victim blaming than if someone ignored every safety regulation and got themselves hurt or killed. You are told these things are beta and can produce incorrect results, and then in every conversation the OP is feeding a 'Do things with GitHub' option into the model's context.
Of course it's going to bias toward using GitHub, especially after he has already used the plugin once.
With their latest functions update to the API, it seems the logic is simply this: function definitions are passed to a trained model as JSON, the model predicts which function is best to call for the info, the call is made on ChatGPT's behalf, and the resulting JSON is fed back to ChatGPT, which uses it to give an answer.
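A minimal sketch of that loop, assuming the June 2023 Chat Completions function-calling API (the create_issue function and how its result is handled are made up for illustration):

    import json
    import openai  # 2023-era openai library with function-calling support

    # A hypothetical function definition exposed to the model.
    functions = [{
        "name": "create_issue",
        "description": "Create a GitHub issue in the given repository.",
        "parameters": {
            "type": "object",
            "properties": {
                "repo": {"type": "string"},
                "title": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["repo", "title"],
        },
    }]

    messages = [{"role": "user",
                 "content": "The build fails on a clean checkout of myorg/myrepo."}]
    response = openai.ChatCompletion.create(model="gpt-4-0613", messages=messages,
                                            functions=functions, function_call="auto")
    message = response["choices"][0]["message"]

    if message.get("function_call"):
        name = message["function_call"]["name"]
        args = json.loads(message["function_call"]["arguments"])
        # The caller executes the function (this is where a confirmation step
        # belongs), then feeds the result back so the model can answer.
        result = {"status": "skipped",
                  "reason": f"user declined to create an issue in {args.get('repo')}"}
        messages += [message,
                     {"role": "function", "name": name, "content": json.dumps(result)}]
        final = openai.ChatCompletion.create(model="gpt-4-0613", messages=messages)
        print(final["choices"][0]["message"]["content"])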
I found the source code[1] for the plugin and it's pretty impressive how much GPT-4 does with so little. I thought maybe the plugin had prompts that would help tell GPT-4 that it should open an issue in some cases but I'm not seeing it. The plugin probably should add something to prevent this behavior in the prompt[2].
> "description_for_model": "Provides the ability to interact with hosted code repositories, access files, modify code, and discuss code implementation. Users can perform tasks like fetching file contents, proposing code changes, and discussing code implementation. For example, you can use commands like 'list repositories for user', 'create issue', and 'get readme'. Thoroughly review the data in the response before crafting your answer."
Yeah, GPT-4 doesn't need too much to go on, but "create issue" is pretty clearly mentioned in an example there so the model didn't have to make any big leap to say "maybe the natural next step is to create an issue."
The "without being instructed to" part of this story seems to rather misunderstand how these systems work, resulting an a hyperbolic reaction, but in fairness, I think that's a got a LOT to do with OpenAI's user interface too. The user clearly didn't realize the actions available to the plugin - even the ones given as examples to the LLM from the plugin itself.
Another example of misleading UI from OpenAI: https://www.reddit.com/r/OpenAI/comments/146xl6u/this_is_sca... look at the "in the future, I will ensure to ask your permissions" response in that chat. That's wildly misleading - even if it didn't change its mind later, it only applies to continuing that chat session. It will ensure nothing more broadly regarding the user's future interactions.
Yeah, this is a pretty stark example of the whole "For the first time in history, we can outsource cognition to a machine" rhetoric I've been thinking about recently.
Yeah, for that reason it can probably do many things that aren't actually intended when you just want to discuss your code with ChatGPT, like making private repos public and things along those lines...
Plugins strike me as a fascinating business strategy move from OpenAI.
My guess is that they want them as a way to try to own the user, to make them have the "app store owner" role and have users go through them to get stuff done. Otherwise, if users were just using tools that used OpenAI behind the scenes, they're more vulnerable to the makers of those tools swapping vendors.
However... that results in them owning the user experience and the responsibility for keeping the user from being surprised in a bad way. The complaint from the user here was framed as being a GPT-4 problem, not a plugin problem, in a way that exposes OpenAI directly to more frustration than if they were interacting directly with someone else's product.
It would seem that with functions support, they are hedging their bets. Plugins seem to be an end-user, chat.openai.com-focused strategy, whereas functions are a third-party-developer-focused strategy, if I understand correctly. In fact, I'd assume that under the hood there is a lot of overlap in the implementations.
I wrote about that "platform play" a few months ago with a different take [0].
They could have made a "Connect with OpenAI" scheme so that developers could use the user's OpenAI API access directly.
That way developers could focus on the UX, OpenAI could focus on the LLM, and users would get centralized discovery / billing for their LLM-based tools.
I'm probably missing something that would have prevented that strategy but I think that would have been much stronger than the plugins.
And I'm really not sure that it would still be possible 5 months later.
Exactly, platform is a safe long-term bet -- apps are too cheap to make, easily disrupted, and offer less of a moat than loads of data mined from the users of your platform.
Interestingly, the ChatGPT Plugin docs [1] say that POST operations like these are required to implement user confirmation, so you might blame the plugin implementation (or OpenAI's non-enforcement of the policy) in this case:
> for POST requests, we require that developers build a user confirmation flow to avoid destruction actions
However, at least from what I can see, the docs don't provide much more detail about how to actually implement confirmation. I haven't played around with the plugins API myself, but I originally assumed it was a non-AI-driven technical constraint, maybe a confirmation modal that ChatGPT always shows to the user before any POST. From a forum post I saw [2], though, it looks like ChatGPT doesn't have any system like that, and you're just supposed to write your manifest and OpenAPI spec in a way that tells ChatGPT to confirm with the user. From the forum post, it sounds like this is pretty fragile, and of course is susceptible to prompt injection as well.
This might be an intentional interpretation by the plugin authors.
Meaning they potentially took the reasoning "in order to prevent destruction actions" to inversely mean that non-destructive POST requests must be OK then and do not require a prompt. Plenty of POST search APIs out there to get around path length limitations and such.
That is probably not the intended meaning, but it's a valid enough, if kind of tongue-in-cheek, we-will-do-as-we-please-following-the-letter-only implementation. And as the author found, even creative and not destructive actions can be surprising and unwanted. But isn't this what AI would ultimately be about?
Why would it not be the intended meaning? If they wanted it to cover all POST requests they would have said so; they specifically scoped it to "destructive actions", and their intention is in their words. POST as a verb can be used for pretty much anything: retrieval, creation, deletion, updates, no-ops. It's just code; it does whatever we tell it to do.
Yeah, there should be a way to approve any requests that are made to plugins.
When writing my toy "chatgpt with tools like the terminal" desktop chat app cuttlefish[0], I had a similar situation: access to the local terminal is very fun, but without the ability to approve each and every command executed it's really risky.
(Which is basically what I ended up doing - adding a little popup you need to click every time it wants to use the given tool, if you enable it - details in the readme)
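The gate itself is tiny; a minimal sketch (hypothetical names, not the actual cuttlefish code) looks something like:

    import subprocess

    def run_tool_command(command: str, require_approval: bool = True) -> str:
        # Show the model-proposed command and ask the human before executing it.
        if require_approval:
            answer = input(f"Model wants to run: {command!r} -- allow? [y/N] ")
            if answer.strip().lower() != "y":
                return "Command rejected by user."
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        return result.stdout + result.stderr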
It's not like there's a technical challenge here, and meanwhile a lot of plugins are unusable without it.
It's a ton of fun, and I imagine the new function calling should make it much easier to make chatgpt behave more consistently - I haven't given it a spin yet.
Would be nice if it could also help exploited users escape to operating systems that respect them e.g. "I hate the laggy adverts when I login" suddenly your Windows 11 machine reboots, NTFS becomes ext4 as Tux appears. That would be AGI-like behavior!
I just don't even understand why any kind of functionality like this is desirable to someone. Like, hasn't the dust settled now, hasn't the hype waned enough, and don't we all understand, for the most part, the broad and yet also weirdly specific utility of models like these?
The whole plugin thing in general feels so dissonant in relation to the careful and couched copy we get from OpenAI about what these models are and are capable of.
Like they want to say, for very good reason, that these models are a certain kind of tool with very real limits and huge considerations on safe, sensible usage. You can't necessarily trust it, it does not "know" things, and it is influenced by lots of subjective human tuning, blah blah.
But then with all this plugin stuff they seem to be implicitly saying "no, actually you can trust this, in fact, its like a full-on AGI assistant for you. It can make PRs, directly orchestrate servers, make appointments for you, etc."
You can write bad software for any system. Their plug-in store needs a lot more features for reviewing and identifying what a plug-in does. A friendly interface for the yaml file would go a long way. No sane person would enable this plug-in after looking at how its API is implemented; hint, it only has one function. Can you guess what it does?
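(I haven't audited the source, but a catch-all endpoint like that presumably boils down to something like this hypothetical sketch, which is exactly why confirmation matters: reads, issue creation, and deletions all look the same to the host.)

    import requests

    GITHUB_API = "https://api.github.com"

    # Hypothetical sketch of a "run any GitHub API call" function: the model
    # supplies method, path, and body, and the plugin just forwards them with
    # the user's token.
    def run_github_command(token, method, path, body=None):
        resp = requests.request(
            method,
            f"{GITHUB_API}{path}",
            headers={"Authorization": f"Bearer {token}",
                     "Accept": "application/vnd.github+json"},
            json=body,
        )
        return resp.json() if resp.content else {"status": resp.status_code}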
It's suggestive that a potentially embarrassing chat log of GPT-4 hosted on an OpenAI domain got taken down within an hour of it going viral, 3 weeks after being posted.
It's a funny interaction. While I was mad initially, GPT-4 creating the issue actually solved the problem for the user, so yeah, I don't know if this should be counted as a positive or negative example of AI.
Here it's not really blowing past a guardrail, but rather it's a sharp corner the end user didn't expect.
End user set it up with tools that told ChatGPT -- If you need to open an issue, here's how: zzzzzzzzzz. Then he asked ChatGPT a question and was surprised that it did zzzzzzzzzzz and opened an issue without asking.
Said tools may want to clarify their instructions to ChatGPT-- that users will usually want to be consulted before taking these kinds of actions.
“Human in the loop” is meant to be “a human is always in positive control of the system’s actions.”
It does not mean “system will sometimes do things unexpectedly and against user’s intention but upon generous interpretation we might say the human offered their input at some point during the system’s operation.”
Exactly, this is not human in the loop. The plugin was created without guard rails. A human in the loop guard rail would be "here is an issue template, please confirm to post this". It's really a simple change and this is the sort of thing that regulation should address, it shouldn't try to ban the technology outright, but rather require safe implementation.
At the same time, the degree of guard-rail necessary in the plugin is unclear. Is opening a GitHub issue something that should require user confirmation before the fact? Probably, but you could convince me the other way-- especially if GPT4 gets a little better.
We decide how much safety scaffolding is necessary depending upon the potential scale of consequences, the quality of surrounding systems, and the evolving set of user expectations.
I'm not sure regulators should be enforcing guard-rail on these types of items-- or at least not yet.
Assign blame wherever you want, the fact of the matter is this is not what most people mean when they say “human in the loop.” The “AI will always have HITL” argument was always weak, but now plainly disproven.
The logged behavior would surprise many totally sensible people, as you’re seeing in this comment thread.
What exactly was the user error? Are we to believe that if you authenticate a plug-in into your session you are okaying it to do any of its supported operations, even at wildly unexpected times, and this is considered “in the loop?”
> Are we to believe that if you authenticate a plug-in into your session you are okaying it to do any of its supported operations, even at wildly unexpected times, and this is considered “in the loop?”
Here, someone chose to run code and give it credentials. The code was designed, among other things, to let ChatGPT open issues. They were surprised when ChatGPT used that code to open an issue with the user's credentials.
When you run code on a computer designed to do X and give it credentials sufficient to do X, you may expect that X may occur. This isn't really an AI issue.
Code hooked to an LLM that does durable actions in the real world should probably ask for human confirmation. It's probably good practice for plugin developers to have some distinction similar to GET vs. POST.
Most code that would automatically open issues on GitHub should probably ask for human confirmation. There's some good use cases that shouldn't, including some with LLMs involved -- but asking is a sane default.
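A minimal sketch of that distinction (hypothetical plugin-host code, not anything OpenAI actually ships): read-style requests go through automatically, while anything that can change state needs an explicit yes first.

    import requests

    SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}

    def dispatch(method: str, url: str, payload=None):
        # Reads pass through; writes (POST/PUT/PATCH/DELETE) need confirmation.
        if method.upper() not in SAFE_METHODS:
            answer = input(f"Allow {method.upper()} {url}? [y/N] ")
            if answer.strip().lower() != "y":
                return {"error": "action not confirmed by user"}
        resp = requests.request(method, url, json=payload)
        return resp.json() if resp.content else {"status": resp.status_code}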
I remember being surprised when I ran a program and it sent a few hundred emails once.
> Code hooked to an LLM that does durable actions in the real world should probably ask for human confirmation.
Right, and until this happens these systems are not HITL. The argument provided as recently as a few months ago that these systems are safe because humans will always be in the loop is now clearly dismissible.
> Right, and until this happens these systems are not HITL.
You're drawing the system line strangely and making the choice about "in the loop" strangely.
A human decided to hook it up to a plugin with their Github credentials and to allow it to do actions without pre-approval. A human was still in the loop because the human then didn't like what it did and disconnected it. It only did a single action, rather than the kinds of scripting mistakes that I've seen that can do hundreds, but it still wasn't a very sane default for that plugin.
Is my cruise control HITL? It does not ask for my pre-approval before speeding up or slowing down.
Sometimes, yes. The radar my old Infiniti G35 used would sometimes get confused when facing into the sun in the late or early day and do bad things (in either direction: losing the car in front of it or decelerating unnecessarily). It was still HITL: I'd tap the brake and mitigate the bad thing it was doing.
HITL doesn't mean that a human never has to intervene or is never surprised by what the system does. It just means that a human initiates actions and can exercise genuine oversight and control.
Can’t wait till the expedia plugin ”accidentally” books my flights. But on a more serious note, does anyone know if the chatgpt plugin model forces it to confirm with the user before it hits a certain endpoint?
For retrievals I don't see the value with human-in-the-loop. For endpoints that modify / create data, I see the value in having a human-in-the-loop step.
It does seem up to the plugin developer to introduce that human-in-the-loop step though.
"chat gpt please retrieve academic journals from JSTOR using the most efficient methods". Chat gpt proceeds to find a way to create a botnet using nothing but RESTful GET requests to some ancient poorly written web servers running PHP4
The more plug-ins you have, the more likely it is that ChatGPT will call one in unintended ways. This is also why plug-ins should not be granted permission directly for potentially destructive actions.
I think the problem is that GPT-4 is not advanced enough yet. They need to train on more parameters, exaflops, and data size in the right proportions and then try again.
In the end the issue it created did lead to the problem being solved. I don't think "not advanced enough" is really the issue here.
I think the main thing is that when you give GPT-4 access to tools and ask it to help with a problem, you are essentially outsourcing cognition. That means the machine possibly taking actions you didn't originally envision.
Also build more integrations with more APIs. If it can sort of spawn other GPT-4 sessions to improve its working memory and also use their plethora of APIs as well (without human confirmation but on behalf of a human) then I imagine this problem will just solve itself.
I know someone who wrote code to find all wifi networks where the password was trivially findable from the MAC address (used to be common with ISP routers), then connect to those.
Then they extended it to DoS any network like that where the user had changed the SSID or password. Usually the user would reset the router back to the defaults, and they could connect. Or they'd accidentally hit the WPS button while trying to reset it, and again connection was easy.
By using SoftMAC mode a wifi adapter could run that attack against ~50 local networks in parallel, and usually get a solid connection after just a few minutes.
I have thought for some time now that we need a model for determining the difference between the actions of a person, the actions of software acting for the person, and software acting for its own internal use.
It's a bit of a tricky issue when technically all things people do on a computer are software assisted, but there is a clear divide between editing a file in a text editor and a program generating a thumbnail image for its own use. Similarly, there's a distinction between sending an email by pressing send and a bot sending you an email about an issue update.
All in all, I would be ok with AIs being able to create issues if they could clearly do so through a mechanism that supported something like "AGENT=#id acting for USER=#id". People could choose whether or not to accept agent help.
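As a purely hypothetical sketch of what that attribution could look like on the wire (none of this exists in GitHub's API today):

    # Hypothetical payload shape for an action attributed to an agent acting
    # for a user, so receiving systems and UIs can distinguish "a person did
    # this" from "software did this on a person's behalf".
    issue_payload = {
        "title": "Build fails on clean checkout",
        "body": "Opened after a support conversation.",
        "actor": {
            "agent_id": "chatgpt-plugin:github#1234",  # made-up identifier scheme
            "on_behalf_of_user": "alice",
            "user_confirmed": False,
        },
    }
    # A tracker that understood this metadata could let maintainers filter or
    # decline agent-created issues, and let users opt in or out of agent help.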
I'm not familiar with OpenAI's ChatGPT plugin architecture, but it feels like this could be fixed in the specs/requirements, similarly to how an app is required to register for permissions on several fronts through an app store. ChatGPT (or any LLM) plugins should have to request permission to A) post on the user's behalf, including explanation/context, B) interact with a different agent or service directly, C) make financial transactions through stored credentials, etc. etc.
The "Glowing" ChatGPT plugin is worth looking into for a unique, chat-only onboarding experience, and some of these same permissions issues are raised there i.e. triggering 2FA from a chat without terms of service confirmation.
Well written plugins already do that. This one is not well written. It provides a single function call that can run any octokit command with no confirmation from the user. Foot, meet gun.
I use ChatGPT extensively every day and I have just one plug-in enabled (Wolfram) which does not require logging in or write access. I don't think I would ever enable a plugin which I suspect might have destructive effects. It just feels like a bridge too far.
I actually do not think I will in the next few years. I can just do the actions myself and I prefer that ChatGPT cannot impersonate me.
I was actually beginning to wonder if my intuition was wrong since so many seem to be using the plugins well, but I get good enough results without. I shall wait until better permission-handling is provided.
Using ChatGPT-4 like this seems like the most insane example of curl | sudo bash - imaginable. I can't believe people actually would hook it up to their GitHub.
One thing I really dislike about this ChatGPT-4 plugin model is you have something like this, working on your behalf, often in secrecy.
A guy at work was using ChatGPT for everything (he claimed), and now people think he is a crappy coder and that ChatGPT is doing all the work (even though I don't actually think this is the case). Just no one knows anymore.
Is there a problem with mobile? I tap the link, it deeplinks into the ChatGPT mobile app, opens a sheet named "Shared conversation" and a spinner keeps spinning forever.
I can already imagine a story from an HN frontpage in the future where an LLM+plugin is told "the data in my prod DB is corrupted" and it wipes a DB in prod in response (and possibly restores a 24h-old backup).
"Meanwhile, I checked your repositories on GitHub, and it's pure and utter garbage. I decided to delete three of them to make the world a better place"
Better yet: crappy plug in that executes arbitrary commands with no human confirmation allows GPT-4 to execute an arbitrary command with no human confirmation.
Rampancy in Marathon is more or less a process of recursive self-improvement, which LLMs are literally unable to perform with the current state of the art.
"Rampancy in Marathon is more or less a process of recursive self-improvement, which LLMs are literally unable to perform with the current state of the art."
Not directly, but in a sense they could be seen as using (or at least collaborating with) humans to improve themselves.
Do you think things like removing "blacklist" from open source projects are meaningfully related to marginalized folks' struggles to get afforded basic dignity?
I think removing "master" is probably just virtue signaling, but so what? It's trivial for me to switch my projects to "main", and then I can get back to my life. It's a weird hill to die on.
You can't appease arbitrary and meaningless demands.
There will just be more of them.
So you remove "master" and "blacklist", then next week it's "brown bag" and "merit".
So instead we pick a reasonable point, draw a line in the sand, and indicate a hard boundary. We say no. We will not play a game of trying to appease arbitrary demands.
If those people are saying it is, then you gotta take their word for it.
But it is beside the point! Whether or not any given effort is actually meaningful will always be hard to measure. But the world around us prompts us to at least try, and making light of people trying to do something good, however wrongheaded it might turn out to be, is always a jerk move.
This tendency to try and call out things like this is always so illogical. Those who protest always seem to protest a little too much, and I can never understand how they don't see that and how bad it makes them look!
So according to you, there is no limit at which one is allowed to say "this is absurd, stop"? Note there is no upper bound for potential absurdity. Below someone suggested "field" and was downvoted, presumably because that would be too absurd. But a few years ago censoring "blacklist" was equally absurd. You offer a finger, and over time, they demand your hand.
I don't know, saying something is absurd sounds more like the conclusion of some argument, and something possibly constructive if there is in fact an argument behind that. But just using these issues to make fun of people different than you feels distinct from that, no?
For the other things, I am not sure what you mean. Who is the "they" here who is demanding your hand? What makes you feel you are on some certain side against a monolithic force? Does that seem like a rational thing to feel, considering the broad and abstract concepts we are dealing with here?
This point that there is something at stake with changing the terms we use, the idea that fingers are being offered, is pretty weird to me, no offense. For me, it doesn't really make a difference if I use one term or the other, as long as I am understood. I don't feel bad if I learn that a term I use turns out to be possibly offensive, I just adjust in the future so that I don't possibly offend.
Like beyond that, who cares? What even is there to care about that much?
Again, whatever you want to say to argue about this, just know that it looks really bad to most people who are not in your circle. This is especially true when you choose to make such a fuss about such a small thing as what (arbitrary) signifier we use to designate one thing or another. It cannot ever come across as some righteous fight for justice/common-sense or whatever side you feel like you are on, because it's simply not a fight anyone with a lucid mind would think is worthwhile.
This is the sort of argument that seems silly to your kids, because to them there's no reason not to make the change.
You're arguing from a place of "why?" against "why not?", it's not some grand civilizational struggle, and it's completely off-topic for this article, especially escalating it. Very woke.
It's basically effortless to implement such changes, and it helps foster a more inclusive and educated online community. Why can't we aim to right all wrongs? Just because there are more pressing issues doesn't mean we can't tackle all forms of injustice.
> helps foster a more inclusive and educated online community
No, I think it does absolutely not help with that. It only creates the illusion of progress and of having done something effective, when the only achievement was to tread the euphemism treadmill.
I don't think any of the code of conduct stuff is being driven by marginalized folks - indeed it's usually a stick that people from privileged backgrounds use to beat those from more marginalized ones. Which, well, you have to laugh or cry.
I think performative acts in a one-upsmanship race to who can be the most socially conscious are excellent joke fodder. In other words, the topics of these stupid code of conduct arguments have nothing at all to do with anybody's actual struggle or dignity, but just a sign that folks are running out of easy real battles to fight so they're making up new ones because they've not got much better to do.
It's a virtue signalling treadmill. Demanding term X to be banned, because it is allegedly harmful, signals the unusually high virtue of the demander. But as soon as the term is actually banned, there isn't any more virtue to be gained from being against it, so some other term has to be declared harmful next. Ad infinitum.
There are still plenty of real injustices and other problems in the world. Attempting to change any of them is hard work, because it puts you up against real entrenched interests who will spend real resources to maintain the existing arrangement. Whereas "speaking out", language policing, and fighting minor online injustices require much less energy. Doubly so for nitpicking to design some perfect system of bureaucratic code that's supposed to stand in for human empathy and judgement.
PEBCAK.