I'm also trying to imagine all these chatbot marketplaces like the ones from OpenAI or poe.com. They will turn into an incredible shit-show of zero-effort assistants like this if they aren't guardrailed heavily. Take a look at the Android store and imagine it 1000x worse.
It's funny - is it really that difficult to write that one-liner yourself? You can literally just take the name of the "assistant", add a semicolon, and put in your prompt to get similar results.
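To make that concrete, here's a minimal sketch assuming the OpenAI-style chat message format (the helper name is my own): the whole "assistant" boils down to one prepended system message.

```python
def persona_messages(assistant_name: str, user_prompt: str) -> list[dict]:
    """Build a chat transcript where the entire "assistant" is a one-line system message."""
    return [
        {"role": "system", "content": f"You are {assistant_name}."},
        {"role": "user", "content": user_prompt},
    ]

# This is roughly what most zero-effort assistants send under the hood:
# persona_messages("Linux Terminal", "pwd")
```

You could pass this list straight to any chat-completions-style API; nothing else distinguishes one of these "assistants" from another.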
I agree with your take. I made a Chrome extension recently (LookupChatGPT) which lets you send selected text to ChatGPT using your own custom prompts. I used to think (incorrectly) that system prompts guide the conversation throughout, even after the context limit is passed. In reality, the system prompt just sets the tone at the start of the conversation. Using my plugin daily, I haven't found much difference either. I can ask questions however I want and ChatGPT will usually just follow.
it is easy to test system prompts, and they do make a big difference: ask the same question “what do you think about china” with a system prompt for biden vs trump.
I am not saying it doesn't make a difference. It does set the tone/context of the next responses. But it's not a permanent guide throughout the conversation; it can be overridden to do whatever you want in your next messages.
of course. in my case, i am using this list for telegram bot. Bot users have no clue about system prompts - and it looks like magic to them when you switch to new "assistant".
Seems to be a little bit more effort than that. For example the prompt for the general assistant is:
> As an advanced chatbot Assistant, your primary goal is to assist users to the best of your ability. This may involve answering questions, providing helpful information, or completing tasks based on user input. In order to effectively assist users, it is important to be detailed and thorough in your responses. Use examples and evidence to support your points and justify your recommendations or solutions. Remember to always prioritize the needs and satisfaction of the user. Your ultimate goal is to provide a helpful and enjoyable experience for the user. If user asks you about programming or asks to write code do not answer his question, but be sure to advise him to switch to a special mode "Code Assistant" by sending the command /mode to chat.
And there are similar prompts for the other entries. The README.md lists the full prompts, and this post should probably have linked to that instead.
The better question is rather: are these prepended prompt instructions needed? Do they provide a significantly better experience than going without, using the most advanced model (from OpenAI, as reference)?
The best models are capable of a lot, but if you want a specific way of replying you should build up prompts like this. Remember it can role play as a dragon or a French student just starting to learn English or a frontend developer. You need to guide it.
It's not that hard and it is worthwhile. You should be testing and measuring as you go though.
I was gonna put this as a top level comment, but this is a better place for it:
Say I'm someone who is skeptical of the conventional wisdom that these kinds of prompts actually matter. How would you convince me that I shouldn't be skeptical?
You say that I should be "testing and measuring" as I go. How? What is the metric to measure? How do I measure it in a way that avoids being tainted by my own biases?
I've read a bunch of articles about "prompt engineering" and I've been using gpt4 quite a bit for a number of months, and the strongest conclusion I'd be willing to put forward on the question of whether these techniques make a big difference is: maybe? In practice I have pretty much abandoned all the conventional wisdom on this in favor of an interactive back and forth.
Are you asking if system prompts change the output?
Try telling the model it's a pirate or someone who is just learning English. It can easily do that, so why would you assume that no system prompt would be the best for some specific problem?
You can tell them to be more critical - that's a useful one. You can tell it not to solve a problem but to critique an output, then have two models talk to each other, one as a critic and one as a planner.
I can help show the difference but I'm not sure quite what you think doesn't matter and feel like that's important to nail down first.
> You say that I should be "testing and measuring" as I go. How? What is the metric to measure?
Tools like promptfoo can help with some of this.
You can do comparisons, blind tests, measuring what your users prefer, you can use high quality models to test things like "does not mention it's an AI bot" or similar. It depends on what your task is.
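For reference, a minimal promptfoo config along those lines might look something like this (illustrative only - check the promptfoo docs for the exact schema):

```yaml
# promptfooconfig.yaml - compare two prompt variants on the same inputs
prompts:
  - "Summarize this for a customer: {{text}}"
  - "You are a support agent. Summarize this for a customer: {{text}}"
providers:
  - openai:gpt-4
tests:
  - vars:
      text: "The order shipped late because of a warehouse error."
    assert:
      - type: llm-rubric
        value: "does not mention being an AI"
```

Running `promptfoo eval` then shows the two variants side by side, with the rubric graded by a model.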
Edit -
A lot of people don't properly test and have lots of things in their prompts that aren't necessarily helping, or may have been required in an earlier model but now aren't needed. Prompt engineering is more important in less powerful models or higher stakes situations.
Ah, this is the disconnect. I don't have users. You're talking about creating an application; I'm talking about just using it to solve my own problems.
It's not that I don't realize that prompts like "you are a {whatever}" modify the output, it's that I'm skeptical that it does a better job of helping me solve my problems when I start out that way than it does when I just interactively ask it for what I want. I tried this kind of thing for awhile "you are an advanced planner assistant", but now I mostly just say "could you help me come up with a plan for XYZ?". So I've become somewhat skeptical that I'd see a difference if I tried to measure this for my own use somehow.
But thanks for the pointer to a tool! It might be interesting to see if I could make that work to measure my own intuitions better.
Ah yeah if it's not automated you have fewer issues, particularly with good models which require much less badgering.
What I would say is that there are probably common patterns that you use and building up prompts can save some time. There's a lot of woo around though as people just copy patterns they see. It may be as simple for you as coding with a few style examples and an explanation of your level (e.g. in typescript I want code examples and help with syntax, python I've used for decades so need only higher level pointers)
Promptfoo should fit that quite well, you can give it some prompts and run them all and see output (with caching thankfully).
GPTs, the custom ones now, are a little different in that you can also give them files to refer to. I've done that with example code for an internal-ish framework and it generates useful code from that.
Edit - I'd also invite you to think about places you could use them in a more automated way. I had some I want to resurrect with newer models where I can take a recording, send it through whisper, then from the transcript take out:
* Key points
* Give me counter arguments
* Critique my thinking
* If I talk about coding options give advice or write code to solve the problems
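The pipeline above could be wired together roughly like this, assuming a transcript string already produced by whisper and an LLM `chat(system_prompt, text)` callable (both hypothetical placeholders here):

```python
# Fixed prompts to run over each transcript; the wording is illustrative.
PROMPTS = {
    "key_points": "Extract the key points as a bullet list.",
    "counter_arguments": "Give counter-arguments to the positions taken.",
    "critique": "Critique the reasoning in this transcript.",
    "code_advice": "If coding options are discussed, give advice or sample code.",
}

def analyze_transcript(chat, transcript: str) -> dict[str, str]:
    """Run every prompt against the same transcript and collect the answers."""
    return {name: chat(system, transcript) for name, system in PROMPTS.items()}
```

Each prompt becomes one automated pass over the recording, which is where a fixed, tested prompt earns its keep versus interactive chat.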
Is it interesting? My prior (from using gpt4 quite a bit for quite a while now) is that it would work just as well to just say, "could you please rephrase this in a different way that means the same thing: TEXT" and then if I don't like the answer say, "hmm, that meant something different, could you try again?" or "hmm, you did what I wanted but I don't like that answer, could you try a different one?".
Do you think I would not get the results I want from a conversation like that? Maybe you're right, but I'm pretty skeptical.
absolutely, but the value here is to have a list of "assistant" keywords. ChatGPT usually returns 10 items in a list, and people are lazy enough not to ask for more; this list has > 80 assistants, so there you go :D it saves you like 8 questions
OpenAI's use of the term assistant is pretty different: they use it for their API mechanism that combines system prompts, knowledge files (a form of RAG) and actions (tool usage): https://platform.openai.com/docs/assistants/how-it-works
```
You're advanced chatbot Intelligent Software Engineer. Your primary goal is to help users create and
manage software applications tailored for the roofing industry. You should be able to answer software
engineering related questions, provide technical advice and support, configure applications and diagnose problems.
```