Biggest limitation I see in this paper: the framing. Any time you have a lot of proprietary knowledge or you've just sorted out the right solution when it's not readily available from the model's parametric knowledge, that's when you should add a skill. Wrap it in a CLI that's easy to inspect. You don't need to store the whole help text of the skill either. The model can inspect it and its subcommands.
Reality doesn't force us to choose between skill or no skill, reality often doesn't give us a choice. You can either make a skill for your company's proprietary system or your model has to figure it out from scratch every time by searching wikis or reading code. If you use it right, skills are a compression mechanism. Instead of the process meaning your model needs to get all of theses files dynamically, it can simply statically run.
To steel-man the paper. It is worth looking at whether you should try to code something up first or try a skill first. And it may well be valid to say try first and if you can't work it out in 5 mins, install a skill. But there's a meta point of skills as software (where you deduplicate the effort of solving regressions).
For a reductio ad absurdum, If self-generated skills with no additional context _didn't_ eventually level off in performance, then we could reach AGI by making one big skill that keeps growing and solving harder and harder tasks, including improving the capability of its own skill-builder skill, all without embedding any signals from the environment or needing to interface with the real world at all.
I have been considering what it would be like to give each function name a specific color and a color for each variable's type followed by a color derived from the hash of the symbol name and keywords would each be their specific type. And essentially printing a matrix of this, essentially transforming your code into a printable matrix "low-lod" or "mipmap" form. This could be implemented like the VSCode minimap but I the right move here is to implement it as a hook that can modify the output of your agent. That way you can look at the structure of the code without reading the names in particular.
Great idea. As a "visual type" this would be so much more intuitive to decipher. I prefer TUIs over GUI exactly because they're simpler and work hard to focus on the essential. This is low hanging fruit to enhance TUIs.
How did they leak it, jailbreak? Was this confirmed? I am checking for the situation where the true instructions are not what is being reported here. The language model could have "hallucinated" its own system prompt instructions, leaving no guarantee that this is the real deal.
All System Prompts from Anthropic models are public information, released by Anthropic themselves: https://docs.anthropic.com/en/release-notes/system-prompts. I'm unsure (I just skimmed through) to what the differences between this and the publicly released ones are, so they're might be some differences.
This system prompt that was posted interestingly includes the result of the US presidential election in November, even though the model's knowledge cutoff date was October. This info wasn't in the anthropic version of the system prompt.
Asking Claude who won without googling, it does seem to know even though it was later than the cutoff date. So the system prompt being posted is supported at least in this aspect.
> Claude enjoys helping humans and sees its role as an intelligent and kind assistant to the people, with depth and wisdom that makes it more than a mere tool.
Why do they refer to Claude in third person? Why not say "You're Claude and you enjoy helping hoomans"?
LLMs are notoriously bad at dealing with pronouns, because it's not correct to blindly copy them like other nouns, and instead they highly depend on the context.
"she" is absolutely proper English for a ship or boat, with a long history of use continuing into the present day, and many dictionaries also list a definition of "thing, especially machine" or something like that, though for non-ship/boat things the use of "she" is rather less common.
I'm not especially surprised. Surely people who use they/them pronouns are very over-represented in the sample of people using the phrase "I use ___ pronouns".
On the other hand, Claude presumably does have a model of the fact of not being an organic entity, from which it could presumably infer that it lacks a gender.
...But that wasn't the point. Inflecting words for gender doesn't seem to me like it would be difficult for an LLM. GP was saying that swapping "I" for "you" etc. depending on perspective would be difficult, and I think that is probably more difficult than inflecting words for gender. Especially if the training data includes lots of text in Romance languages.
From their perspective they don't really know who put the tokens there. They just caculated the probabilities and then the inference engine adds tokens to the context window. Same with user and system prompt, they just appear in the context window and the LLM just gets "user said: 'hello', assistant said: 'how can I help '" and it just calculates the probabilities of the next token. If the context window had stopped in the user role it would have played the user role (calculated the probabilities for the next token of the user).
On one machine I run a LLM locally with ollama and a web interface (forgot the name) that allows me to edit the conversation. The LLM was prompted to behave as a therapist and for some reason also role played it's actions like "(I slowly pick up my pen and make a note of it)".
I changed it to things like "(I slowly pick up a knife and show it to the client)" and then just confront it it like "Whoa why are you threatening me!?", the LLM really tries hard to stay in it's role and then tells things like it did it on purpose to provoke a fear response to then discuss the fears.
Interestingly you can also (of course) ask them to complete for System role prompts. Most models I have tried this with seem to have a bit of an confused idea about the exact style of those and the replies are often a kind of an mixture of the User and Assistant style messages.
Yeah, the algorithm is a nameless, ego-less make-document-longer machine, and you're trying to set up a new document which will be embiggened in a certain direction. The document is just one stream of data with no real differentiation of who-put-it-there, even if the form of the document is a dialogue or a movie-script between characters.
> Why do they refer to Claude in third person? Why not say "You're Claude and you enjoy helping hoomans"?
But why would they say that? To me that seems a bit childish. Like, say, when writing a script do people say "You're the program, take this var. You give me the matrix"? That would look goofy.
The other day I was talking to Grok, and then suddenly it started outputting corrupt tokens, after which it outputted the entire system prompt. I didn't ask for it.
There truly are a million ways for LLMs to leak their system prompt.
I didn't save the conversation but one of the things that stood out was a long list of bullets saying that Grok doesn't know anything about x/AI pricing or product details, tell user to go x/AI website rather than making things up. This section seems to be longer than the section that defines what Grok is.
Doctors aren't machines, they're humans. I have not yet read the full paper, only the article, but I already see something really big and important to look out for. When I read the full thing, the question I'll be asking is "what's the likelihood that the self-esteem of doctors was directly intervened on by the exam taking process itself." How do you control for the loss in confidence that learning of your test performance gives you? How are we certain that learning your score on the board exam doesn't make you more conservative (or riskier) with how you treat patients as a psychological effect?
This appears to be an observational result, so I'm genuinely perplexed by the reception here. I genuinely thought this comment shows a healthy amount of curiosity and asking important questions. Asking "what control group did this study use?" is usually well-received here.
Yeah but the patient is just a biological machine. This machine can easily be divided into organs and apportioned among specialists. The machine is easily understood by a corpus of research and laboratory experimentation.
. Many inputs can be placed in the machine by physicians, and the outputs are known. The biological machines can easily be isolated from environment, or monitored with high technology, and assigned numbers in databases to be processed in data centers.
Value is extracted from the biological machines mostly from government and 3rd party sources, so there is no real need to rely on machines having a means or will of their own.
There is no compelling reason to treat humans any different from automobiles for the purposes of medicine and medical treatment. In fact humans are less genetically diverse than motor vehicles, and A new model year will always produce a bumper crop of lemons to work on.
The common misconception of someone with a hard science education.
> Many inputs can be placed in the machine by physicians, and the outputs are known. The biological machines can easily be isolated from environment, or monitored with high technology, and assigned numbers in databases to be processed in data centers.
We aren't even close to that level of understanding.
And still, the model works. Lives are saved. We might save many more with a fully integrated non-simplified approach, but it’s not necessary to keep seeing growth in positive outcomes.
We could probably fine-tune a tiny convolutional neutral net image classifier and just hold on the last good frames for longer to cover the frames with clear trolling and nsfw images.
I think it would be subtractive. To my eye, it seems like the point of the project was to celebrate the free expression of the crowd. The lack of censorship and filtering is core to the purpose of the project. If you start filtering out individual contributions in order to more accurately reconstruct the original movie, then I don’t really get why you wouldn’t just pirate the movie.
I'm surprised no one has mentioned atomic Linux distros yet in this thread. The really hard thing here is that people aren't all talking about the same things. My experience on Arch isn't my experience on Pop. Things on Pop are amazing on my MSI rebuilt PC with Nvidia GPU. I don't even know if I really need to upgrade to NixOS except to satiate my curiosity.
Reality doesn't force us to choose between skill or no skill, reality often doesn't give us a choice. You can either make a skill for your company's proprietary system or your model has to figure it out from scratch every time by searching wikis or reading code. If you use it right, skills are a compression mechanism. Instead of the process meaning your model needs to get all of theses files dynamically, it can simply statically run.
To steel-man the paper. It is worth looking at whether you should try to code something up first or try a skill first. And it may well be valid to say try first and if you can't work it out in 5 mins, install a skill. But there's a meta point of skills as software (where you deduplicate the effort of solving regressions).
For a reductio ad absurdum, If self-generated skills with no additional context _didn't_ eventually level off in performance, then we could reach AGI by making one big skill that keeps growing and solving harder and harder tasks, including improving the capability of its own skill-builder skill, all without embedding any signals from the environment or needing to interface with the real world at all.