Hacker News | eigenblake's comments

Biggest limitation I see in this paper is the framing. Any time you have a lot of proprietary knowledge, or you've just worked out the right solution and it isn't readily available from the model's parametric knowledge, that's when you should add a skill. Wrap it in a CLI that's easy to inspect. You don't need to store the skill's whole help text either; the model can inspect it and its subcommands on demand.

Reality doesn't force a choice between skill and no skill; often it doesn't give us a choice at all. Either you make a skill for your company's proprietary system, or your model has to figure it out from scratch every time by searching wikis and reading code. Used well, skills are a compression mechanism: instead of the model having to fetch all of these files dynamically every run, the process can simply run statically.
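A minimal sketch of the "wrap it in a CLI" idea, using argparse. The skill name, subcommands, and flags here are all made up for illustration; the point is that the model can run `--help` on the tool and its subcommands instead of carrying the full help text in context.

```python
import argparse

# Hypothetical "skill" wrapped as a CLI: the agent can inspect it via
# `myskill --help` or `myskill deploy --help` rather than storing the
# whole help text. Everything below is illustrative, not a real tool.
def build_parser():
    parser = argparse.ArgumentParser(
        prog="myskill",
        description="Example proprietary-workflow skill (illustrative).")
    sub = parser.add_subparsers(dest="command", required=True)

    deploy = sub.add_parser("deploy", help="Deploy the service.")
    deploy.add_argument("--env", choices=["staging", "prod"],
                        default="staging")

    sub.add_parser("status", help="Show deployment status.")
    return parser

parser = build_parser()
print(parser.parse_args(["deploy", "--env", "prod"]).env)  # -> prod
```

The help text stays inside the tool, so the context window only pays for it when the model actually asks.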

To steel-man the paper: it is worth examining whether you should try to code something up first or reach for a skill first. It may well be valid to say try first, and if you can't work it out in five minutes, install a skill. But there's a meta point about skills as software (where you deduplicate the effort of solving regressions).

For a reductio ad absurdum: if self-generated skills with no additional context _didn't_ eventually level off in performance, then we could reach AGI by making one big skill that keeps growing and solving harder and harder tasks, including improving the capability of its own skill-builder skill, all without embedding any signals from the environment or needing to interface with the real world at all.


I have been considering what it would be like to give each function name a specific color, give each variable a color for its type followed by a color derived from the hash of the symbol name, and give keywords their own fixed colors, then print a matrix of this: essentially transforming your code into a printable "low-LOD" or "mipmap" form. This could be implemented like the VSCode minimap, but I think the right move here is to implement it as a hook that can modify the output of your agent. That way you can look at the structure of the code without reading the names in particular.
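A rough sketch of the hash-to-color part, assuming one color cell per token; the keyword set, the fixed keyword color, and the whitespace tokenizer are all stand-ins for whatever a real implementation would use.

```python
import hashlib

# Illustrative fixed color for keywords; a real scheme would pick
# distinct colors per keyword or per syntactic category.
KEYWORD_COLOR = (200, 200, 200)
KEYWORDS = {"def", "return", "if", "else", "for", "while"}

def symbol_color(name):
    # Same symbol name -> same color everywhere, via a stable hash.
    if name in KEYWORDS:
        return KEYWORD_COLOR
    digest = hashlib.sha256(name.encode()).digest()
    return (digest[0], digest[1], digest[2])  # first 3 bytes as RGB

def code_to_color_row(line):
    # One "low-LOD" row of the matrix: a color per token.
    # Naive whitespace split; a real version would use a tokenizer.
    return [symbol_color(tok) for tok in line.split()]
```

Stacking `code_to_color_row` over every line of a file gives the printable matrix; you could render it as an image or as ANSI background colors in a TUI.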


Great idea. As a "visual type" this would be so much more intuitive to decipher. I prefer TUIs over GUIs exactly because they're simpler and work hard to focus on the essential. This is low-hanging fruit for enhancing TUIs.


bd35a7f69b28c97fb3ebe489a4fba26a5f423522276d5ff5b5a8bb6441806ad2


How did they leak it, a jailbreak? Was this confirmed? I am checking for the situation where the true instructions are not what is being reported here. The language model could have "hallucinated" its own system prompt, leaving no guarantee that this is the real deal.


All system prompts for Anthropic models are public information, released by Anthropic themselves: https://docs.anthropic.com/en/release-notes/system-prompts. I'm unsure (I only skimmed) what the differences between this and the publicly released ones are, so there might be some.


Interestingly, the system prompt that was posted includes the result of the US presidential election in November, even though the model's knowledge cutoff was October. That info wasn't in Anthropic's version of the system prompt.

Asking Claude who won, without letting it search, it does seem to know, even though that's later than the cutoff date. So the posted system prompt is corroborated at least on this point.


For anybody curious, I asked it this exact question: https://claude.ai/share/ea4aa490-e29e-45a1-b157-9acf56eb7f8a

edit: fixed link


The conversation you were looking for could not be found.


oops, fixed


> The assistant is Claude, created by Anthropic.

> The current date is {{currentDateTime}}.

> Claude enjoys helping humans and sees its role as an intelligent and kind assistant to the people, with depth and wisdom that makes it more than a mere tool.

Why do they refer to Claude in third person? Why not say "You're Claude and you enjoy helping hoomans"?


LLMs are notoriously bad at dealing with pronouns, because unlike other nouns you can't blindly copy them; they depend heavily on context.


[flagged]


'It' is obviously the correct pronoun.


There's enough disagreement among native English speakers that you can't really say any pronoun is the obviously correct one for an AI.


"What color is the car? It is red."

"It" is unambiguously the correct pronoun to use for a car. I'd really challenge you to find a native English speaker who would think otherwise.

I would argue a computer program is no different than a car.


People often refer to their car and other people's as "she" ("she's a beauty"), so you're obviously wrong.


But no one who does that thinks they're using proper English!


"She" is absolutely proper English for a ship or boat, with a long history of use continuing into the present day. Many dictionaries also list a definition along the lines of "thing, especially a machine", though for non-ship/boat things the use of "she" is rather less common.


You’re not aligned bro. Get with the program.


I'm not especially surprised. Surely people who use they/them pronouns are very over-represented in the sample of people using the phrase "I use ___ pronouns".

On the other hand, Claude presumably does have a model of the fact of not being an organic entity, from which it could presumably infer that it lacks a gender.

...But that wasn't the point. Inflecting words for gender doesn't seem to me like it would be difficult for an LLM. GP was saying that swapping "I" for "you" etc. depending on perspective would be difficult, and I think that is probably more difficult than inflecting words for gender. Especially if the training data includes lots of text in Romance languages.


LLMs don’t seem to have much notion of themselves as a first-person subject, in my limited experience of trying to engage one on it.


From their perspective they don't really know who put the tokens there. They just calculated the probabilities and then the inference engine added the tokens to the context window. Same with the user and system prompts: they just appear in the context window, and the LLM gets "user said: 'hello', assistant said: 'how can I help'" and calculates the probabilities of the next token. If the context window had stopped in the user role it would have played the user role (calculated the probabilities for the next token of the user).
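A rough illustration of the point: from the model's side, the "conversation" is one flat stream with no record of who put what there. The role markers below are made up; real models use special tokens and per-model chat templates.

```python
# Illustrative only: real chat templates use special tokens,
# not these literal "role:" prefixes.
def flatten_conversation(turns):
    # The LLM only ever sees this single stream; the inference
    # engine appends whatever tokens the model predicts next.
    parts = [f"{role}: {text}" for role, text in turns]
    parts.append("assistant:")  # cue the model to continue this role
    return "\n".join(parts)

stream = flatten_conversation([
    ("system", "You are a helpful assistant."),
    ("user", "hello"),
])
print(stream)
```

If the stream instead ended with `user:`, the same next-token machinery would happily "play" the user.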


> If the context window had stopped in the user role it would have played the user role (calculated the probabilities for the next token of the user).

I wonder which user queries the LLM would come up with.


On one machine I run an LLM locally with ollama and a web interface (forgot the name) that lets me edit the conversation. The LLM was prompted to behave as a therapist, and for some reason it also role-played its actions, like "(I slowly pick up my pen and make a note of it)".

I changed those to things like "(I slowly pick up a knife and show it to the client)" and then confronted it, like "Whoa, why are you threatening me!?" The LLM tries really hard to stay in its role and then claims it did it on purpose to provoke a fear response so it could discuss the fears.


Interestingly, you can also (of course) ask them to complete System-role prompts. Most models I have tried this with seem to have a confused idea of the exact style of those, and the replies are often a kind of mixture of the User- and Assistant-style messages.


Yeah, the algorithm is a nameless, ego-less make-document-longer machine, and you're trying to set up a new document which will be embiggened in a certain direction. The document is just one stream of data with no real differentiation of who-put-it-there, even if the form of the document is a dialogue or a movie-script between characters.


I don’t know but I imagine they’ve tried both and settled on that one.


Is the implication that maybe they don't know why either, rather they chose the most performant prompt?


LLM chatbots essentially autocomplete a discussion in the form

    [user]: blah blah
    [claude]: blah
    [user]: blah blah blah
    [claude]: _____
One could also do the "you blah blah" thing before, but maybe third person in this context is more clear for the model.


> Why do they refer to Claude in third person? Why not say "You're Claude and you enjoy helping hoomans"?

But why would they say that? To me that seems a bit childish. Like, say, when writing a script do people say "You're the program, take this var. You give me the matrix"? That would look goofy.


"It puts the lotion on the skin, or it gets the hose again"


Why would they refer to Claude in second person?


> The language model could have "hallucinated" its own system prompt instructions, leaving no guarantee that this is the real deal.

How would you detect this? I always wonder about this when I see a 'jail break' or similar for LLM...


In this case it’s easy: get the model to output its own system prompt and then compare to the published (authoritative) version.

The actual system prompt, the “public” version, and whatever the model outputs could all be fairly different from each other though.
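One low-tech way to do that comparison is a unified diff of the two texts; the prompt strings below are placeholders, not the real documents.

```python
import difflib

def prompt_diff(published, leaked):
    # Line-by-line unified diff between the published system prompt
    # and whatever the model output; identical texts yield no lines.
    return list(difflib.unified_diff(
        published.splitlines(), leaked.splitlines(),
        fromfile="published", tofile="leaked", lineterm=""))

# Placeholder texts for illustration:
published = "The assistant is Claude, created by Anthropic."
leaked = "The assistant is Claude, created by Anthropic."
print(prompt_diff(published, leaked))  # -> [] (no differences)
```

Of course this only tells you how the model's output differs from the published version, not which of the three (actual, published, output) is the ground truth.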


The other day I was talking to Grok, and then suddenly it started outputting corrupt tokens, after which it outputted the entire system prompt. I didn't ask for it.

There truly are a million ways for LLMs to leak their system prompt.


What did it say?


I didn't save the conversation, but one thing that stood out was a long list of bullets saying that Grok doesn't know anything about x/AI pricing or product details and should tell the user to go to the x/AI website rather than making things up. That section seemed longer than the section that defines what Grok is.

Nothing about tool calling.


What's so special about this? Homo sapiens have been doing this for hundreds of thousands of years /s


Doctors aren't machines, they're humans. I have not yet read the full paper, only the article, but I already see something big and important to look out for. When I read the full thing, the question I'll be asking is: what's the likelihood that the exam-taking process itself directly intervened on the doctors' self-esteem? How do you control for the loss in confidence that learning of your test performance gives you? How are we certain that learning your score on the board exam doesn't make you more conservative (or riskier) in how you treat patients, as a psychological effect?


This appears to be an observational result, so I'm genuinely perplexed by the reception here. I thought this comment showed a healthy amount of curiosity and asked important questions. Asking "what control group did this study use?" is usually well received here.


Yeah but the patient is just a biological machine. This machine can easily be divided into organs and apportioned among specialists. The machine is easily understood by a corpus of research and laboratory experimentation.

Many inputs can be placed in the machine by physicians, and the outputs are known. The biological machines can easily be isolated from the environment, or monitored with high technology, and assigned numbers in databases to be processed in data centers.

Value is extracted from the biological machines mostly from government and 3rd party sources, so there is no real need to rely on machines having a means or will of their own.

There is no compelling reason to treat humans any differently from automobiles for the purposes of medicine and medical treatment. In fact humans are less genetically diverse than motor vehicles, and a new model year will always produce a bumper crop of lemons to work on.


The common misconception of someone with a hard science education.

> Many inputs can be placed in the machine by physicians, and the outputs are known. The biological machines can easily be isolated from environment, or monitored with high technology, and assigned numbers in databases to be processed in data centers.

We aren't even close to that level of understanding.


And still, the model works. Lives are saved. We might save many more with a fully integrated, non-simplified approach, but that isn't necessary to keep seeing growth in positive outcomes.


Soon they will be!


loss of confidence? lol what?


We could probably fine-tune a tiny convolutional neural net image classifier and just hold the last good frame for longer to cover the frames with clear trolling and NSFW images.
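A hedged sketch of the hold-last-good-frame part; `is_acceptable` stands in for whatever classifier (e.g. the small fine-tuned CNN) would actually make the call.

```python
def filter_frames(frames, is_acceptable):
    # Replace each rejected frame with the most recent accepted one,
    # so troll/NSFW frames are covered by a held "last good" frame.
    cleaned, last_good = [], None
    for frame in frames:
        if is_acceptable(frame):
            last_good = frame
        cleaned.append(last_good if last_good is not None else frame)
    return cleaned
```

Note that leading bad frames pass through unchanged here, since no earlier good frame exists; a real pipeline might blank or drop those instead.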


I think that would miss the point


No, that point was already made. This will be a new point unlike the previous point.


But i like the previous point. Why would you take it from me?


It's not taking away, a new point is by definition adding. Just like The Free Movie added to The Bee Movie.


I think it would be subtractive. To my eye, it seems like the point of the project was to celebrate the free expression of the crowd. The lack of censorship and filtering is core to the purpose of the project. If you start filtering out individual contributions in order to more accurately reconstruct the original movie, then I don’t really get why you wouldn’t just pirate the movie.



I'm surprised no one has mentioned atomic Linux distros yet in this thread. The really hard thing here is that people aren't all talking about the same things: my experience on Arch isn't my experience on Pop. Things on Pop are amazing on my rebuilt MSI PC with an Nvidia GPU. I don't even know if I really need to upgrade to NixOS, except to satiate my curiosity.


Absolutely yes, but not me personally. The keywords to search for are Plover and Stenotype: https://youtu.be/jRFKZGWrmrM

