Sometimes I wonder if LLM proponents even understand their own bullshit.

It's all just tokens in the context window, right? Aren't system prompts just tokens that stay prepended to the front of a conversation?

They're going to keep dressing this up six different ways to Sunday but it's always just going to be stochastic token prediction.

System prompts don't even have to be prepended to the front of the conversation. In many models they are delimited by special reserved tokens, so the token stream looks a bit like:

  <system-prompt-starts>
  translate to English
  <system-prompt-ends>
  An explanation of dogs: ...

The models are then trained to (hopefully) treat the system-prompt-delimited tokens as more influential on how the rest of the input is treated.
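
As a rough illustration, here's how a chat template might serialize that. This is only a sketch: the delimiter strings are invented for the example, while real models reserve their own special tokens (e.g. ChatML-style <|im_start|>/<|im_end|>).

  # Sketch of a chat template; delimiters are made up for this example.
  def render_prompt(system: str, user: str) -> str:
      return (
          "<system-prompt-starts>\n"
          f"{system}\n"
          "<system-prompt-ends>\n"
          f"{user}"
      )

  print(render_prompt("translate to English", "An explanation of dogs: ..."))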

> The models are then trained to (hopefully) treat the system-prompt-delimited tokens as more influential on how the rest of the input is treated.

I can't find any study that compares putting the same initial prompt in the system role versus in the user role. It's probably just position bias: models follow the earliest input best, regardless of whether it arrives as a system prompt or a user prompt.
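
For what it's worth, the comparison is easy to run yourself. A minimal sketch using the OpenAI Python SDK; the model name, instruction, and question are placeholders, and you'd need many prompt pairs plus a scoring rubric before concluding anything about position bias:

  # Hypothetical mini-experiment: same instruction, system vs user role.
  from openai import OpenAI

  client = OpenAI()
  MODEL = "gpt-4o"  # placeholder model name
  instruction = "Answer in exactly one sentence."
  question = "Why is the sky blue?"

  as_system = client.chat.completions.create(
      model=MODEL,
      messages=[
          {"role": "system", "content": instruction},
          {"role": "user", "content": question},
      ],
  )
  as_user = client.chat.completions.create(
      model=MODEL,
      messages=[
          {"role": "user", "content": f"{instruction}\n\n{question}"},
      ],
  )
  # Score instruction-following for each variant across many prompts;
  # near-identical results would point to position bias, not the role.
  print(as_system.choices[0].message.content)
  print(as_user.choices[0].message.content)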


Yep, every AI call is essentially just asking it to predict the next token after:

  <system>
  You are a helpful assistant.
  </system>
  <user>
  Why is the sky blue?
  </user>
  <assistant>
  Because of Rayleigh scattering. Blue light scatters more strongly.
  </assistant>
  <user>
  Why is it red at sunset then?
  </user>
  <assistant>
And we keep repeating that until the next token is `</assistant>`, then extract the bit between the last assistant tags and return it. The AI has been trained to treat `<user>` differently from `<system>`, but they're not physically different.
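
A toy version of that loop, where `model` and `tokenizer` are hypothetical stand-ins for a real inference stack:

  # Toy greedy decode loop; `model` and `tokenizer` are hypothetical.
  def generate(prompt_tokens, model, tokenizer, max_new_tokens=256):
      tokens = list(prompt_tokens)
      for _ in range(max_new_tokens):
          next_token = model.predict_next(tokens)  # most likely next token
          if next_token == tokenizer.stop_token:   # the `</assistant>` tag
              break                                # the turn is done
          tokens.append(next_token)
      # Return only the newly generated assistant text.
      return tokenizer.decode(tokens[len(prompt_tokens):])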

It's all prompt; it can all be engineered. Hell, you can even get a long way by pre-filling the start of the Assistant response. Usually works better than a system message. That's prompt engineering too.
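
One sketch of that prefill trick uses Anthropic's Messages API, which lets you end the message list with a partial assistant turn that the model continues (the model name here may be out of date, and the prompt is just an example):

  # Prefilling the assistant turn to force JSON output.
  import anthropic

  client = anthropic.Anthropic()
  response = client.messages.create(
      model="claude-3-5-sonnet-20241022",  # may need updating
      max_tokens=200,
      messages=[
          {"role": "user", "content": "List three dog facts as a JSON array."},
          # The model continues from this prefix, so output starts mid-JSON.
          {"role": "assistant", "content": "["},
      ],
  )
  print("[" + response.content[0].text)

Because the prefill is just more tokens in the Assistant slot, the model has no choice but to continue from it, which is often more reliable than asking for a format in the system message.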


Yeah, ultimately it's a Make Document Longer machine, and in many cases it's a hidden Mad Libs script behind the scenes, where your question becomes "Next the User said", and some regular code is looking for "Next the Computer said" and "performing" it at you.

In other words, there's a deliberate illusion going on where we are encouraged to believe that generating a document about a character is the same as that character being a real entity.


This is why I enjoy calling AI "autocomplete" when people make big claims about it - because that's where it came from and exactly what it is.

AI is not autocomplete. LLMs are autocomplete.

Yes. That's what I meant.

Depending on what you mean exactly, "autocomplete" and big claims are not mutually exclusive.

> Sometimes I wonder if LLM proponents even understand their own bullshit.

Categorically, no. Most are not software engineers; in fact, most are not engineers of any sort. A whole lot of them are marketers, the same kinds of people who pumped crypto way back.

LLMs have uses. Machine learning has a ton of uses. AI art is shit, LLM writing is boring, code generation and debugging is pretty cool, information digestion is a godsend some days when I simply cannot make my brain engage with whatever I must understand.

As with most things, it's about choosing the right tool for the right task, and the AI hype folks are carpenters with a brand new, shiny hammer who are gonna turn every fuckin problem they can find into a nail.

Also for the love of god do not have ChatGPT draft text messages to your spouse, genuinely what the hell is wrong with you?


Leaving the “g” off the f word made me re-read this in Fat Tony’s voice. It was an awesome touch.

“It’s all just tokens in the context window” = “it’s all just fundamental particles,” I think. True, but reductive. Seems key that the author is talking about agentic AI, not just chat. I’d revisit the email example in the post.


