I don't find this all that viable. There are so many ways to express the same question, and context matters: the same prompt is effectively a different cache entry if the preceding prompts or LLM responses differ.
With the cache limited to the same organization, the chances of it actually being reused would be extremely low.
In a chat setting you hit the cache every time you add a new prompt: all historical question/answer pairs are part of the context and don’t need to be prefilled again.
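A rough sketch of why that works, using the OpenAI Python SDK (the model name is illustrative, and whether `cached_tokens` is populated depends on the model and prompt length, so treat the details as an assumption):

```python
# Sketch: in a chat loop the growing message history is a stable prefix,
# so later turns can reuse the cached prefill from earlier turns.
# Assumes the official `openai` Python SDK and an API key in the environment.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "system", "content": "You are a helpful assistant."}]

for user_prompt in ["First question ...", "Follow-up question ..."]:
    messages.append({"role": "user", "content": user_prompt})
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    messages.append({"role": "assistant", "content": resp.choices[0].message.content})

    # On later turns, usage.prompt_tokens_details.cached_tokens should become
    # nonzero once the shared prefix exceeds the minimum cacheable length.
    details = resp.usage.prompt_tokens_details
    print(details.cached_tokens if details else None)
```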
On the API side, imagine you're doing document processing and have a 50k-token instruction prompt that you reuse for every document.
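Same idea on the API side, sketched under the assumption that the long instruction block sits first in every request so it forms an identical, cacheable prefix (names and model are illustrative, not from the docs):

```python
# Sketch: keep the long, reusable instruction block at the start of every
# request so each call shares the same prefix and only the document varies.
# Assumes the official `openai` Python SDK; INSTRUCTIONS and process() are hypothetical.
from openai import OpenAI

client = OpenAI()

INSTRUCTIONS = "..."  # the ~50k-token instruction prompt, identical for every document

def process(document_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": INSTRUCTIONS},  # stable prefix -> cache hit
            {"role": "user", "content": document_text},   # varying suffix
        ],
    )
    return resp.choices[0].message.content
```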
> Prompt caches are not shared between organizations. Only members of the same organization can access caches of identical prompts.
https://platform.openai.com/docs/guides/prompt-caching#frequ...