After using the API quite heavily, I believe both arguments (expectations changed vs. ChatGPT changed) are likely correct.
As with any technology, we should expect the novelty to wear off and the rough edges to become more apparent. People's expectations have changed.
On the flip side, the ChatGPT interface has also changed (regardless of the underlying model). Any context you add to an LLM prompt will steer the LLM's output, for better or for worse.
We know for a fact ChatGPT uses a different/additional prompt to the API, as ChatGPT always knows the current date. This changed in the ChatGPT interface around May 13th, roughly when OpenAI claimed the model was identical. The addition of Plugins/web browsing around that time also made it easier to pollute your prompt, if either were enabled.
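To make the point concrete, here's a rough sketch of what that hidden prompt difference looks like from the API side. The exact wording of ChatGPT's system prompt is an assumption (OpenAI hasn't published it); the structure of the `messages` payload is the real Chat Completions format.

```python
from datetime import date

def build_messages(user_prompt: str) -> list:
    """Prepend a ChatGPT-style system message that pins the current date.

    The system text below is a guess at ChatGPT's hidden prompt, not the
    real thing -- the point is that API callers who send only a user
    message get a differently-steered model than ChatGPT users do.
    """
    system = (
        "You are ChatGPT, a large language model trained by OpenAI. "
        f"Current date: {date.today().isoformat()}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("What day is it?")
# This list is what you'd pass as the `messages` argument to
# openai.ChatCompletion.create(model="gpt-4", messages=messages)
```

A bare API call omits that system message entirely, which by itself is enough to produce different outputs than the ChatGPT interface.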
We also know that ChatGPT is running on different infra (based on latency diffs), so even if it’s the same model, it’s possible it’s configured ever so differently.
And finally, we also know there’s a new model (woohoo function calling!).
As an API user, my personal experience with GPT-4 is very similar to when it first came out. The hype was very high (AGI in months!) and the reality has been quite different.
GPT-4 is an amazing mirror of society, but it’s only worth what you put in.