You can fix the people-pleasing mode thing by simply adding the words "be critical" to your prompt.
As for 4.5... I've been playing around with it all day, and as far as I can tell it's objectively worse than o3-mini-high and DeepSeek-R1. It's less imaginative, doesn't reason as well, doesn't code as well as o3-mini, doesn't write nearly as well as R1, its book and product recommendations are far more mainstream/normie, and all in all it's totally unimpressive.
Frankly, I don't know why OpenAI released it in this form, to people who already have access to o3-mini, o1-Pro, and Deep Research -- all of which are better tools.
Hmm. I’m on the other side of this. It feels like what I imagined a scaled-up GPT-4 would be: more nuanced and thoughtful. It did the best of any model at my “write an essay as if Hemingway went along with RFK Jr. when he left the bear in Central Park” prompt. (The actual prompt was longer.) This is a hard task because Hemingway’s prose is extremely multilayered, and his perspective and physical engagement are notable as well.
I’d say 4.5 is by far the best of the released models at this. It’s probably the only one that thought through both the skepticism and connection Hemingway might have brought along that day and the combination of alienation, posing, and privilege RFK had. I just retried DeepSeek on it: the language is good to very good. Theory of mind, not as much.
Edit: Grok 3 is also pretty good. Maybe a bit too wordy still, and maybe a little less insightful.
What was your actual prompt? I just asked it for that Hemingway story and the result didn't impress me -- it had none of the social nuance you mentioned.