
I just tried some logic puzzles on the Advanced model, and was not impressed. It feels much worse than paid ChatGPT.



Keep in mind that all the common logic puzzles have probably been tried hundreds of times by ChatGPT users and are now part of the training set.


I tried the "should you pull or push a glass door with mirrored writing" puzzle.

I feel there's a huge difference between GPT-4, which seems able to reason logically around the issue and respond with relevant remarks, and Gemini Advanced, which feels a lot more like a stochastic parrot.

Gemini quickly got confused and started talking about "pushing the door towards yourself" and other nonsense. It also couldn't stay on point, and instead regurgitated a lot of irrelevant material.

GPT-4 is not perfect either; you can still find cases where it breaks down.


Maybe, but GPT-4 got these puzzles right at launch.


The graphs in the announcement show it performing worse than GPT-4 on reasoning benchmarks.



