I've had exactly the opposite experience with generating idiomatic code. I find that the models have a lot of information about the standard idioms of a particular language. If I have to write in a language I'm new to, I find it very useful to have the LLM do an idiomatic rewrite. I learn a lot, and it helps me get up to speed more quickly.
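To illustrate the kind of rewrite I mean (a made-up Python example, not actual model output), a newcomer might write:

    result = []
    for i in range(len(names)):
        if len(names[i]) > 3:
            result.append(names[i].upper())

and the model will typically suggest the idiomatic comprehension instead:

    result = [name.upper() for name in names if len(name) > 3]

Seeing that kind of transformation repeatedly is exactly how you pick up a language's idioms.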
I wonder if there is a big disconnect partly due to the fact that people are talking about different models. The top-tier coding models (Sonnet, o1, DeepSeek) are all pretty good, but using them requires paid access, or roughly 400 GB of local memory to run DeepSeek yourself.
All the distilled models, Qwen Coder, and similar are a large step below those models on most benchmarks. Someone running a small 20 GB model locally will not have the same experience as someone running the top-of-the-line models.
The top-of-the-line models are really cheap, though. Getting an Anthropic key and $5 of credit costs you exactly that, and at Sonnet's per-token API pricing a typical prompt costs a fraction of a cent, so $5 buys you hundreds of prompts.
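And trying it is only a few lines of Python with the official anthropic SDK (a minimal sketch; the model alias and the prompt are just placeholders):

    import anthropic

    # Reads ANTHROPIC_API_KEY from the environment by default
    client = anthropic.Anthropic()

    message = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder; use whatever Sonnet version is current
        max_tokens=1024,
        messages=[{"role": "user",
                   "content": "Rewrite this function in idiomatic style: ..."}],
    )
    print(message.content[0].text)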