I've had this same problem using ChatGPT with German. Even for basic German, hallucinations can be unexpected and problematic. (I don't recall the model, but it was a recent one.)
In one instance, I was having it correct Akkusativ/Dativ/Nominativ sentences, and it would say a sentence was in one case when I knew it was in another. I'd ask ChatGPT if it was sure, and it would change its answer. If pressed further, it would change its answer again.
I was originally quite excited about using an LLM for my language practice, but now I'm pretty cautious with it.
It is also why I'm very skeptical of AI-based language learning apps, especially if the creator is not a native speaker.
Would agentic workflows come in handy in these cases? I mean adding a controller agent after the sentence is generated, one that could search the web or query a database or personal notes to verify that everything is correct.
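Something like this rough sketch is what I have in mind; it's purely illustrative, with `call_llm` standing in for whatever model API you'd actually use and the notes lookup standing in for a real database or web search, so the answer is only trusted when an external reference confirms it:

```python
# Minimal sketch of a "verifier agent" loop. call_llm() and the notes lookup
# are hypothetical placeholders, not any real library's API.

GOLD_CASE_NOTES = {
    # Stand-in for personal notes or a curated sentence bank:
    # sentence -> case the learner has already verified.
    "Ich sehe den Hund.": "Akkusativ",
    "Ich helfe dem Hund.": "Dativ",
}

def call_llm(prompt: str) -> str:
    """Placeholder for an actual chat-model call (e.g. via an API client)."""
    # In a real setup this would send `prompt` to the model and return its reply.
    return "Dativ"  # deliberately wrong here, to show the verifier catching it

def lookup_case(sentence: str) -> str | None:
    """Consult the trusted reference before accepting the model's answer."""
    return GOLD_CASE_NOTES.get(sentence)

def check_case(sentence: str) -> dict:
    model_answer = call_llm(f"Which case is this German sentence in? {sentence}")
    trusted = lookup_case(sentence)
    if trusted is None:
        # No external reference available: flag as unverified rather than
        # trusting the model outright.
        return {"sentence": sentence, "case": model_answer, "verified": False}
    return {
        "sentence": sentence,
        "case": trusted,
        "verified": True,
        "model_agreed": model_answer == trusted,
    }

if __name__ == "__main__":
    print(check_case("Ich sehe den Hund."))
    # -> case 'Akkusativ', verified True, model_agreed False
```

The obvious catch is that the verifier is only as good as its reference source, so for grammar the "database" would probably need to be a rule-based parser or a curated sentence bank rather than another LLM.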