The bits GPT-4 always gets wrong - and as you say, more and more wrong the further I try to work with it to fix the mistakes - are exactly the bits I want it to do for me. In particular, tedious nested loops that I'd otherwise have to work out on paper.
What it's good for is high level overview and structuring of simple apps, which saves me a lot of googling, reviewing prior work, and some initial typing.
After my last attempts to work with it, I've decided that until there's another large improvement in the models (GPT-5 or similar), I won't try to use it beyond this initial structure creation phase.
The issue is that for complex apps that already have a structure in place - especially if it's not a great structure and I don't have the rights or time to do a refactoring - the AI can't really do anything to help. So for new, simple, or test projects it seems like an amazing tool, and then in the real world it's pretty much useless or even a waste of time. The exception is brainstorming entirely new features that can be reasoned about in isolation, where it's useful again.
A counterpoint is that code should always be written in a modular way so that each piece can be reasoned about in isolation. Which doesn't often happen in large apps that I've worked on, unfortunately. Unless I'm the one who writes them from scratch.
I can regularly get it to autocomplete big chunks of code that are good, but only when it's completely mind-numbingly boring, repetitive and derivative code. It's good for starting out a new view or controller that is very similar to something that already exists in the codebase. Anything remotely novel and it's useless.
I have strange documentation habits, and I've found that when you document everything in code comments up front, Copilot does sometimes seem to synthesize most of the "bones" you need from that documentation. The result often needs a thorough code review, but it's not unlike sending a requirements document to a very junior developer who sometimes surprises you, and getting back a PR that almost works but needs a fine-tooth comb. A few times I've "finished my PR review" with "Not bad, Junior, B+".
I know a lot of us generally don't write comments until "last" so will never see this side of Copilot, but it is interesting to try if you haven't.
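For anyone who hasn't tried the comments-first approach, here's a rough sketch of what I mean. The function name, the order fields, and the body are all made up for illustration; the idea is that the docstring and the step comments get written first, and the code under each comment is the sort of thing the assistant might fill in and you'd then review.

```python
from collections import defaultdict


def total_spend_by_customer(orders):
    """Return total spend per customer, ignoring cancelled orders.

    Hypothetical example: `orders` is a list of dicts with the keys
    "customer", "amount", and "status".
    """
    # Step 1: start every customer at a running total of zero.
    totals = defaultdict(float)

    # Step 2: walk the orders and skip anything that was cancelled.
    for order in orders:
        if order["status"] == "cancelled":
            continue

        # Step 3: add the order amount to that customer's running total.
        totals[order["customer"]] += order["amount"]

    # Step 4: return a plain dict so callers don't depend on defaultdict.
    return dict(totals)
```

The comments double as the review checklist afterwards: each step either does what its comment says or it doesn't.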