
There's one huge difference between Copilot and stuff like this. Art that's 98% correct is awesome. Code that's 98% correct is completely useless.

I think Copilot is going to live off hype for a while, then tank and be looked back on as a failed experiment. Whereas I think this kind of AI will eventually get to a point where it's extremely useful and could change up certain industries (game assets, marketing materials, etc.).




As a new user of Copilot for the last three months, I can't disagree more. I was initially skeptical, and I have noticed it often produces code that looks good but is wrong. However, it still saves me enough time that a single day's savings pays for the monthly subscription. That's a 30x ROI. I imagine it only gets better from here. I won't go back to programming without it.


Can I ask what kind of programming you do for it to be so helpful?

I mostly do maintenance of legacy codebases (also known as codebases, lol) where a lot of the work is figuring out where the changes need to be made and actually making the changes is frequently just a few lines here and there.

When I do have to figure out how to use some API, it's often not an open source one, so Copilot would not have it in its corpus.

I think these kinds of conditions are really common since software tends to last for maintenance longer than it is in initial greenfield development.

So I'm confused about what kind of work benefits from Copilot. Just pumping out greenfield development of new websites/webapps that don't touch much legacy or closed-source code or services, and that use existing popular open source libraries in commonplace ways?

The other thing I wonder about is code quality. When I look up API docs and stackoverflow examples, I get to read them all, maybe test some examples out in a CLI/REPL, and then decide carefully exactly what to do myself, what special cases to worry about or not, what errors to handle, etc.

Maybe what I end up writing is even the same as Copilot would have written. But in the process, I learn about finer details of the library and make detailed decisions about how to deal with rare edge cases. Might even end up writing a comment calling out a common pitfall I realized might exist.

My question is -- in order to save so much time with Copilot, are you still able to do all this extra thinking and deciding and learning (in cases that warrant it)? Or would doing that just end up consuming most of the time Copilot "saved"?

In other words, do you end up producing code much more rapidly, but at the expense of code that looks like a junior rather than a senior wrote it, because the focus was on getting it to work at all rather than on the finer details? At the expense of not being as deeply familiar with the foibles of the API you're working with?

Honest questions as I haven't tried Copilot, and these are the thoughts that make me imagine it won't be of value. A lot of what I know I learned from doing the parts of the work that Copilot would be automating. Sure, Copilot would save me time when initially writing it. But would I then have less deep knowledge available when there's a fire in production because I never explored the fine details of my dependencies as much?


It's really more like a smarter autocomplete. I haven't tried it on a third-party API yet; we don't use many at work. I work at a startup on a Python and TypeScript codebase. To give an example, last night I was creating a unit test and Copilot filled in the assertions. It missed one it couldn't know about, and it got two wrong. But it was a lot faster. The most amazing case to me was a function that transforms URLs for an image resize service. There was a bug in the function: it needed to return URLs ending in .SVG as-is. I went to fix the bug by typing "if" and Copilot filled in `if url.lower().endswith(".svg"): return url`. It knew about the bug before I did. Too bad it couldn't do a code review when I originally wrote the function.
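Roughly, the kind of helper I mean looks like this (the function name and resize-service URL here are made up for illustration, not our actual code):

    # Hypothetical sketch of the helper described above; the function name
    # and resize-service URL format are invented for illustration.
    def to_resized_url(url: str, width: int) -> str:
        # SVGs scale on their own, so pass them through untouched.
        if url.lower().endswith(".svg"):
            return url
        return f"https://images.example.com/resize?w={width}&src={url}"

That one-line early return is the fix Copilot suggested before I'd even thought it through.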

They have a 60 day free trial. Try it out, it's one of the most interesting changes in developer tools in a while. I feel like I'm living in the future sometimes when using it.


I haven't tried copilot either, but one of the things I'd be curious about is how well it can conform to a company's coding style guidelines and/or match its coding style with the existing legacy code that's being modified.

One of the major annoyances of working as a team with legacy code is when someone forgets to, or deliberately avoids, conforming their code to the style and techniques of the surrounding code. Nothing grinds my gears like working in a 500-line C++ file, where_every_function_uses_underbars, has consistent 4-space indentation, avoids exceptions, and passes by reference, but right in the middle is that functionThatMiltonWrote that uses camelCase, has 8-space indentation, throws exceptions, and passes by pointer.


It works great in our codebase; it uses the current text in the file from above your cursor as reference. So if you're creating a new file, it isn't always perfect, but once it catches on to your style it's seamless.


I haven't tried Copilot in languages that are less opinionated about style, so I'm not sure how well it handles that case. In Python, TypeScript, and Rust it seems to work well. In any case, I use auto-format on save, so that fixes some of the potential formatting issues.


That's an issue but it seems fixable.


It seems like you haven't used Copilot. Yeah, some of the harder bits I may have to code myself, but the amount of boilerplate it eliminates is incredibly liberating.


>Code that's 98% correct is completely useless.

Whoa! Really? Seems like most modern software teams would be thrilled to hit 98%, given the flood of bug fixes and patches shipped right after the massive beta test known as release day.


Code has to be 100% correct; anything less is considered a bug (assuming it's syntactically valid in the first place).

Code that is 98% correct is actually much worse than no code at all. That's the kind of code that introduces subtle, systemic faults and, years later, results in catastrophic failure when the company realizes it's been calculating millions of payments without sales tax, or a clever hacker discovers they can get escalated permissions by sending a well-crafted request.
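A toy sketch of what I mean (the tax rates and names are invented): code that runs fine and passes the obvious tests, but quietly drops sales tax on one path:

    # Toy example, not real tax logic: it passes the happy-path tests,
    # but silently skips sales tax for any state missing from the table.
    def order_total(subtotal: float, state: str) -> float:
        tax_rates = {"CA": 0.0725, "NY": 0.04}
        if state in tax_rates:
            return round(subtotal * (1 + tax_rates[state]), 2)
        # Unknown states fall through untaxed, so a whole class of orders
        # is quietly undercharged until someone finally notices.
        return subtotal

Nothing crashes, every demo works, and the hole sits there for years.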


There's also unintended behavior that shows up in more subtle ways. Ideally code could be formally verified, but that's not practical for most things.


Won't fix. Working as expected. <ticket closed>

That wouldn't be a meme if it were something that didn't happen.


If the 98% correct code works and provides value, then it is better than no code at all, yah?


You're looking at code as if it were a production line. It's a semantic construction.

Code that's 0.1% wrong sends you to the Sun instead of the Moon, or debits your account instead of crediting it...


What's your metric for the "percentage the code is wrong"? Is it how many lines of code were wrong, or how many test cases the code fails?

Presumably, if AI-generated code passes every test case but would fail on edge cases that the human programmers didn't anticipate in their test suite, then those humans might well have made similar mistakes had they written the code themselves.
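A contrived sketch of what I mean by an unanticipated edge case (nothing Copilot-specific, the names are made up):

    # Contrived sketch: split "key=value" config lines.
    def parse_pair(line: str) -> tuple[str, str]:
        key, value = line.split("=")
        return key.strip(), value.strip()

    # Passes the test someone thought to write...
    assert parse_pair("host=localhost") == ("host", "localhost")
    # ...but raises ValueError on parse_pair("token=abc=="), an edge case
    # neither a human author nor an AI was forced to think about.

Whether a human or an AI typed it, the test suite is what missed the case.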


For AI to generate code for a test case we would need AGI.



