
Yeah, it doesn’t fix the issue at all. Rough to have a security product demo be fundamentally insecure.



I'll respond to this comment to provide a general response for all of the sub-comments here.

As I highlighted in my post, LLMs are generally still not in a position to replace a developer for more complex tasks and refactoring. We're in the early days of the technology, but we have seen extremely strong improvements in it over the last year. We on the team have QA'd thousands of results for public and private repositories. The private ones are particularly interesting because the LLMs do not have that code in their corpus, and we have still seen very strong fix results.

Most people just assume we're a wrapper around an LLM, but a lot has to happen under the hood to ensure that fixes are secure and correct. Here are the standards we're setting for fix quality:

- The fix needs to be best-practice and complete. A partial security fix isn't a security fix. This is something we're constantly working on.
- Supporting the widest coverage of CWEs.
- Not introducing any breaking changes in the rest of the code.
- Understanding the language, the framework being used, and any specific packages. For example, fixing a CSRF issue in Django is different than in Flask. Both are Python frameworks, but they approach it differently.
- Reusing existing packages correctly to improve security, and adding a package in a standard way when one is needed.
- Placing imports in the correct part of the file.
- Not using deprecated or risky packages.
- Avoiding LLM hallucinations.
- Ensuring syntax and formatting are correct.
- Following the coding and naming conventions in the file being fixed.
- Making sure fixes are consistent within the same issue type.
- Explaining the fix properly and clearly so that someone can understand it.
- Avoiding assumptions that could cause problems.
- Not removing any code that is not part of the issue.

Our goal is to get to 90-95% accuracy in fixes this year, and we're on a trajectory to do that. I will be the first to say 100% accuracy is impossible, and our goal is to get it right more often than engineers would.

We take fix quality and transparency extremely seriously. We'll be publishing a whitepaper showing the accuracy of our results because it's the right thing to do. I hope this helps.


LLMs writing code are fundamentally insecure. This product is completely batshit insane and I'd fire any vendor I knew used it.


Agreed that there’s no way to do this meaningfully and securely.

Looking forward to the archeological audits of LLM-developed apps x years from now that are a total mystery to the product owners…


Thanks for commenting. We're always trying to learn more and iterate to make Corgea better. What should the fix have looked like?


If you don't know that - or rather, if nobody on your team recognized this issue and brought it up - you should not be selling and shipping this product.


Oh my amen.



