I’ve never worked with formal validation (barely remember my CS course). This release looks impressive. But I'm trying to wrap my head around the near-term practical applications for everyday software.
Right now, we see a lot of business experts in enterprises tempted to use AI to impl. business logic so they don't have to wait for (or pay) software experts.
Would this kind of technology help these users any time soon?
My current theory is that the real breakthrough for these non-developers will only happen when they can actually verify the result themselves without needing an another expert in the loop. But I don't see that with formal validation anytime soon.
Besides the aspect of self improvement I’m impressed with the ability to put a language specification into 3,200 token. This is like a micro-kernel which extends itself.
Now I wonder, shall we as humans still be able to read and review this code or is the sandbox + permission model as last line of defense sufficient?
The 3,200 token specification is not a formal one, I wonder whether having it formalized into a real grammar would help LLMs to generate code?
There seems to be some evidence that literate programming style comments help humans to comprehend code they don't know. I found a paper investigating this.
Some folks from Google tested 1) how good LLMs can update existing code with LP style comments and 2) if that helps humans to better understand that enhanced code.
(see the 2024 arXiv paper "Natural Language Outlines for Code").
If I remember correctly they had systematically tested how good humans could understand the enhanced code compared to no comments at all and they also tested different flavours of comments (line level, block level etc.).
The conclusion was: if you use the right amount of comments in the right style (intent explaining the purpose of the code on block level, not every line), it's very beneficial.
Right now, we see a lot of business experts in enterprises tempted to use AI to impl. business logic so they don't have to wait for (or pay) software experts. Would this kind of technology help these users any time soon?
My current theory is that the real breakthrough for these non-developers will only happen when they can actually verify the result themselves without needing an another expert in the loop. But I don't see that with formal validation anytime soon.
Do I overlook something?
reply