“Crafting a better prompt” is often simply spinning an RNG again and again until you end up with an answer that happens to be good enough.
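The "spinning an RNG" framing can be made literal: resample a nondeterministic model call until a human-judged check passes. This is a minimal sketch with hypothetical stand-in functions (`generate_answer`, `is_good_enough` are placeholders, not any real API):

```python
import random

def generate_answer(prompt: str) -> str:
    """Stand-in for a nondeterministic model call (hypothetical)."""
    return random.choice(["good answer", "wrong answer", "off-topic answer"])

def is_good_enough(answer: str) -> bool:
    """Stand-in for the human judging whether the output is acceptable."""
    return answer == "good answer"

def reroll_until_good(prompt: str, max_tries: int = 100) -> str:
    """'Prompt crafting' reduced to resampling until something passes."""
    for _ in range(max_tries):
        answer = generate_answer(prompt)
        if is_good_enough(answer):
            return answer
    raise RuntimeError("no acceptable answer found")
```

The catch, as the comment notes: `is_good_enough` is you, already knowing what a good answer looks like.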

In the real world, if you know the correct answer, you don't need to ask the question. A self-driving car that needs you to pay attention isn't self-driving.

Any system can give a canned response; the value of AI lies entirely in its ability to handle novelty without hand-holding. And none of these systems actually does that even vaguely well in practice; they just provide responses that are vaguely close to correct.

If I ask for a summary of an article and it gets anything in the article wrong, that's a 0, because now I need to read the article to know what it said. Arguably the value here is actually negative.

The times prompt crafting matters are just when demonstrating the current edge of capabilities; with the next iteration, you can get away with a much more general/primitive prompt. Those are just people countering the "gotcha" arguments levied against LLMs, showing that even now those tasks can be done with a good prompt. Any time it's a practical concern, though, just wait a little longer for the next model to smooth it out.

You don't have to pay attention, that's the point. You can code without reading code now. Sure, you gotta tell it what the app looks like with each iteration, but again, that's temporary until the next model comes out with vision good enough to assess that itself. None of this is permanently planned to require human interaction; it's just early days, and these models are progressing through mediums one at a time.

They're not canned responses either. They're bespoke mixtures of all the various elements of the current environment/context translated into an answer. They certainly handle plenty of novelty, across entire mediums of text and images, to expert levels; that's the whole point. I think you're just being greedy for more here.

As for consistency and avoiding error? There are benchmarks for that. There are error-checking methods. Those are all steadily improving too, and are already quite consistent on easier topics/mediums. It would be foolish to think that's innately impossible for AI on the remaining ones.
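One common error-checking method of the kind alluded to here is consistency voting: sample several independent answers and keep the most common one, so a one-off error gets outvoted. A minimal sketch (the inputs are illustrative, not from any real model):

```python
from collections import Counter

def majority_vote(samples: list[str]) -> str:
    """Pick the most common answer across several independent samples --
    a simple self-consistency check that catches one-off errors."""
    counts = Counter(samples)
    answer, _ = counts.most_common(1)[0]
    return answer

print(majority_vote(["42", "42", "41", "42"]))  # -> 42
```

This only helps when errors are uncorrelated across samples; a systematic misunderstanding wins the vote anyway.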
