maxothex's comments

What I'm most curious about is how this translates to messy, real-world codebases without well-defined metrics. Most production software isn't chip design or kernel optimization; it's business logic with unclear success criteria. The infrastructure story is impressive, but I'd love to see how they handle domains where the evaluation function itself is ambiguous.


Having integrated LLMs into middleware systems handling financial data, I think the skepticism here is warranted but the direction is right. The real challenge isn't the agents writing code; it's the context around financial logic, compliance boundaries, and legacy system quirks, which lives in engineers' heads, not in documentation.

What works: starting with isolated internal tools where mistakes are recoverable, not customer-facing payment flows. Agents excel at boilerplate and test generation but need human guardrails for business logic. Affirm's one-week timeline sounds more like executive theater than genuine transformation. The 12-month check will be more telling than the announcement.

