tzafrir's comments

I built a methodology at Fiverr Labs for generating agent prompts from product specs using tests instead of manual prompt engineering. You write a behavioral spec, a coding agent generates tests from it, and a second agent iterates on the prompt until tests pass. Hidden test splits and mutation testing address specification gaming.
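To make the loop above concrete, here is a minimal sketch of the spec-to-tests-to-prompt cycle with a hidden test split. It is not the code from the linked repository. Every name in it (`split_tests`, `compile_prompt`, `revise`, the `RULES` table) is hypothetical, the "tests" are simple string checks standing in for real agent runs, and the "prompt-editing agent" is a stub that appends rules. Mutation testing is left out.

```python
import random
from typing import Callable, Dict, List, Tuple

# A test is a (name, check) pair; check takes the prompt and returns pass/fail.
# Assumption: real tests would run the agent, these only inspect the prompt text.
Test = Tuple[str, Callable[[str], bool]]

# Hypothetical behavioral spec: test name -> rule the prompt must contain.
RULES: Dict[str, str] = {
    "asks_for_order_id": "Always ask for the order ID.",
    "caps_refunds": "Never approve refunds above the limit.",
    "stays_polite": "Stay polite.",
    "escalates_legal": "Escalate legal threats to a human.",
}

def build_tests(rules: Dict[str, str]) -> List[Test]:
    """Stand-in for the agent that generates tests from the spec."""
    return [(name, lambda p, r=rule: r in p) for name, rule in rules.items()]

def split_tests(tests: List[Test], hidden_fraction: float = 0.25,
                seed: int = 0) -> Tuple[List[Test], List[Test]]:
    """Shuffle and split into (visible, hidden); hidden tests are held out."""
    shuffled = tests[:]
    random.Random(seed).shuffle(shuffled)
    n_hidden = max(1, int(len(shuffled) * hidden_fraction))
    return shuffled[n_hidden:], shuffled[:n_hidden]

def failing(prompt: str, tests: List[Test]) -> List[str]:
    """Names of the tests the prompt does not pass."""
    return [name for name, check in tests if not check(prompt)]

def revise(prompt: str, failures: List[str]) -> str:
    """Stub for the prompt-editing agent: append a rule per failing test."""
    return prompt + "\n" + "\n".join(RULES[name] for name in failures)

def compile_prompt(seed_prompt: str, visible: List[Test],
                   max_iters: int = 10) -> str:
    """Iterate on the prompt until all visible tests pass or the budget ends."""
    prompt = seed_prompt
    for _ in range(max_iters):
        failures = failing(prompt, visible)
        if not failures:
            break
        prompt = revise(prompt, failures)
    return prompt

visible, hidden = split_tests(build_tests(RULES))
compiled = compile_prompt("You are a support agent.", visible)
print("visible failures:", failing(compiled, visible))
print("hidden failures:", failing(compiled, hidden))
```

Because the stub only ever patches the failures it is shown, the held-out tests still fail here, which is exactly the kind of overfitting to visible tests that a hidden split is meant to expose.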

Evaluated on 4 agent specs across 24 trials: 92% compilation success at $2–3 per compilation. The benchmark and all code are open at https://github.com/f-labs-io/tdad-paper-code

Happy to discuss the methodology, its limitations, and directions for follow-ups.


A quick hack I wrote using IDF's data feed.

Other pages using the feed focused on alarming citizens; I made this one to give a feeling of what it's like here.


Thank you.

