Ask HN: Some tips for evaluating intelligent agents?
2 points by bruturis 8 months ago | 5 comments
A friend of mine, a solo developer, is building a large platform of intelligent actors on top of LLMs. I think his platform is overly abstract and makes a lot of LLM calls. How can one measure how much more intelligently this platform behaves than vanilla GPT-4? I am thinking of some use case that would let him show the strength of his idea without incurring a huge cost.

Edit: while googling I found this one (), but I don't know what it would cost to test the platform with it.

() https://openreview.net/pdf?id=zAdUB0aCTQ




What do you mean by an "intelligent actors platform"?


The platform allows the agents to create new tools and organize knowledge. It is about 60,000 lines of code, the result of about six months of work at 10 hours a day, seven days a week. I hope my friend gets the funding he needs to continue the project. As Karpathy recommends in one of his talks: first get the system to top capability, then reduce costs.


How do you expect to measure something abstract that is not even defined yet?


By solving a hard problem, or by providing a new way of attacking some problem, an agent can show behavior that mimics intelligence. What kind of problem should this system attack?


That is my point - you start from the wrong end. You first choose the specific problem and then measure against that problem.
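A rough sketch of what "measure against that problem" could look like in practice, assuming you fix a small task set with automatic checks and run both systems over it. Everything here (`Task`, `Report`, `evaluate`, and the two stub solvers) is a hypothetical name made up for illustration; the stubs stand in for real calls to vanilla GPT-4 and to the platform, which this sketch does not make.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a solver takes a prompt and returns (answer, number_of_llm_calls).
Solver = Callable[[str], Tuple[str, int]]


@dataclass
class Task:
    prompt: str
    check: Callable[[str], bool]  # automatic pass/fail check for an answer


@dataclass
class Report:
    solved: int
    total: int
    llm_calls: int

    @property
    def success_rate(self) -> float:
        return self.solved / self.total if self.total else 0.0


def evaluate(solver: Solver, tasks: List[Task]) -> Report:
    """Run one solver over a fixed task set, counting successes and LLM calls."""
    solved = 0
    calls = 0
    for task in tasks:
        answer, used = solver(task.prompt)
        calls += used
        if task.check(answer):
            solved += 1
    return Report(solved=solved, total=len(tasks), llm_calls=calls)


# Hypothetical stand-ins: replace with real calls to the baseline model
# and to the agent platform.
def baseline_stub(prompt: str) -> Tuple[str, int]:
    return "4", 1  # always answers "4" with a single call


def platform_stub(prompt: str) -> Tuple[str, int]:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, ""), 5  # pretends to use five calls per task


TASKS = [
    Task("What is 2 + 2?", lambda a: a.strip() == "4"),
    Task("Capital of France?", lambda a: "paris" in a.lower()),
]

if __name__ == "__main__":
    for name, solver in [("baseline", baseline_stub), ("platform", platform_stub)]:
        r = evaluate(solver, TASKS)
        print(f"{name}: {r.solved}/{r.total} solved, {r.llm_calls} LLM calls")
```

Reporting the number of calls next to the success rate keeps the cost question visible: the platform only looks better if the extra tasks it solves justify the extra calls.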



