Looks really cool! Curious, do you have any thoughts/advice on how to improve agent reliability? I usually run into a lot of inconsistency when I need it to execute a workflow at even a small scale.
Also, how do you guys think about multi-agent workflows, i.e. having a couple of agents take actions in parallel? Wondering if it's possible to have two share a VM.
Yep, we've also run into inconsistency issues when trying to build with these agents. The biggest thing we've seen help is breaking the task down into smaller actions, effectively writing a script for the agent (e.g., go to google.com, type 'hello world', etc.). The more loaded the prompt, the more likely it is to go off the rails. We want to create more tools to help with reliability, but it's also something I hope improves relatively soon as the foundation model companies invest more here.
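A minimal sketch of that scripted approach, assuming a hypothetical `act(step)` callable that sends one small instruction to the agent and returns whether it succeeded. `run_steps`, `act`, and `stub_act` are illustrative names only, not part of any real product API, and the retry policy is an assumption added for illustration:

```python
from typing import Callable, List


def run_steps(
    steps: List[str],
    act: Callable[[str], bool],
    max_retries: int = 2,
) -> List[str]:
    """Send each small instruction separately, retrying a step if it fails.

    `act` is a hypothetical stand-in for whatever call hands one instruction
    to the agent and reports success or failure.
    """
    completed = []
    for step in steps:
        for _attempt in range(max_retries + 1):
            if act(step):
                completed.append(step)
                break
        else:
            # Every attempt failed: stop here rather than let later steps
            # run against a browser in an unknown state.
            raise RuntimeError(
                f"step failed after {max_retries + 1} attempts: {step!r}"
            )
    return completed


def stub_act(step: str) -> bool:
    """Placeholder agent that always succeeds; swap in a real agent call."""
    return True


if __name__ == "__main__":
    script = ["go to google.com", "type 'hello world'", "press Enter"]
    run_steps(script, stub_act)
```

Keeping each instruction small means a failure is pinned to one step, so only that step is retried instead of the whole workflow.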
In terms of multi-agent workflows, it's something we've been thinking about! We think this could be especially helpful when filling out a form, to speed things up even more. It's hard for me to think of other use cases where multiple agents might need to share a VM (as opposed to just spinning up another VM with another agent), but curious to hear your thoughts!