Hacker News | FailMore's comments

Any ideas on how to solve the "agents don't have total common sense" problem?

I have found, when using agents to verify agents, that the verifying agent might observe something a human would immediately find off-putting and obviously wrong, but it raises no flags for the smart-but-dumb agent.


To clarify: you are using the "fast brain, slow brain" pattern? Maybe an example would help.

Broadly speaking, we see people experiment with this architecture a lot, often with a great deal of success. Another approach would be an agent orchestrator architecture with an intent-recognition agent that routes to different sub-agents.

Obviously, endless cases are possible in production, and the best approach is to build your evals using that data.
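A rough sketch of that orchestrator pattern, with a toy keyword classifier standing in for an LLM-based intent-recognition agent (all agent names and intents here are illustrative, not from any particular framework):

```python
def classify_intent(message: str) -> str:
    """Toy intent classifier; a real system would call an LLM here."""
    text = message.lower()
    if any(w in text for w in ("refund", "charge", "invoice")):
        return "billing"
    if any(w in text for w in ("error", "crash", "bug")):
        return "support"
    return "general"

def billing_agent(message: str) -> str:
    return f"[billing] handling: {message}"

def support_agent(message: str) -> str:
    return f"[support] handling: {message}"

def general_agent(message: str) -> str:
    return f"[general] handling: {message}"

SUB_AGENTS = {
    "billing": billing_agent,
    "support": support_agent,
    "general": general_agent,
}

def orchestrate(message: str) -> str:
    """Route each incoming message to the sub-agent matching its intent."""
    intent = classify_intent(message)
    return SUB_AGENTS[intent](message)

reply = orchestrate("I was double charged for my invoice")
```

The point of the pattern is that each sub-agent gets a narrow prompt and toolset, which tends to be easier to eval than one agent that does everything.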


The only solution is to train on the issue for next time.

Architecturally, that means focusing on episodic memory with a feedback system.

That training is retrieved the next time something similar happens.


Training is overkill at this point, imo. I have seen agents work quite well with a feedback loop, some tools, and prompt optimisation. When you say training, are you fine-tuning the models?

Nope - we just use a memory layer with a model routing system.

https://github.com/rush86999/atom/blob/main/docs/EPISODIC_ME...


Memory is usually slow, and I haven't seen many voice agents leverage it, at least. Are you building in text modality only, or audio as well?

Looks interesting, thanks.

Have you learnt anything from looking at the API calls?


That's only visible when it's not your personal conversation (you can't interact with someone else's). In a way, it's designed to be distracting so that you know to start your own conversation.

The building of the visualiser was less interesting to me than the result and your conclusion. I agree that finding new ways to ingest the structure and logic of software would be very useful, and I like your solution. Is there a way to test it out?

Yes, at the moment it's an issue of cost. I can't use the best models because they're not affordable. Hopefully, as performance improves over the years, this will become less of an issue. Maybe I can build in a web search to verify info, though...

I hear you. Yes, I think "seeding" an LLM with docs or other learning material is one of the fundamentals of effectively and efficiently using it for learning, maybe you can build more in that direction?

Yeah, I actually started there. https://dev.rebrain.gg has the old version up.

You upload a source and it generates questions from it. However, when showing it to friends, I found that the barrier to usage was too high, as most people don’t have a source ready. But I think adding it back as an option would be pretty cool and doable.


Yeah, I like your idea. Will implement.

Hey, thank you so much for the detailed feedback.

Re the more focused feedback, I totally agree re the questioning styles. In the prompt I ask it not to do so many multiple-choice questions, but I think it's addicted to them / the conversation history skews the context.

I'm going to introduce a settings panel (easily accessible during the conversation) that will let you switch to "chat mode" (to discuss instead of being asked questions) and configure the types of questions you're asked and their ratio (if I can get the LLM to oblige). I'm also going to see if I can come up with some question formats beyond multiple choice, free-form, and multi-select (which the LLM doesn't use too much).
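One way to sidestep the "get the LLM to oblige" problem: rather than asking the model to maintain a ratio across the whole conversation (which history tends to skew), sample the format per question from the configured weights and request exactly that format each turn. A minimal sketch, with hypothetical format names and weights:

```python
import random

# User-configured ratio from the hypothetical settings panel.
FORMAT_WEIGHTS = {"multiple_choice": 0.3, "free_form": 0.5, "multi_select": 0.2}

def pick_format(weights: dict[str, float], rng: random.Random) -> str:
    """Sample one question format according to the configured ratio."""
    formats = list(weights)
    return rng.choices(formats, weights=[weights[f] for f in formats], k=1)[0]

def build_question_prompt(topic: str, fmt: str) -> str:
    """A single required format per turn is much harder for the model to ignore."""
    return f"Ask the learner one {fmt.replace('_', ' ')} question about {topic}."

rng = random.Random()
fmt = pick_format(FORMAT_WEIGHTS, rng)
prompt = build_question_prompt("photosynthesis", fmt)
```

This moves the ratio out of the prompt and into deterministic code, so the conversation history can no longer pull the distribution off course.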


You have to create a conversation for yourself. Currently you can't interact with other people's conversations. Sorry this wasn't clear enough!

No passwords, FYI - I hate them too. It's Google sign-in or a "magic link". I know not everyone likes them, but I'm trying to keep friction as low as possible.


Hey, I appreciate your views (both positive and negative). There actually is a bit of a back story behind it all. I spent a month working on a different version of rebrain (still available here: https://dev.rebrain.gg). After putting that work in, I started to show it to some friends. It was clear from how they responded that it was not as promising as I had hoped. I listened to their feedback and came up with a different idea for LLM-based education (which is the version that's live now). I did vibe code it in about 3-4 days. But I vowed to try to get feedback sooner rather than later, which is why I posted it on Show HN pretty early in its development. I do want to improve it... so please let me know if there are things which really frustrate you.
