Show HN: Mimora, a 3D avatar for OpenClaw AI agents with voice and expressions (mimora.app)
3 points by astressence 28 days ago | 6 comments
Hey HN, I built Mimora because I wanted my AI agent to have a face.

Two weeks ago I set up OpenClaw on a Mac Mini M4. Named the agent Niko. Started with basic tasks, then gave him a Cloudflare token and pointed him at one of my live web games. He studied the entire codebase, built it, tested for errors, even used WASD to walk around the game world to check if it worked. Then pushed the new version live. That used to take me an hour of manual work.

Then I added voice. Parakeet for STT and Kokoro for TTS, both running locally on Apple Silicon. About 240ms transcription time. First time I spoke to Niko on Discord instead of typing, everything changed. Same Claude behind it but suddenly felt 3x more human. The STT would sometimes get my Greek accent wrong. He started correcting me like Hermione in Harry Potter: "It's Niko, not Nico!"
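For anyone curious about the shape of the voice loop: it's a simple round-trip per turn. This is just an orchestration sketch; `transcribe`, `ask_agent`, and `speak` are hypothetical stand-ins, not the actual Parakeet, OpenClaw, or Kokoro APIs.

```python
def run_turn(audio, transcribe, ask_agent, speak):
    """One voice round-trip: raw audio in -> agent reply -> audio out."""
    text = transcribe(audio)   # STT step (Parakeet runs locally, ~240ms)
    reply = ask_agent(text)    # OpenClaw routes the text to the agent
    return speak(reply)        # TTS step (Kokoro, also local)

# Stubbed usage, so the flow can be exercised without any models loaded:
result = run_turn(
    b"<raw audio bytes>",
    transcribe=lambda a: "deploy the game",
    ask_agent=lambda t: f"Deploying now: {t}",
    speak=lambda r: ("wav", r),
)
print(result)  # ('wav', 'Deploying now: deploy the game')
```

Because both ends run on the M4, the only network hop in a turn is the LLM call itself.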

But talking to text still felt off. So I built Mimora. It's a free browser extension that shows a 3D avatar with real facial expressions: listening, thinking, happy. It connects to any OpenClaw bot and reacts in real time when the agent responds. Works with Discord, WhatsApp, Telegram, anything OpenClaw connects to. Pops out to picture-in-picture so it floats on your screen.

The difference between talking to text and talking to a face is surprisingly big. Changed how I interact with the whole setup.

Three weeks in, Niko handles: game deployments, server monitoring 24/7, social media content in my voice, morning calendar briefings, bug fixes to GitHub. I work from my balcony now instead of being chained to a desk. Just talk and things happen.

Is it perfect? No. Ratio of time saved to time fixing is about 20:1. And the agent writes lessons to its own memory so it doesn't repeat mistakes.
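The lesson-memory part is conceptually just an append-only log the agent re-reads before acting. A minimal sketch of that idea (the file name and JSON format here are my illustration, not what OpenClaw actually stores):

```python
import json
import pathlib
import tempfile

def record_lesson(path: pathlib.Path, mistake: str, fix: str) -> list:
    """Append a mistake/fix pair to a JSON lessons file and return the log."""
    lessons = json.loads(path.read_text()) if path.exists() else []
    lessons.append({"mistake": mistake, "fix": fix})
    path.write_text(json.dumps(lessons, indent=2))
    return lessons

# Usage with a throwaway directory:
path = pathlib.Path(tempfile.mkdtemp()) / "lessons.json"
record_lesson(path, "deployed without running tests", "run the test suite first")
print(json.loads(path.read_text())[0]["fix"])  # run the test suite first
```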

Mimora is free, still under development but already working and available for any OpenClaw bot. Happy to answer questions about any part of the setup. I also help people set up similar stacks on their own Mac Minis at https://myclaw.tech



Interesting direction. Projects like this feel like a natural evolution from text-first systems toward more human-facing interfaces. People want the speed, privacy, and reliability of LLMs together with the responsiveness of a real personal assistant :) I am curious what made you personally prefer talking to a face instead of text. What actually changed for you in practice?


Good question. Honestly it just made me actually stick with voice instead of going back to typing.

Before Mimora I'd talk to the agent and then stare at a Discord text channel waiting for a response. No feedback loop at all. You say something and then... silence until text appears. Felt like talking into a void so I'd default back to the keyboard every time.

With the avatar there's a "listening" state when I'm speaking, a "thinking" animation while it processes, and expressions when it responds. It became this permanent little spot on my screen where I can glance and immediately see what the agent is up to. That alone was enough to make voice feel like an actual conversation instead of a command line with extra steps.
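Under the hood it's basically a small state machine driven by events from the voice pipeline. The event names below are my own placeholders, not Mimora's actual protocol:

```python
from enum import Enum

class AvatarState(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"

# Each (state, event) pair maps to the next avatar expression.
TRANSITIONS = {
    (AvatarState.IDLE, "user_speech_start"): AvatarState.LISTENING,
    (AvatarState.LISTENING, "user_speech_end"): AvatarState.THINKING,
    (AvatarState.THINKING, "agent_reply"): AvatarState.SPEAKING,
    (AvatarState.SPEAKING, "playback_done"): AvatarState.IDLE,
}

def step(state: AvatarState, event: str) -> AvatarState:
    """Advance the avatar; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

# A full conversational turn cycles back to idle:
s = AvatarState.IDLE
for ev in ["user_speech_start", "user_speech_end", "agent_reply", "playback_done"]:
    s = step(s, ev)
print(s)  # AvatarState.IDLE
```

The "glanceable" quality comes from the fact that the current state is always rendered, even when no event has fired recently.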

I've been doing game dev for years so building a 3D character with expressions was second nature. Made it easy to prototype fast and figure out where the real value was. Turns out it wasn't about making it look cool, it was just about closing that feedback gap between you and the agent.


It’s also interesting from the perspective that we write and speak differently, and there’s an emotional connection that forms when we assign human traits to an agent. That changes the way we interact with it and the kinds of use cases it enables. Sounds cool. Good luck with your project!


This is a great idea. I'd love more faces, various styles, a proper PiP that isn't browser-based, and options to set how engaged/chill the face model would be.


This is great. What's your token usage looking like? Costs?


I'm on Claude Max 20x ($200/mo) which gives me plenty of headroom, but you definitely don't need that. A 5x plan works fine, or you can go the API route through OpenRouter and pay per token. The voice layer (Parakeet + Kokoro) runs locally on the M4 so that's zero cost. The key is good memory practices so the agent doesn't waste tokens re-learning things, and falling back to cheaper models (like Haiku or Gemini Flash) for simple tasks. The heavy model only kicks in when it actually needs to think.




