Hacker News

Have you considered giving your digital twin a jolly aspect? I've wondered whether an AI video agent could be made to appear real-time, despite real processing latency, if the AI gave a hearty laugh before each of its responses.

>So Carter, what did you do this weekend?

>Hohoho, you know! I spent some time working on my pet AI projects!

I wonder if some standard set of personable mannerisms could be used to bridge the gap from 250ms to 1000ms. You don't need to have finished thinking about what the user said by the moment you notice they've stopped talking. Make the AI agent laugh, hum, or just say "yes!" before beginning its response.
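As a sketch of the timing idea: kick off the slow model call in the background, and play the filler immediately so the pause is covered. The `generate_response` and `speak` functions here are hypothetical stand-ins for the model call and TTS playback.

```python
import threading
import time

def generate_response(prompt, result):
    """Stand-in for the slow model call (~1000 ms of latency)."""
    time.sleep(1.0)
    result.append(f"Here's my answer to: {prompt}")

def speak(text):
    """Stand-in for TTS playback."""
    print(text)

def respond(prompt):
    result = []
    worker = threading.Thread(target=generate_response, args=(prompt, result))
    worker.start()              # start "thinking" immediately...
    speak("Hohoho, you know!")  # ...while the filler masks the latency
    worker.join()
    speak(result[0])
```

The filler plays within a few milliseconds of end-of-speech detection, so the user hears ~0ms of silence even though the real answer takes a second.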




I think I recall that Google did exactly this with their telephone bot (Google Duplex?), sneaking in very natural-sounding "um"s here and there to mask processing/network latency.


That's actually... clever and fair enough. That's what we use them for, too.


This is definitely a good idea! I think the hard part is making it contextual and relevant to the last question/response, in which case the LLM comes into the equation again. Something we're looking at though!


Perhaps use a small, fast LLM to maintain a rolling "disposition" state, and for each of a handful of dispositions, keep a handful of bridging emotes/gestures. The small LLM can classify the second-most-recent user input asynchronously to update the disposition, and in moments where it's not clear, just say "That's a good question," "Let me think about that," "I think that...," etc.
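A minimal sketch of that scheme: the disposition labels, bridging phrases, and `classify_disposition` function are all hypothetical, with a toy keyword classifier standing in for the small, fast LLM.

```python
import random

# Hypothetical dispositions, each with a few canned bridging emotes.
BRIDGES = {
    "amused":  ["Haha, good one!", "Heh, right?"],
    "curious": ["Ooh, interesting...", "Hmm, let me think about that."],
    "neutral": ["That's a good question.", "I think that..."],
}

def classify_disposition(utterance: str) -> str:
    """Toy stand-in for the small LLM: keyword-based disposition guess."""
    text = utterance.lower()
    if any(w in text for w in ("haha", "funny", "joke")):
        return "amused"
    if "?" in utterance or any(w in text for w in ("why", "what")):
        return "curious"
    return "neutral"

class DispositionTracker:
    """Rolling disposition state, updated from the second-most-recent
    user input so classification can run asynchronously while the main
    model is still handling the most recent one."""

    def __init__(self):
        self.history = []
        self.disposition = "neutral"

    def observe(self, utterance: str) -> None:
        self.history.append(utterance)
        if len(self.history) >= 2:
            # Classify the *previous* turn; the current one is still
            # being processed by the big model.
            self.disposition = classify_disposition(self.history[-2])

    def bridge(self) -> str:
        """Emit a bridging phrase instantly, masking main-model latency."""
        return random.choice(BRIDGES[self.disposition])
```

Call `observe` on each user turn and emit `bridge()` the instant end-of-speech is detected; the real response streams in behind it.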



