This sounds cool, but on the website I saw the previous version, where it's more of a passive device that listens, transcribes and saves. How does it record the screen, and doesn't capturing the screen and converting it into text take a lot of time? That would make it super slow, wouldn't it?
It's all trade-offs between price, speed and accuracy. There's no point using a free model when the latency is 10s+ and the throughput is sub-100 tokens/s, which is often the case on OpenRouter. I have to use a speedy provider like Groq and a small model.
Dumber models need a lot more context to correct the inaccuracies. I mostly use mid-tier models like Gemini 3 Flash to generate the boards, then the fastest models to answer questions (currently gpt-oss-120b on Groq).
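In case it helps, here's a minimal sketch of what that two-tier setup could look like. The model and provider names come from the comment above, but the structure, function names and exact thresholds are my own assumptions, not the author's actual code:

```python
# Rough quality-of-service floor from the comment above: reject providers
# with ~10s+ latency or sub-100 tokens/s throughput. Exact numbers assumed.
MAX_LATENCY_S = 10.0
MIN_THROUGHPUT_TOK_S = 100.0

# Two tiers: accuracy-sensitive board generation vs latency-sensitive Q&A.
TASK_MODELS = {
    "generate_board": {"model": "gemini-3-flash", "provider": "google"},
    "answer_question": {"model": "gpt-oss-120b", "provider": "groq"},
}

def pick_model(task: str) -> dict:
    """Return the model/provider for a task, defaulting to the fast tier."""
    return TASK_MODELS.get(task, TASK_MODELS["answer_question"])

def provider_ok(latency_s: float, throughput_tok_s: float) -> bool:
    """Filter out providers that miss the latency/throughput floor."""
    return latency_s < MAX_LATENCY_S and throughput_tok_s >= MIN_THROUGHPUT_TOK_S
```

The point is just that routing is a lookup on the task type, so the slow-but-smarter model only gets hit on the infrequent, accuracy-critical path.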