Hi HN,
I'm sharing Monika, an open-source AI voice assistant I've built. The main focus was running the speech components locally, both to improve privacy and to make interactions sound more natural.
Key components:
* Speech-to-Text: Uses OpenAI's Whisper running locally.
* Text-to-Speech: Uses RealtimeTTS with the Orpheus model for emotional expression, also running locally.
* Language model: Uses Google Gemini on the backend.
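To make the architecture concrete, here's a hedged sketch of how one conversational turn flows through those three components. The function names (`transcribe`, `generate_reply`, `speak`) are stand-ins I invented for this illustration, not Monika's actual API; in the real project they would wrap Whisper, Gemini, and RealtimeTTS respectively.

```python
# Illustrative turn loop: speech in -> text -> reply -> synthesized speech.
# All three functions are stand-ins for the real libraries named in the post.

def transcribe(audio: bytes) -> str:
    """Stand-in for local Whisper speech-to-text."""
    return "hello there"

def generate_reply(text: str) -> str:
    """Stand-in for the Gemini backend call."""
    return f"You said: {text}"

def speak(text: str) -> str:
    """Stand-in for RealtimeTTS/Orpheus audio playback."""
    return f"[audio] {text}"

def handle_turn(audio: bytes) -> str:
    """One conversational turn: user speech in, assistant speech out."""
    user_text = transcribe(audio)
    reply = generate_reply(user_text)
    return speak(reply)

print(handle_turn(b"\x00\x01"))  # [audio] You said: hello there
```

The point of the sketch is just the data flow: only the middle step (the language model) leaves the machine; transcription and synthesis stay local.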
It includes Voice Activity Detection (VAD) and a basic web interface using Flask. The idea was to see how well local STT and expressive local TTS could work together for a conversational agent.
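For anyone curious what the VAD step does: below is a minimal energy-threshold sketch in pure Python. This is only an illustration of the idea; the names `frame_energy`/`is_speech` and the fixed threshold are my assumptions, and the project's actual detector may well be model-based rather than a simple energy gate.

```python
# Minimal energy-based voice activity detection, assuming frames of
# signed 16-bit PCM samples as plain Python ints. Illustrative only.

def frame_energy(samples):
    """Root-mean-square energy of one audio frame."""
    if not samples:
        return 0.0
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def is_speech(samples, threshold=500.0):
    """Classify a frame as speech when its RMS energy exceeds the threshold.

    The threshold is a tuning parameter; real systems typically adapt it
    to the ambient noise floor instead of hard-coding a value.
    """
    return frame_energy(samples) > threshold

# A loud frame passes the gate; near-silence does not.
loud = [4000, -3800, 4200, -4100] * 100
quiet = [10, -12, 8, -9] * 100
print(is_speech(loud), is_speech(quiet))  # True False
```

In a pipeline like this, frames gated out as silence are simply dropped, so the STT model only ever sees audio that plausibly contains speech.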
Tech stack: Python, Flask, Whisper, Gemini, RealtimeTTS.
Video demo: https://www.youtube.com/watch?v=_vdlT1uJq2k
The project is MIT licensed. I'd appreciate any feedback, thoughts on the approach, or suggestions you might have!