No problem! There are other ML TTS models that are both lightweight and can run ...

No problem!

There are other ML TTS models that are both lightweight and can run on a CPU. Check out Glow-TTS for something that will probably work.

Also swap out the HifiGan vocoder for Melgan or MB-Melgan as these will also better support your use case.

I ran this exact setup on cheap Digital Ocean droplets (without GPUs) and it ran faster than real time. It should work on a Pi.

Unfortunately I'm not aware of STT models that operate under these same hardware constraints, but you should be good to go for TTS. With a little bit of poking around, I'm sure you can find a solution for STT too.