What would you consider an acceptable amount of cache storage for such an app?

I'm asking because one thing I don't like about streaming is that the content can be gone the next time you want to hear it. That's why I use youtube-dl to save videos (or often just the audio stream).

If you leave out the video stream, the files are quite tiny (in the context of modern storage). A quick scan of my saved .opus files shows roughly the same ~1MB/min file size that 128k MP3s have (except with way better quality, of course).
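
To put a number on that ~1MB/min figure, here is a rough back-of-the-envelope sketch. The function name is made up for illustration, and it assumes a constant bitrate with 1 MB = 1,000,000 bytes; real Opus files are variable bitrate, so actual sizes will wander around this.

```python
def megabytes_per_minute(bitrate_kbps):
    """Approximate file size per minute of audio at a constant bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8  # kilobits/s -> bytes/s
    return bytes_per_second * 60 / 1_000_000    # bytes/min -> MB/min


if __name__ == "__main__":
    per_minute = megabytes_per_minute(128)
    print(per_minute)        # 0.96
    # Rough storage needed to keep 100 hours of audio around:
    print(per_minute * 60 * 100, "MB")
```

So by this estimate a 128k stream lands just under 1 MB per minute, and even a hundred hours of cached audio would stay in the single-digit gigabytes.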

For simplicity, I wouldn't cache anything. Run youtube-dl with the JSON dump option to get all the sources, find the best source, put its URL into an <audio> element and that's it. At least that's how I'd do it.
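
Something like the following is roughly what I have in mind; treat it as a sketch, not a finished tool. It assumes youtube-dl is on your PATH and that the JSON dump contains a "formats" list whose entries carry "url", "vcodec", "acodec" and "abr" fields. "Best" here just means the highest audio bitrate among audio-only formats, which is my own simplification, and the helper names are made up.

```python
import html
import json
import subprocess
import sys


def fetch_info(page_url):
    """Run youtube-dl in JSON-dump mode and return the parsed metadata."""
    result = subprocess.run(
        ["youtube-dl", "--dump-json", "--no-playlist", page_url],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def pick_best_audio(info):
    """Return the audio-only format with the highest bitrate, or None."""
    formats = [f for f in info.get("formats", []) if f.get("url")]
    # Audio-only entries report vcodec == "none" (assumed field convention).
    audio_only = [
        f for f in formats
        if f.get("vcodec") == "none" and f.get("acodec") not in (None, "none")
    ]
    # Fall back to any format that carries audio if no audio-only one exists.
    candidates = audio_only or [
        f for f in formats if f.get("acodec") not in (None, "none")
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.get("abr") or 0)


def audio_tag(stream_url):
    """Build an <audio> element pointing at the chosen stream URL."""
    return '<audio controls src="{}"></audio>'.format(
        html.escape(stream_url, quote=True)
    )


if __name__ == "__main__":
    best = pick_best_audio(fetch_info(sys.argv[1]))
    if best is None:
        sys.exit("no audio source found")
    print(audio_tag(best["url"]))
```

Two caveats: the direct URLs may stop working after a while, so you would probably resolve them right before playback, and the browser has to support whatever codec the chosen source uses.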

