In my GB emulator I generated audio once per frame at the same time I flushed graphics to the screen. That's probably not a completely accurate way to do things, but it worked well enough.
Yeah, that generally works, and it's probably how I should have gone.
Though it breaks down when game devs use exploits for special sound effects. If you only calculate during vblank, you only ever see the sound registers and memory as they stand at the end of the frame, so anything the game changed mid-frame never makes it into the output.
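Here's a rough sketch of the difference, in Python with a made-up one-register "APU" whose output sample is just its volume register. `ToyApu`, `run_frame`, and the `catch_up` flag are hypothetical names for illustration, not anything from a real emulator, and the 95 cycles per sample is only an integer approximation of 4194304 Hz / 44100 Hz. The idea is that the accurate version renders audio up to the current cycle before applying each register write, while the once-per-frame version renders everything at the end from the final register state.

```python
CYCLES_PER_FRAME = 70224  # Game Boy: 154 scanlines * 456 clock cycles
CYCLES_PER_SAMPLE = 95    # assumption: roughly 4194304 Hz / 44100 Hz


class ToyApu:
    """Hypothetical one-register APU: each output sample equals `volume`."""

    def __init__(self):
        self.volume = 0
        self.synced_to = 0  # cycle up to which samples have been generated
        self.samples = []

    def sync(self, now):
        # Generate samples for the elapsed cycles from the current register state.
        while self.synced_to + CYCLES_PER_SAMPLE <= now:
            self.samples.append(self.volume)
            self.synced_to += CYCLES_PER_SAMPLE

    def write_volume(self, now, value, catch_up):
        if catch_up:
            # Render everything up to this write before the register changes.
            self.sync(now)
        self.volume = value


def run_frame(writes, catch_up):
    """Apply (cycle, value) register writes for one frame, return its samples."""
    apu = ToyApu()
    for cycle, value in writes:
        apu.write_volume(cycle, value, catch_up)
    apu.sync(CYCLES_PER_FRAME)  # the end-of-frame (vblank) flush
    return apu.samples


# The game sets volume 15 at the start of the frame and silences it halfway.
writes = [(0, 15), (35112, 0)]
per_frame = run_frame(writes, catch_up=False)
accurate = run_frame(writes, catch_up=True)
```

With these writes, `per_frame` comes out entirely silent because only the final register value is ever sampled, while `accurate` keeps the loud first half of the frame. A real APU has far more state than one register, so treat this as an illustration of the timing problem only.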