The audio trouble that's "not clipping" is perhaps a timing screw-up which means the audio frames aren't played back as intended.
Suppose we're working with 1024 frames, i.e. in stereo that's 2048 x 16-bit samples = 4096 bytes, enough to make about 23 ms of audio at 44.1 kHz. Due to a screw-up in the driver or a subtle difference in how the hardware works, we're only delivering those frames to the card just after they're needed. Maybe 4 frames of audio are emitted before our 4096 bytes arrive on the card, so those 4 frames are nonsense (maybe left over from the previous block). This sounds... not great.
In a "low latency" mode we send less data, more often (that's what "low latency" means). Maybe we're doing 64 frames instead of 1024, but still 4 of them are nonsense because they arrive too late. Now, instead of well under 1% of frames, it's around 6% of all frames that are nonsense (4 out of every 64). We can make out a tune, but it's going to sound terrible.
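To put numbers on that, here is a rough sketch of the arithmetic in Python. The names (`block_stats`, `late_frames`) are made up for illustration, and the fixed 4 late frames is just the figure assumed above, not a measurement.

```python
BYTES_PER_SAMPLE = 2      # 16-bit samples
CHANNELS = 2              # stereo
SAMPLE_RATE_HZ = 44_100


def block_stats(frames_per_block, late_frames=4):
    """Return (size in bytes, duration in ms, fraction of frames that are nonsense)."""
    size_bytes = frames_per_block * CHANNELS * BYTES_PER_SAMPLE
    duration_ms = 1000.0 * frames_per_block / SAMPLE_RATE_HZ
    garbage_fraction = late_frames / frames_per_block
    return size_bytes, duration_ms, garbage_fraction


if __name__ == "__main__":
    for frames in (1024, 64):
        size_bytes, duration_ms, garbage = block_stats(frames)
        print(f"{frames:5d} frames: {size_bytes:5d} bytes, "
              f"{duration_ms:6.2f} ms, {garbage:.2%} nonsense")
```

The same 4 late frames go from a fraction of a percent of a 1024-frame block to a sixteenth of a 64-frame block, which is why the low-latency setting exposes the problem.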
The fact that repeatedly restarting sometimes "fixes" it makes sense if each restart shifts, by fractions of a millisecond, the relationship between when exactly the frames get sent and when the card asks for them, and so changes how many frames are "lost". If we get it down to 0 frames "lost" it sounds perfect, 1-2 frames will sound much better than 4, while say 8 frames would sound atrocious.
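A toy model of that, assuming the only thing a restart changes is how late each block arrives (the function name `frames_lost` and the lateness values are hypothetical, picked to illustrate the scale involved):

```python
import math

SAMPLE_RATE_HZ = 44_100


def frames_lost(lateness_us, frames_per_block=64):
    """Frames the card plays before the block arrives, given lateness in microseconds."""
    if lateness_us <= 0:
        return 0  # data arrived on time or early, nothing is lost
    lost = math.ceil(lateness_us * SAMPLE_RATE_HZ / 1_000_000)
    return min(lost, frames_per_block)  # cannot lose more than a whole block


if __name__ == "__main__":
    for lateness_us in (-50, 20, 40, 90, 180):
        print(f"{lateness_us:4d} us late -> {frames_lost(lateness_us)} frames lost")
```

At 44.1 kHz one frame lasts about 23 microseconds, so a shift of well under a tenth of a millisecond is enough to move between "perfect", "tolerable" and "atrocious".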
I've gotten exactly this issue in PipeWire (https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/24...), where it would accidentally start playing audio before it started writing it, then write one more block of audio the moment the first block was fully depleted, and so on until the delay-locked loop gradually pulled PipeWire back into normal operation. The intended operation was that 1 block of audio was computed and written, then audio playback was started and a second block of audio was immediately computed and written, afterwards waiting for the first block to finish playing before generating a third block. Note that this PipeWire bug has been "fixed" (it still behaved oddly in some cases last time I tested; I don't know its current status).
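The difference between the two start-up orders can be sketched with a simplified timing model. This is not PipeWire's code: `slack_ms`, the fixed compute time, and the absence of the delay-locked loop are all assumptions made to keep the example small.

```python
def slack_ms(block_index, period_ms, compute_ms, buggy):
    """Time between a block being written and the card needing it (negative = late).

    Playback starts at t = 0 and block k is needed at k * period_ms.
    """
    needed_at = block_index * period_ms
    if buggy:
        # Playback is already running; each block only starts being
        # computed at the moment the previous one has drained.
        written_at = block_index * period_ms + compute_ms
    else:
        # Block 0 is written before playback starts; after that,
        # block k is computed while block k-1 is still playing.
        written_at = (block_index - 1) * period_ms + compute_ms
    return needed_at - written_at


if __name__ == "__main__":
    period_ms, compute_ms = 1.5, 0.25  # illustrative values only
    for k in range(4):
        ok = slack_ms(k, period_ms, compute_ms, buggy=False)
        bad = slack_ms(k, period_ms, compute_ms, buggy=True)
        print(f"block {k}: intended slack {ok:+.2f} ms, buggy slack {bad:+.2f} ms")
```

In this model the intended order always leaves nearly a full period of headroom, while the buggy order is late by the compute time on every block until something (in PipeWire's case, the delay-locked loop) shifts the schedule.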
I wrote a sound driver for BeOS, however I got strange glitches at what looked like random times.
The driver tried to output samples at a fixed rate, however after a lot of experimenting I found the glitches were caused when multiple interrupts conflicted (i.e. occurred at the same time) and the driver was thrown off in its Wait() timing call.
The answer was, instead of waiting for a certain time before outputting a sample, to read what time it was and output the sample that belonged to that time.
Worked perfectly, and just as important, reduced the CPU load to only a fraction of the first method.
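A minimal sketch of that idea in Python, not the original BeOS driver: the class name `ClockDrivenOutput` and its methods are invented here, and a real driver would read a hardware or monotonic clock rather than take `now_s` as an argument.

```python
SAMPLE_RATE_HZ = 44_100


class ClockDrivenOutput:
    """Emit whichever samples are due by the current time, instead of
    sleeping a fixed interval per sample and hoping the wake-up is punctual."""

    def __init__(self, start_time_s):
        self.start_time_s = start_time_s
        self.next_index = 0  # first sample index not yet emitted

    def due_samples(self, now_s):
        """Return the range of sample indices that should be emitted at now_s."""
        target = int((now_s - self.start_time_s) * SAMPLE_RATE_HZ)
        due = range(self.next_index, max(target, self.next_index))
        self.next_index = due.stop
        return due


if __name__ == "__main__":
    out = ClockDrivenOutput(start_time_s=0.0)
    # The second wake-up is delayed (say, by a conflicting interrupt),
    # so it simply emits more samples to catch up.
    for now_s in (0.001, 0.0035, 0.004):
        due = out.due_samples(now_s)
        print(f"t={now_s * 1000:.1f} ms: emit samples {due.start}..{due.stop - 1}")
```

Because the position is recomputed from the clock on every wake-up, a late wake-up does not accumulate into drift, and the loop can wake less often, which fits the lower CPU load described above.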
I also ran BeOS back in the day when I was still in the Mac cult and it was speculated that it would be the basis for the next MacOS. It was super fast UX, attractive for the time... but there were no applications! Was fun to poke around in, but ultimately had to boot into MacOS to do anything.
Oh man, I couldn't say for sure if it had sound or even what speed it was. I sold it after 6 months and built myself a dual PII-350 machine. That gem of a system ran BeOS exclusively like a bat out of hell.
I had an Athlon 800 machine that I used with BeOS as a daily-use machine. The biggest issue was that it played audio about 10% too fast. BeAmp would let you vary playback speed so I just tuned it to a tuning fork I had lying around. I wonder if that was related to SRS...
That might have just been luck if your 6400 isn’t brittle. In the early 2000s I volunteered in a lab full of 5260s, 5400s, and 5500s and the plastic was already brittle on the older units. The tabs on the motherboard tray were always the first to go.