How does the buffering proceed quickly, but then the (slower than usual) periodic calls only copy a limited amount of data? That's the part that doesn't make sense to me. Normally if you're filling buffers, you copy as much data as is needed to keep the buffer full, no matter when you're invoked.
The handler always copies the same amount of data. It has to be called more often to copy a lot of data, which is what usually happens when the Android buffer isn't full. I agree with you that this isn't usually how this is done. Doing it this way means that Android has more freedom to schedule other tasks during the buffering phase.
Okay, so the real problem here is there was effectively a yield() inside a buffer fill loop with a small block size, and Android decided to make each of those cost 40ms. The article made it sound like you were always waiting 15ms, but you were actually waiting 0ms when there was more work to do.
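To put rough numbers on that, here's a back-of-the-envelope sketch. The buffer and block sizes are made up for illustration (the thread doesn't give the real ones), and copy time itself is ignored; only the 40ms per-call cost comes from the discussion above.

```python
import math

def fill_time_ms(buffer_bytes: int, block_bytes: int, delay_per_call_ms: float) -> float:
    """Time to fill a buffer when each handler call copies one fixed-size
    block and the scheduler adds delay_per_call_ms before the next call.
    The copy itself is treated as free."""
    calls = math.ceil(buffer_bytes / block_bytes)
    return calls * delay_per_call_ms

# Hypothetical sizes, chosen only to show the scaling.
BUFFER = 512 * 1024  # 512 KiB playback buffer
BLOCK = 4 * 1024     # 4 KiB copied per handler call

print(fill_time_ms(BUFFER, BLOCK, 0.0))       # re-invoked immediately
print(fill_time_ms(BUFFER, BLOCK, 40.0))      # each re-invocation costs 40 ms
print(fill_time_ms(BUFFER, 64 * 1024, 40.0))  # larger block, same 40 ms cost
```

With these assumed sizes the fill takes 128 calls, so a per-call cost that used to be roughly zero turns into seconds of delay, while a larger block cuts the number of calls and the delay along with it.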
I still think the block size should've been larger in this case, for various reasons. It's still a trade-off, but larger block sizes are usually mildly more CPU-efficient, besides preventing pathological scheduling cases like this one. That said, this explanation makes more sense.
Yes, that's all correct. The reason for the small block size has to do with design decisions made in the rest of the streaming stack, which is shared between all TV devices.
That is indeed interesting. The pipeline should be filling up if the OS scheduler delays the buffering for 40ms, giving enough data on the next call, unless the buffering isn't greedy enough or is aborted by the OS in some way.
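A toy model of the "not greedy enough" case, assuming the fixed-copy handler described earlier in the thread. The arrival rate, block size, and call count are invented for illustration, and `simulate` is a hypothetical helper, not anything from the actual streaming stack.

```python
def simulate(calls: int, interval_ms: int, arrival_bytes_per_ms: int,
             block_bytes: int, greedy: bool) -> tuple[int, int]:
    """Return (bytes copied, bytes still pending upstream) after `calls`
    handler invocations spaced interval_ms apart."""
    pending = 0  # data that has piled up in the pipeline
    copied = 0   # data delivered to the playback buffer
    for _ in range(calls):
        pending += arrival_bytes_per_ms * interval_ms
        # A greedy handler drains everything; a fixed handler takes one block.
        take = pending if greedy else min(block_bytes, pending)
        pending -= take
        copied += take
    return copied, pending

# Hypothetical numbers: 10 calls, 40 ms apart, 1000 bytes arriving per ms.
print(simulate(10, 40, 1000, 4096, greedy=False))
print(simulate(10, 40, 1000, 4096, greedy=True))
```

In this model the data is sitting there on every call, as you'd expect, but the fixed-size handler leaves most of it behind each time, so the backlog grows while the playback buffer stays starved.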