The buffered read didn’t do that, it used the default buffered reader implementation. IIRC that implementation currently defaults to 8kb buffer windows which is a little too small to be efficient enough for high throughput, but substantially more performant than making a syscall per byte, and without spending too much memory.