
From the article:

> Each core needs to generate a few thousand data packets per second, because Ethernet packets typically contain up to 1500 bytes. This gives the CPU around 100 microseconds to process each packet.

No it doesn't, not when using TCP Segmentation Offload (TSO).

This only works for one particular use case, sending static data over TCP, but that is the most common use case, since a typical "video streaming server" is actually a simple HTTP server that serves static MP4/MPEG-TS data.

For each connected client, this is what happens:
- nginx/apache calls sendfile(file, sock, off, <large_number>)
- the kernel issues a large (> 10kB) DMA read to the file storage backend into a set of memory pages and waits for completion
- the kernel allocates/clones a small IP/TCP header (40 bytes)
- the kernel gives that small header + set of memory pages to the network card, which segments the data into those 1500-byte packets and sends them on the wire
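As a rough illustration of the first step, here is a minimal Python sketch of the userspace side. The helper name `send_file` and the 1 MiB chunk size are my own choices for illustration, not what nginx or apache actually do internally; `os.sendfile` is the standard-library wrapper around the sendfile(2) syscall. Whether TSO kicks in afterwards depends on the NIC and driver, which this code cannot show.

```python
import os
import socket


def send_file(sock: socket.socket, path: str, chunk: int = 1 << 20) -> int:
    """Send the whole file at `path` over `sock` with sendfile(2).

    `chunk` is an arbitrary per-call upper bound (assumption for this sketch).
    Returns the number of bytes handed to the kernel.
    """
    sent_total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent_total < size:
            # The kernel moves file pages to the socket directly; the data
            # is never copied into this process's buffers.
            sent = os.sendfile(
                sock.fileno(), f.fileno(), sent_total,
                min(chunk, size - sent_total),
            )
            if sent == 0:  # file shrank while we were sending
                break
            sent_total += sent
    return sent_total
```

The loop is there because a single sendfile call may transfer fewer bytes than requested; the userspace process only passes offsets and counts, never the payload itself.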

If you have a lot of RAM, the read from storage can even be skipped, because previously read data pages are kept in the page cache with an LRU approach (this helps if clients are requesting the same file).

You can easily saturate a 10G link, with CPU cycles to spare, on cheap hardware with that approach; no need to bypass anything.
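A back-of-the-envelope sketch of why the per-packet budget in the quote does not apply here. The 64 KiB figure for how much data the kernel hands to the NIC per send is an assumption on my part (actual sizes depend on the NIC, driver and kernel settings), and header and framing overhead is ignored.

```python
LINK_BPS = 10e9          # 10 Gbit/s link
WIRE_PACKET = 1500       # bytes per on-wire packet, as in the article quote
TSO_SEGMENT = 64 * 1024  # ASSUMED bytes handed to the NIC per send with TSO


def units_per_second(link_bps: float, unit_bytes: int) -> float:
    """How many units of `unit_bytes` fit through the link each second."""
    return link_bps / 8 / unit_bytes


wire = units_per_second(LINK_BPS, WIRE_PACKET)
tso = units_per_second(LINK_BPS, TSO_SEGMENT)
print(f"{wire:,.0f} wire packets/s vs {tso:,.0f} TSO sends/s "
      f"({wire / tso:.1f}x fewer for the CPU)")
```

Under these assumptions the CPU deals with roughly 19,000 sends per second across the whole link instead of roughly 833,000 packets, which is where the spare cycles come from.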



