Production failures caused by bugs in TCP-based servers and clients happen all the time. A long time ago I invested a little time in learning basic socket programming in C, and the ROI has been spectacular. "Unix Network Programming" by Stevens is a great book on the topic; in particular, it carefully discusses all the different ways things can go wrong. I wish there were something more up to date to recommend, but I don't know of anything of comparable quality, so I think you still have to read it and then work out the Linux-specific details and more modern APIs like epoll yourself.
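
To give one concrete example of the kind of thing that goes wrong: `read` and `write` on a TCP socket can transfer fewer bytes than requested, or be interrupted by a signal, and code that ignores this works fine in testing and then breaks in production. Below is a rough sketch in the spirit of the book's `readn`/`writen` helpers. The names `write_all` and `read_all` are my own, and the sketch assumes a blocking descriptor (on a non-blocking socket, `EAGAIN` would simply come back as an error here).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly n bytes to fd, retrying after short writes and EINTR.
 * Returns n on success, -1 on error (errno is left as set by write).
 * Hypothetical helper name; assumes fd is in blocking mode. */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error */
        }
        p += w;                 /* short write: advance and loop */
        left -= (size_t)w;
    }
    return (ssize_t)n;
}

/* Read up to n bytes from fd, looping over short reads and EINTR.
 * Returns the number of bytes read, which is less than n only if
 * the peer closed the connection first, or -1 on error. */
ssize_t read_all(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error */
        }
        if (r == 0)
            break;              /* EOF: peer closed the connection */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}
```

A caller would check the return value against the length it asked for: a short count from `read_all` means the peer went away mid-message, which is exactly the case naive code tends to miss.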

