You just have to use select() correctly:
1) You can raise the 1024 limit on fd_set size with "#define FD_SETSIZE 65536" (on SunOS/Solaris this is required to get select_large_fdset() instead of plain select()) and by allocating the memory for the fd_set yourself.
2) Do not loop over all descriptors calling FD_ISSET to check whether each one is in the set. Instead, scan the fd_set one word at a time: if a word is non-zero, go and examine each bit of that word (see how the Linux kernel does it).
3) The other thing is to limit the number of select() calls you make per second, doing short sleeps if needed. That lets events be processed in batches, so the cost of the select() calls becomes relatively small compared to the "real work" done. It also increases latency, but you can work out a reasonable number of select() calls per second. I got this idea from "Efficient Network I/O Polling with Fine-Grained Interval Control" by Eiji Kawai, Youki Kadobayashi, and Suguru Yamaguchi.
I learned how to use accept() correctly from "Scalability of Linux Event-Dispatch Mechanisms" by Abhishek Chandra and David Mosberger. The main idea is to call accept() in a loop until EAGAIN or EWOULDBLOCK is returned, or until you have accepted "enough" connections.
I don't get why the author claims that epoll fixes a problem with registering and unregistering descriptors. With epoll, adding or removing a descriptor is a system call; with select() you just modify a local data structure and call select() once you're done adding and removing all the descriptors. And you shouldn't call accept() from multiple threads; a single thread calling accept() is enough for most of us, unless you're web scale ;-D