

Confounding, External/Internal Validity - twampss
http://sheddingbikes.com/posts/1280941498.html

======
parfe
I'm glad Shaw wrote something about the response he received here. The
original Poll/Epoll thread was full of garbage posts and "stop energy". A lot
of smug opinions and naysaying without much useful input or information.

And the one smack talker who actually went out and tested a server showed 40%
ATR which I felt supported Zed's point that you might be running a server that
could be hurt by exclusively using epoll.

And the best part is Zed was the one putting in work to try a better way. For
a site so full of startup motivation and talk of going out and trying
something new, it's hard to see the responses in the epoll/poll thread as
anything but bitter or prejudiced against Zed.

~~~
jacquesm
Thanks, that was me I believe, and wouldn't you believe it, poll is even
better than what Zed measured:

<http://news.ycombinator.com/item?id=1575361>

Unfortunately only if you overlook how that utility was meant to be used,
which is of course a minor detail.

~~~
parfe
I didn't want to call anyone out by name, but yeah :)

I think Zed's (future) attempt at unifying the two polling methods via ZeroMQ
is interesting and if he thinks it will improve his system he should try it.

I severely dislike the idea being pushed in that original thread that polling
was settled and he was wasting his time. Especially the "premature
optimization" crap when Mongrel2 works well enough that he thinks he can spend
time on this detail. I think it's great he's spending his time trying
something new and sharing the results.

~~~
jacquesm
I'm fine with that too, but if you put your stuff out for the world to see
expect to be criticised and learn how to respond with grace instead of with
bile.

A lot of people have been thrown off HN for considerably less than what 'Zed'
gets away with and it seems to me that if you write "I learned the hard way
that programmers listen to the biggest asshole, not the calm reasoned ones."
that you're on the wrong track.

Evidence beats arrogance and bullying any day. So, instead of responding in
kind I decided to respond in code and with some figures as well as an analysis
of what that utility is all about.

Turns out that the way Zed uses it is in no way representative of a real world
situation, even in simulation.

~~~
zedshaw
I'm actually glad that you stopped spreading your FUD and started looking at
it realistically. Thank you for that.

But look at it from my perspective. _You_ seem like the one spreading bile and
being obnoxious because I come to that thread expecting a reasoned debate and
instead I see you replying to every thread that disagreed with you and putting
words in my mouth.

You changed your story multiple times, just quoted numbers without actual
data (although I see you might have some real data this time), and basically
came off like yet another person who's more interested in winning by rhetoric
than in trying to analyze the problem.

So to me, your behavior made you come off as yet another one of those assholes
whose only goal is to troll. Now that you're actually collecting numbers and
participating in the debate I'll start to listen.

Incidentally, one of my blogs is already banned from HN because they don't
like it. Same as any negative news about Mahalo or other "friends". Eventually
the only thing left here will be a bunch of douchebags and neckbeards fighting
each other for the next YC round of scraps if the trend continues. So, you
should hope that they don't ban more people because trust me, tons of you guys
would get banned hard.

~~~
jacquesm
> So to me, your behavior made you come off as yet another one of those
> assholes whose only goal is to troll. Now that you're actually collecting
> numbers and participating in the debate I'll start to listen.

Let me get this straight, if the numbers I collect aren't to your liking then
I'm a troll and it's all FUD and if I confirm the figures that you come up
with then I'm realistic and you'll start listening?

And you call yourself a scientist?

Come on man, you can do a whole lot better than that.

Those numbers from yesterday are _just_ as valid as the ones that I presented
today, both were measured to the best of my ability and I'm willing to back up
both sets with more research if that's what is required.

You make the classic mistake of attacking the messenger and not the message
when you don't agree with the contents of the message, and you do so with a
combination of handwaving and bullying that is not becoming of you.

You might think that people listen to the greatest assholes rather than to the
people with the well reasoned arguments but you have to ask yourself if you
want groupies and yes-men or people that you can learn from and that can learn
from you as your audience.

Admittedly that might make you less of a 'rock star hero', but it's easy to be
one of those to an audience of wannabes.

You're a pretty good coder, but you're not above making mistakes and in this
case I think you've made several, including assuming that everybody that
criticises you doesn't know what they're talking about. Surprise, some do.

Now you have to go and wonder what else it is that you're missing. If you
program open source code you have to deal with a lot of flak, so I can see
where your defensive response is coming from.

However, the first time I wrote a single-threaded asynchronous polling web
front end (in 1998) I spent an awful lot of time analysing the traffic before
I wrote code and it is based on that and lots of more recent work (such as
building a CDN for a very highly trafficked website) that I criticised your
decision to work on this now as premature optimisation, and all the other
research that I've dug up in the last two days seems to confirm that, at least
as far as I can tell.

So, go do your 'superpoll' thing. But unless you plan on having an open mind
and being in 'receive' rather than 'send' mode all the time you probably
shouldn't call it science.

And then, when you've done your superpoll thing and you test the hell out of
it and you conclude that it was worth it I'll be suitably impressed and I'll
apologise to you for being right where I was wrong. I couldn't care less, I've
been wrong before and even though I have a bunch of data to back up my
statements I still call it 'my opinion' instead of calling it science.

Note that none of the above is a personal attack on you, so if you could
refrain from further personal attacks on me or anybody else that is critical
of your work (and not of you) then that would be a dramatic improvement.

Oh, and thanks for the work on Mongrel2, and any other work that you've done
in the past on open source projects. I wish I were selfless enough to write
code and throw it out into the world only to get flak in return. I often
wonder how guys like you, Linus, Guido and so on manage to do it.

------
chipsy
People cover their ears when they find out a benchmark is synthetic, because
it goes against cargo-cult "profile then optimize, no premature" type advice.
Never mind that plenty of situations exist where the smallest-factors,
one-at-a-time, scientific way is the best way to get good information. Or that
said advice is based on the case of banging out a readable, working program,
not making the best-performing program in its class.

------
losvedir
So... Does the active/total file descriptor ratio ever go above .6? It looks
like from Zed's tests that below that ratio epoll is invariably faster, and
above that ratio poll is. Pretty straightforward, and Zed did a terrific job
of elucidating the key distinction.

The natural question at this point is whether that ratio ever does go above
.6, and honestly... I have no idea because I don't even really know what
active or total file descriptors are! But it seems like it should be easy to
find out, if people just ran a few tests, yeah?

I only bring this question up because the last section of the article asks why
everyone just assumed epoll was always faster. A potential answer is that if
the ratio never goes above .6 then epoll IS always faster.

In any case, I have no vested interest - as I said I don't even really know
what these things are. I'm just trying to understand the human side about why
people believe what they do.
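
For anyone else unsure what "active" and "total" file descriptors mean here,
a minimal sketch may help. It assumes "total" is every descriptor the server
is watching and "active" is the subset that has data ready right now. The
function name `measure_atr` and the numbers are made up for illustration, and
it uses `select.select` only because it is portable; it says nothing about
which of poll or epoll is faster.

```python
import select
import socket


def measure_atr(total, active):
    """Watch `total` sockets, make `active` of them readable, and return
    (ready_count, total, ratio) as reported by select()."""
    pairs = [socket.socketpair() for _ in range(total)]
    try:
        # Writing one byte makes the matching reader "active".
        for writer, _reader in pairs[:active]:
            writer.send(b"x")
        readers = [reader for _writer, reader in pairs]
        ready, _, _ = select.select(readers, [], [], 1.0)
        return len(ready), total, len(ready) / total
    finally:
        for a, b in pairs:
            a.close()
            b.close()


ready, total, atr = measure_atr(total=10, active=7)
print(f"{ready} of {total} descriptors active, ATR = {atr:.2f}")
```

On a real server the interesting number is how this ratio behaves over time
under production traffic, which a synthetic setup like this cannot tell you.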

------
j_baker
"If you ignore them then huge swathes of people will simply blindly believe
anything the trolls say no matter how wrong or weird it is. I guess it's a guy
thing where they'll just believe whoever's tallest, and "tall" on a forum is
who's most obnoxious and writes the most."

This is a textbook example of a vocal minority. And social psychologists know
their effects well:
<http://www.spring.org.uk/2007/07/loudest-voice-majority-opinion.php>

------
jfager
Zed, would you stop with the strawman bullshit already? Nobody serious ever
claimed epoll was always faster than poll. The very graphs you linked to in
your original post, the ones that you said were "flat out" wrong, themselves
showed that poll was faster with smaller numbers of inactive fds on certain
benchmarks.

And you don't need a benchmark to find it out, either. Just think about it.
Poll doesn't scale because even when you have a ton of dead connections, you
still have to iterate over every single fd that you're tracking when you're
notified of an event. The array that you get back from epoll only contains
active fds. As the active/total ratio goes up, you're eventually going to be
doing just as much work as you would be doing with poll (iterating over every
single fd), except you're also going to have taken the hit of the work to
support epoll itself.

In other words, when ATR reaches a certain threshold, setting aside the
possibility of a confounding (there you go, I used your word du jour)
implementation detail, there's no possible way epoll can be as fast as or
faster than poll. It will always be slower. And if you never realized it, it's
simply because you didn't stop and give it 5 seconds of thought, not because
you were lured in by the malicious lies of the ignorant epollers.

The only interesting thing your original post illustrated vis-a-vis poll vs
epoll was what the actual threshold was for your particular machine and
benchmark.
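
The threshold argument above can be written down as a toy cost model. This is
only a sketch: the per-fd costs `scan_cost` and `event_cost` are arbitrary
placeholders, not measurements, and the function names are hypothetical. It
assumes poll's work grows with the total number of fds while epoll's grows
with the number of active fds at a higher per-fd cost.

```python
def poll_cost(total, active, scan_cost=1.0):
    # poll() hands back the whole array, so every fd gets checked.
    return total * scan_cost


def epoll_cost(total, active, event_cost=2.0):
    # epoll_wait() returns only ready fds, at a higher assumed per-fd cost.
    return active * event_cost


def break_even_atr(scan_cost=1.0, event_cost=2.0):
    # Solve active * event_cost == total * scan_cost for active / total.
    return scan_cost / event_cost


total = 1000
for active in (10, 250, 500, 750, 1000):
    p = poll_cost(total, active)
    e = epoll_cost(total, active)
    winner = "epoll" if e < p else "poll" if p < e else "tie"
    print(f"ATR {active / total:.2f}: poll={p:.0f} epoll={e:.0f} -> {winner}")
```

With these made-up constants the crossover sits at an ATR of 0.5. Where it
lands on real hardware depends on the actual costs, which is exactly what a
benchmark has to measure.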

Superpoll is an interesting idea, but there are serious issues with the
approach you described. By adding the overhead of managing the epoll/poll
migrations, you're inherently going to negatively impact the pure performance
of both. I might have read it wrong, but it also sounded like you intended to
start connections off in poll and migrate them over to epoll as their activity
ratio went down. But with that approach, if you're getting spiky traffic that
hits once and then just waits for the connection to time out, you're getting
the absolute worst of poll, and then possibly spending the time to migrate
over to epoll where it's just going to wait to die anyways.
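
To make the spiky-traffic concern concrete, here is a bookkeeping-only sketch
of how I read the hybrid idea. Everything in it is an assumption on my part:
the class name `HybridSet`, the `idle_limit` rule, and moving a connection
back to poll when it becomes active again are all hypothetical, and no real
poll or epoll calls are made.

```python
class HybridSet:
    """Tracks which hypothetical backend each connection currently sits in."""

    def __init__(self, idle_limit=3):
        self.idle_limit = idle_limit  # idle ticks before demotion (assumed)
        self.backend = {}             # conn id -> "poll" or "epoll"
        self.idle_ticks = {}          # conn id -> consecutive idle ticks
        self.migrations = 0

    def add(self, conn):
        # New connections start in the poll set, as described above.
        self.backend[conn] = "poll"
        self.idle_ticks[conn] = 0

    def tick(self, active_conns):
        for conn in self.backend:
            if conn in active_conns:
                self.idle_ticks[conn] = 0
                self._move(conn, "poll")
            else:
                self.idle_ticks[conn] += 1
                if self.idle_ticks[conn] >= self.idle_limit:
                    self._move(conn, "epoll")

    def _move(self, conn, target):
        if self.backend[conn] != target:
            self.backend[conn] = target
            self.migrations += 1


# One request, then silence until the connection would time out.
hybrid = HybridSet(idle_limit=3)
hybrid.add("conn-1")
hybrid.tick({"conn-1"})
for _ in range(3):
    hybrid.tick(set())
print(hybrid.backend["conn-1"], hybrid.migrations)
```

In this run the connection sits idle in the poll set for three ticks, being
scanned each time, and then pays for one migration just to keep waiting in
epoll.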

Anyways, getting a negative response is hardly justification for posting
someone's picture and ridiculing his appearance. It's that kind of behavior
that makes people decide you're not worth dealing with regardless of how good
your code is or how smart you are.

~~~
zedshaw
You keep using this word "strawman" but the way you use it is the way
Scarecrow would use it on the Yellow Brick Road.

Otherwise, a very well reasoned argument. I'll be trying some stuff out, and if
you pitch in comments along this vein I'll gladly listen. Thank you for
disagreeing with me.

However, when someone says things to me the way blasdel did and doesn't
expect me to insult back, they're in the wrong game. You can't dish it out
and expect to not get the same back. I most definitely expect it back and
take it on the chin frequently. That's also why there are no pictures of me on
the internet in my underwear.

Keep it civil and I keep it civil. Be a jerk and I'll be a jerk, and frankly
that's how the "real world" works. You get what you give.

~~~
jfager
<http://www.logicalfallacies.info/ambiguity/straw-man/>
<http://www.nizkor.org/features/fallacies/straw-man.html>
<http://en.wikipedia.org/wiki/Straw_man>

Saying that epoll's proponents perpetuate a myth that epoll is always faster
than poll is either untrue or based on a biased selection of mistaken,
untrustworthy epoll proponents. That's a classic strawman.

blasdel was out of line, but it was just words, it was on topic, and it was
buried in a comment thread. By your own standard of "you get what you give",
your response was disproportionate.

Anyways, like I've said, I'm looking forward to seeing how your poll/epoll
hybrid works out, and to being proven wrong. I'll be glad to offer up any
thoughts as you get it working.

