
While I don't doubt that TIME's poll security team (if it existed) was more than overmatched, how could a website defend the integrity of their online poll against such an attack?

Or is running an effective online poll truly hopeless?




> defend the integrity of their online poll

I'm coming up with all sorts of similes but they all sound snarky, and I don't want to be snarky, so I'll just say it straight: there is no "integrity" in an online poll.

The results are always stunningly, catastrophically, inarguably invalid for any sort of rigorous use. The only thing that makes this particular poll more obviously flawed than the Ron Paul surges, which were in turn more obviously flawed than the garden-variety online poll, is that the latent vulnerability was exploited to an extent approaching parody.

(Note you don't have to have an adversary at all to make an online poll invalid. They're always the result of self-selection on the part of the participants anyhow.)


Actually, online voting may have some integrity if you find the voters, instead of letting the voters find you. If you had some kind of reliably random population, you could simply select X members to vote, thus ensuring that the stats are relatively bias-free (you still get bias from people who abstain).

But you're right - in any case where the voters find you, your results will be trash.
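
A rough sketch of the "you find the voters" idea, assuming you already hold a list of known members. The names `invite_voters` and `members` are made up for illustration:

```python
import random


def invite_voters(population, x, seed=None):
    """Pick x distinct members uniformly at random to invite to the poll."""
    rng = random.Random(seed)
    return rng.sample(list(population), x)


# Hypothetical member list; in practice this would come from your user table.
members = [f"user{i}" for i in range(1000)]
invited = invite_voters(members, 50, seed=1)
```

Only the invited members get a ballot. Anyone who shows up uninvited is ignored, and invited members who abstain still bias the result.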


Sure, but the whole concept of "online", as people understand it, is "clients wander around doing to servers whatever they damn-well please." If you're pulling in voters, there's no difference between doing it online and doing it, say, over-the-phone or by-mail or door-to-door—so you drop the distinguishing "online" when explaining it.


That seems like a distinction without a difference. Are you saying Facebook's old poll feature wasn't "online"?


Correct, for example HotOrNot voting. If you enter a profile URL directly, you can vote but it will not be counted. If you are sent to a profile at random (by selecting the "next random" button), then there is no self-selection bias and your vote is counted.
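
I don't know how HotOrNot actually implemented this, but one way to sketch it is to have the server remember which profile it handed to each session and count a vote only if it matches. The class and method names here are hypothetical:

```python
import random


class RandomAssignmentPoll:
    """Counts a vote only when the server itself chose the profile shown."""

    def __init__(self, profile_ids, seed=None):
        self._profiles = list(profile_ids)
        self._rng = random.Random(seed)
        self._assigned = {}  # session_id -> profile the server picked
        self.counted = []    # (profile_id, score) pairs that were counted

    def next_random(self, session_id):
        """Pick a random profile for this session and remember the choice."""
        profile = self._rng.choice(self._profiles)
        self._assigned[session_id] = profile
        return profile

    def vote(self, session_id, profile_id, score):
        """Return True if the vote was counted, False if silently dropped."""
        assigned = self._assigned.get(session_id)
        if assigned is None or assigned != profile_id:
            return False  # visitor picked the profile themselves
        del self._assigned[session_id]  # one counted vote per assignment
        self.counted.append((profile_id, score))
        return True
```

A visitor who pastes a profile URL has no matching assignment, so the vote is accepted in the UI but never reaches `counted`.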


What I meant by integrity was to have the results be more or less representative of the actual beliefs of an average site visitor.

I'm guessing some kind of statistical method for determining which votes don't fit the profile of a site's visitors combined with actively weeding out obvious instances of mass voting could make the results at least appear more accurate.

Sure, there's no actual validity or rigor to online poll results, but the point is more to have results that at least appear plausible.


But then why bother having the poll in the first place? If you're just going to prune the results so that they look like what you expect, you aren't really polling. You already know the answer, and you're going to throw out data until you get it.


Note that the chief engineer from reCAPTCHA offers a comment on the blog. He indicates that rC is intended to be "only one element" in a defense against attacks (and seemed good-spirited about the cake-in-the-face of the whole thing).

Seems to me the issue raised is about the "integrity" of online/offline "journalism" (of Time) in not acknowledging the meaninglessness of the poll results (or even the fact they were badly hacked). [Maybe that's for Newsweek to report?]


Why not only allow one vote per IP? It would be possible to spoof your IP, but still, could all of the manual 4chan voters spoof their IPs for every vote?


They could use a proxy to "spoof" their IP. But there is no known way they could use IP spoofing to vote from any arbitrary IP address: the voting app runs over HTTP, which runs over TCP, which requires a full connection, and the known spoofing attacks on TCP are blind, i.e. you can send but not receive data. So HTTP would not work over blind TCP spoofing.

I think that if one vote, or any small number of votes, were allowed per IP, the attack would have been much more difficult, as there simply are not tens of thousands of readily available proxies, unless these people have access to a big botnet.

A downside to one vote per IP is that AOL and some organizations place their outgoing web traffic behind one or a small pool of IP addresses. So these users wouldn't have been able to vote.
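
To make both the idea and the downside concrete, here is a toy one-vote-per-IP tally. This is only a sketch, not how TIME's poll worked, and the class name and addresses are invented for the example:

```python
class OneVotePerIPPoll:
    """Toy poll that accepts at most one vote per source IP address."""

    def __init__(self, options):
        self.tally = {option: 0 for option in options}
        self._seen_ips = set()

    def vote(self, ip, option):
        """Count the vote and return True, or return False if it is rejected."""
        if option not in self.tally or ip in self._seen_ips:
            return False
        self._seen_ips.add(ip)
        self.tally[option] += 1
        return True


poll = OneVotePerIPPoll(["yes", "no"])
first = poll.vote("203.0.113.7", "yes")   # first user behind a shared proxy
second = poll.vote("203.0.113.7", "no")   # different person, same outgoing IP
```

The second vote is rejected even though it came from a different person, which is exactly the shared-proxy problem.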


> A downside to one vote per IP is that AOL and some organizations place their outgoing web traffic behind one or a small pool of IP addresses. So these users wouldn't have been able to vote.

That would not have been such a big problem. But be sure to play 'dead man' and maintain the illusion that every vote counts.

E.g., here on Hacker News, after you click on the vote arrows, JavaScript manipulates the counts accordingly, but did you ever check whether your vote has had any effect on the "true" counts on the server? (Of course at Hacker News it has, because PG is not evil.)

Even more devious would be accepting the unwelcome votes, but also reversing each one of them after a random time has passed. This way the attackers get to see the illusion that their attacks succeed, but are fought back (or drowned out by counter-votes from real people) only a few hours later.
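
A minimal sketch of the delayed-reversal trick, assuming you already have some way to flag a vote as unwelcome (that part is hand-waved here as the `suspicious` argument). Times are plain seconds, and the class name and delay bounds are made up:

```python
import heapq
import random


class DelayedReversalTally:
    """Counts every vote at once, then quietly undoes flagged ones later."""

    def __init__(self, min_delay=3600, max_delay=6 * 3600, seed=None):
        self.count = 0
        self._min_delay = min_delay
        self._max_delay = max_delay
        self._rng = random.Random(seed)
        self._pending = []  # min-heap of times at which to undo a vote

    def vote(self, now, suspicious):
        """Record a vote; schedule a reversal if it was flagged."""
        self.count += 1
        if suspicious:
            delay = self._rng.uniform(self._min_delay, self._max_delay)
            heapq.heappush(self._pending, now + delay)

    def settle(self, now):
        """Undo every flagged vote whose reversal time has passed."""
        while self._pending and self._pending[0] <= now:
            heapq.heappop(self._pending)
            self.count -= 1
        return self.count
```

Call `settle` periodically with the current time. The attacker sees the count rise immediately and only later drift back down.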


Sometimes your vote does not have an effect on the "true" count on the server. For example, try voting every comment on a page down, and then reload to see the real counts. This isn't "evil" per se.


This would potentially block voters on DSL and cable modem accounts who share a small pool of IPs via dynamic reallocation. It would also mess up office networks using NAT. Both of these effects would seriously bias any poll...


But probably less than a purposeful attack.


NAT


There's no way (if there's no offline part)

At least I spent a few minutes here and there since yesterday thinking about how it could be secured. Any method I thought of was quickly demolished by a few attacks that would work against it.

But I am open to being corrected! If anyone thinks they could have solved this problem, please reply :)


It's not as though you were getting anything statistically valid out of a self-selected sample anyway.

Of course, if you were only polling existing users, you could limit voting to those users who were there before you started the poll.
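
Something like this, assuming each account has a registration timestamp you trust. The class and field names are hypothetical:

```python
from datetime import datetime


class ExistingUsersPoll:
    """One vote per account, and only for accounts older than the poll."""

    def __init__(self, options, started_at):
        self.started_at = started_at
        self.tally = {option: 0 for option in options}
        self._voted = set()

    def vote(self, user_id, registered_at, option):
        """Return True if the vote was counted, False otherwise."""
        if option not in self.tally:
            return False
        if registered_at >= self.started_at:
            return False  # account created after the poll opened
        if user_id in self._voted:
            return False  # this account has already voted
        self._voted.add(user_id)
        self.tally[option] += 1
        return True


poll = ExistingUsersPoll(["yes", "no"], started_at=datetime(2009, 4, 1))
old_user = poll.vote("alice", datetime(2008, 6, 1), "yes")
new_user = poll.vote("sock1", datetime(2009, 4, 2), "no")
```

This keeps out accounts created just to stuff the ballot, but it does nothing about self-selection among the existing users.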



