I'm coming up with all sorts of similes but they all sound snarky, and I don't want to be snarky, so I'll just say it straight: there is no "integrity" in an online poll.
The results are always stunningly, catastrophically, inarguably invalid for any sort of rigorous use. The only thing that makes this particular poll more obviously flawed than the Ron Paul surges, which were themselves more obviously flawed than the garden-variety online poll, is that the latent vulnerability was exploited to an extent approaching parody.
(Note you don't have to have an adversary at all to make an online poll invalid. They're always the result of self-selection on the part of the participants anyhow.)
Actually, online voting may have some integrity if you find the voters, instead of letting the voters find you. If you have some kind of reliably random population, you can simply select X members to vote, thus ensuring that the stats are relatively bias-free (you still get some bias from people who abstain).
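For what it's worth, here is a minimal Python sketch of the "you find the voters" idea. The names `pick_voters` and `tally` are made up for illustration, and the sketch assumes you already have a trustworthy list of members to draw from, which is the hard part.

```python
import random

def pick_voters(population, x, seed=None):
    """Return x members chosen uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(population), x)

def tally(invited, ballots):
    """Count only ballots from invited members; report abstentions separately."""
    invited_set = set(invited)
    counts = {}
    for voter, choice in ballots.items():
        if voter in invited_set:  # uninvited voters are ignored
            counts[choice] = counts.get(choice, 0) + 1
    abstained = len(invited_set) - sum(counts.values())
    return counts, abstained
```

Reporting the abstention count alongside the totals at least makes the remaining non-response bias visible.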
But you're right - in any case where the voters find you, your results will be trash.
Sure, but the whole concept of "online", as people understand it, is "clients wander around doing to servers whatever they damn-well please." If you're pulling in voters, there's no difference between doing it online and doing it, say, over-the-phone or by-mail or door-to-door—so you drop the distinguishing "online" when explaining it.
Correct; HotOrNot voting is an example. If you enter a profile URL directly, you can vote, but the vote is not counted. If you are sent to a profile at random (by selecting the "next random" button), then there is no self-selection bias and your vote is counted.
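A rough sketch of how that rule could work, in Python. This is not HotOrNot's actual code; `RandomServeTally` and its session handling are assumptions made purely to illustrate "count the vote only if the server chose the profile".

```python
import random

class RandomServeTally:
    """Counts a rating only when the server itself picked the profile shown."""

    def __init__(self, profile_ids, seed=None):
        self._profiles = list(profile_ids)
        self._rng = random.Random(seed)
        self._assigned = {}  # session id -> profile the server chose
        self.scores = {}     # profile id -> list of counted ratings

    def next_random(self, session_id):
        """Pick a random profile for this session and remember the choice."""
        profile = self._rng.choice(self._profiles)
        self._assigned[session_id] = profile
        return profile

    def vote(self, session_id, profile_id, rating):
        """Accept every vote, but count it only for a server-assigned profile."""
        counted = self._assigned.pop(session_id, None) == profile_id
        if counted:
            self.scores.setdefault(profile_id, []).append(rating)
        return counted
```

The assignment is consumed on the first vote, so replaying the same request does not count twice.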
What I meant by integrity was to have the results be more or less representative of the actual beliefs of an average site visitor.
I'm guessing some kind of statistical method for determining which votes don't fit the profile of a site's visitors combined with actively weeding out obvious instances of mass voting could make the results at least appear more accurate.
Sure there's no actual validity or rigor to online poll results, but the point is more to have results that at least appear plausible.
But then why bother having the poll in the first place? If you're just going to prune the results so that they look like what you expect, you aren't really polling. You already know the answer, and you're going to throw out data until you get it.
Note that the chief engineer from reCAPTCHA offers a comment on the blog. He indicates that rC is intended to be "only one element" in a defense against attacks (and he seemed good-spirited about the cake-in-the-face of the whole thing).
Seems to me the issue raised is about the "integrity" of online/offline "journalism" (of Time) in not acknowledging the meaninglessness of the poll results (or even the fact they were badly hacked). [Maybe that's for Newsweek to report?]
They could use a proxy to "spoof" their IP. But there is no known way
they could use IP spoofing to use any old IP address, as the voting
app runs via HTTP, which runs over TCP, which requires a full
connection, and the known spoofing attacks on TCP are blind, i.e. you
can send but not receive data. So HTTP would not work over blind TCP.
I think that if one vote, or any small number of votes, were allowed
per IP, the attack would have been much more difficult, as there
simply are not tens of thousands of readily available proxies, unless
these people have access to a big botnet.
A downside to one vote per IP is that AOL and some organizations place
their outgoing web traffic behind one or a small pool of IP addresses.
So these users wouldn't have been able to vote.
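A minimal sketch of the per-IP cap described above, in Python. `PerIpVoteCap` and its `max_votes_per_ip` parameter are hypothetical names, and the sketch assumes the server sees the real peer address of the TCP connection.

```python
from collections import Counter

class PerIpVoteCap:
    """Accepts at most max_votes_per_ip votes from any single IP address."""

    def __init__(self, max_votes_per_ip=1):
        self.max_votes_per_ip = max_votes_per_ip
        self._seen = Counter()   # ip -> votes already accepted
        self.totals = Counter()  # choice -> counted votes

    def vote(self, ip, choice):
        """Return True if the vote was counted, False if the IP hit its cap."""
        if self._seen[ip] >= self.max_votes_per_ip:
            return False
        self._seen[ip] += 1
        self.totals[choice] += 1
        return True
```

Raising `max_votes_per_ip` above 1 softens the shared-address problem, at the cost of giving each proxy that many votes as well.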
> A downside to one vote per IP is that AOL and some organizations place their outgoing web traffic behind one or a small pool of IP addresses. So these users wouldn't have been able to vote.
That would not have been such a big problem. But be sure to play 'dead man' and maintain the illusion that every vote counts.
Even more devious would be accepting the unwelcome votes, but also reversing each one of them after a random time has passed. This way the attackers get to see the illusion that their attacks succeed, but are fought back (or drowned out by counter-votes from real people) only a few hours later.
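Something like the following Python sketch would do the delayed reversal. Everything here is an assumption for illustration: the `DelayedReversalTally` name, the delay range, and especially the `suspicious` flag, since deciding which votes are unwelcome is a separate problem.

```python
import heapq
import random

class DelayedReversalTally:
    """Counts every vote at once, then quietly undoes flagged ones later."""

    def __init__(self, min_delay=3600.0, max_delay=6 * 3600.0, seed=None):
        self.totals = {}
        self._pending = []  # min-heap of (undo_time, choice)
        self._rng = random.Random(seed)
        self._min_delay = min_delay
        self._max_delay = max_delay

    def vote(self, choice, now, suspicious=False):
        """Count the vote; schedule a reversal if it was flagged suspicious."""
        self.totals[choice] = self.totals.get(choice, 0) + 1
        if suspicious:
            delay = self._rng.uniform(self._min_delay, self._max_delay)
            heapq.heappush(self._pending, (now + delay, choice))

    def settle(self, now):
        """Reverse every flagged vote whose random delay has elapsed."""
        while self._pending and self._pending[0][0] <= now:
            _, choice = heapq.heappop(self._pending)
            self.totals[choice] -= 1
```

Calling `settle` with the current time before each display of the results keeps the public totals drifting back without any visible rejection.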
This would potentially block voters on DSL and cable modem accounts who use a small pool of shared IPs via dynamic reallocation. This would also mess up office networks using NAT. Both these effects would seriously bias any poll...