
Tor Users Might Soon Have a Way to Avoid Those Annoying CAPTCHAs - walterbell
http://motherboard.vice.com/read/tor-captchas
======
mrb
Here is the gist of how it works: the user solves a one-time traditional
CAPTCHA. A browser extension creates a bunch of "tokens", cryptographically
blinds them, and asks the server to sign them. Each token can be used only
once to prove the user is not a robot: later, the user can submit to the
server an unblinded, signed token, effectively saying "hey, you previously
verified me as human, here is a signed token you gave me".

The key insight is the use of blind signatures
([https://en.wikipedia.org/wiki/Blind_signature](https://en.wikipedia.org/wiki/Blind_signature))
to provide anonymity: 2 unblinded tokens can't be determined as belonging to
the same user, because the server didn't see what it signed.
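The round trip can be sketched with textbook RSA blinding (toy key and
made-up numbers, purely for intuition; a real deployment would use a vetted,
standardized blind-signature construction):

```python
import math
import secrets

# Server's demo RSA key (real keys are 2048+ bits)
p, q = 61, 53
n = p * q                          # public modulus
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)

# Client: pick a random token and a blinding factor r coprime to n
token = secrets.randbelow(n - 2) + 2
r = secrets.randbelow(n - 2) + 2
while math.gcd(r, n) != 1:
    r = secrets.randbelow(n - 2) + 2
blinded = (token * pow(r, e, n)) % n   # the only value the server sees

# Server: signs the blinded value once the user solves a CAPTCHA
blind_sig = pow(blinded, d, n)

# Client: strip the blinding factor; sig is a valid signature on token
sig = (blind_sig * pow(r, -1, n)) % n
assert pow(sig, e, n) == token         # verifies against the public key
```

The server signs `blinded` without ever seeing `token`, so when the pair
`(token, sig)` is later redeemed it cannot be linked back to the signing
request.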

It's a neat idea, but it just moves the trade-off towards user convenience
and away from security: if a server agrees to sign 100 tokens per CAPTCHA,
then solving 1 CAPTCHA now allows a robot to do 100x the work it did
before.

~~~
belorn
> then solving 1 CAPTCHA now allows a robot to do 100x the work it did before

The point of a CAPTCHA is to validate that the person browsing a website is a
human being. Once they have done that, the person can click on 1 page, 100x
pages, or 1000x pages on the site, and from a security perspective that is the
same result. The goal was to identify the nature of the entity behind the
request, not limit how many times the person accessed the site.

If a server agrees to sign 100 tokens per CAPTCHA, it means that the server
assumes those 100 tokens represent the same human person. It does not allow a
robot to do 100x more work than before unless the captcha is so faulty that
the assumption is incorrect, in which case the captcha is broken and should be
replaced. A captcha that only works 90% of the time is not actually preventing
bots, since bot owners can easily just add 10x more bots.

~~~
mrb
Let me re-explain the threat model as I see it.

Consuming one token bypasses the need to solve one CAPTCHA. Now you have heard
about people being hired to solve CAPTCHAs, right? Well with these tokens,
every time a human solves one CAPTCHA and obtains 100 tokens, it's as if he
helped a robot bypass 100 CAPTCHAs.

Another example: say a CAPTCHA system is so good that only 0.1% of its
challenges can be solved by a robot. With these tokens, if 100 are signed per
solved CAPTCHA, then it's as if the robot could solve 10% of them. A 100x
improvement.
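The arithmetic of that example, using the numbers from the comment:

```python
# Worked version of the 0.1% example above.
challenges = 10_000
solve_rate = 0.001        # the bot solves 0.1% of challenges on its own
tokens_per_solve = 100    # server signs 100 tokens per solved CAPTCHA

solved = round(challenges * solve_rate)   # 10 CAPTCHAs actually solved
bypasses = solved * tokens_per_solve      # 1,000 requests get through
effective_rate = bypasses / challenges
print(effective_rate)  # 0.1 -> as if the bot could solve 10% of CAPTCHAs
```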

------
mikegerwitz
It uses a blind signature protocol allowing the client to generate bypass
tokens without future correlation. That's good.

Unfortunately, because it requires that the user use a plugin, this creates
two groups of Tor users: those that are using this protocol and those that
aren't. This is more information that can be used---with other information---
to aid in de-anonymizing users. (To be clear: using ephemeral JavaScript, as
they mentioned, is not a credible option, so they have chosen the better route
here.)

CloudFlare stores cookies today, yes, but they can be ephemeral with good
client cookie policies. A browser plugin usually persists sessions---even if
the tokens don't, the fact that it is _installed_ does.

I understand that this is the case for other plugins as well.

In any case, CloudFlare criticism aside: I'm glad that CloudFlare is listening
to the Tor community, and has come up with a protocol that does its best to
respect users' privacy.

~~~
Taek
The Tor Browser Bundle is pretty persistent about updates. If you're a version
behind, it lets you know frequently, with flashy annoying notices. Being a
version behind often has security implications, and users hiding behind Tor
are often very dependent on being secure, so it makes sense.

That also means that if TBB were to be extended with a new plugin, it would
get to every user very quickly. Especially if they did some sort of time delay
(probably overkill), where the browser updated 2 weeks before the update
actually kicked in. Then everyone who has upgraded in the past 2 weeks
instantly gets group-anonymity, and everyone who hasn't upgraded has only
themselves to blame because the browser gives you a nice flashy warning
immediately when you open it up.

I hope that the code is audited for back doors by multiple independent
parties, but other than that I think this is fantastic.

~~~
mikegerwitz
> The Tor Browser Bundle is pretty persistent about updates.

That's assuming that the Tor Browser Bundle (and Tails) will include it. I'm
curious what they will decide.

------
jakobegger
Isn't the bigger issue that Cloudflare's default 'protection' setting is too
eager?

A few years back I used Cloudflare on my company website in an attempt to
improve speed. When I browsed from public WiFi, Cloudflare showed challenges
for static pages, which seemed pretty pointless. What kind of attack is
Cloudflare preventing when it blocks people from accessing static pages?

~~~
tux3
Since the CAPTCHAs trigger based on IP reputation, I assume they're trying to
prevent things like automated forum spam, by default for everyone. I don't
know if there's a setting to avoid challenges from shady IPs unless the site
is actually under attack; that sounds like it could apply to a lot of
websites.

~~~
mSparks
While it's obviously useful for the poor souls who live in backwards
dictatorships/monarchies like North Korea and the UK, I don't personally
think Tor exit nodes are that great an idea.

Tor traffic should stay on the Tor network.

~~~
roywiggins
I was under the impression allowing people to access the internet at large is
explicitly part of Tor's design. If you want something more inward-facing, I2P
seems to lean in that direction. I2P's comparison with Tor suggests this too:

[https://geti2p.net/en/comparison/tor](https://geti2p.net/en/comparison/tor)

Tor: "Designed and optimized for exit traffic, with a large number of exit
nodes"

I2P: "Designed and optimized for hidden services, which are much faster than
in Tor"

~~~
mSparks
Yes, but my point was that I think it's a "bad" design.

Tor is not part of the internet. If servers want to provide "anonymous"
access, they can provide a Tor address. But if they don't, then accessing a
standard webserver via Tor is very little different from any of the other
means of illegitimately accessing computers.

Web access via Tor seems like a lot of effort to provide what amounts to
little more than "hacking" services against normal webservers.

I would much rather see for example:
[https://thepiratebay.se.onion](https://thepiratebay.se.onion)

where Tor contacts the DNS for thepiratebay.se to get the underlying onion
address to use.

Much better all round than either going through an exit node (most of which
are malicious) to thepiratebay.se, or trying to remember uj3wazyk5u4hnvtk.onion
or whatever it has changed to now.

------
comex
From the spec:

    
    
    The scheme requires the server to detect nonce reuse with reasonable
    reliability. However, there might be no need for a zero false positive rate,
    because if an attacker needs to make 10,000 requests to have one succeed,
    that's possibly an acceptable trade-off.

    Therefore, the server could use data structures such as Bloom filters or
    cuckoo filters to store tokens that it has witnessed. The parameters of
    these structures can be chosen to ensure a false-positive probability of any
    given amount. Cuckoo filters may be more efficient but Bloom filters may be
    easier to construct.
    

I don't think this makes sense. "False positive" for a Bloom filter means it
thinks an item was previously inserted when it wasn't really. If the filter
represents a set of used nonces, the result of seeing an item as previously
inserted would have to be blocking the request as a duplicate: therefore a
false positive would cause a fraction of legitimate requests to be blocked,
not malicious requests to be allowed as the first paragraph seems to imply.

This result wouldn't necessarily be unacceptable either, especially if there
was some mechanism for the browser to automatically retry an HTTP request with
a new token if it received a "reused token" error. However, this behavior
would have to be specified, and it's somewhat tricky: it's also possible for
tokens to be (actually) reused by accident, e.g. if the user restores their
system from a backup or a VM snapshot. In that case, it would make more sense
for the browser to respond to a duplicate token error by throwing away all its
tokens, since it would have no way to know which of them were clean.

Then again, if the Bloom filter is big enough that the probability of a false
positive is very low, even spuriously forcing the user to complete another
CAPTCHA may not be the end of the world.
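For intuition, here is a minimal sketch (sizes and hash choice made up) of
the spent-token set comex describes; note the filter can only err by flagging
a fresh token as already spent, i.e. by blocking a legitimate request:

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter over byte-string tokens (illustrative parameters)."""
    def __init__(self, size_bits=1 << 20, num_hashes=7):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k bit positions from SHA-256 of (index || item)
        for i in range(self.k):
            h = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def redeem(spent, token):
    """Accept a token unless the filter says (possibly wrongly) it was spent."""
    if token in spent:
        return False  # duplicate -- or a false positive blocking a real user
    spent.add(token)
    return True
```

Redeeming the same token twice fails the second time; a false positive makes
`redeem` fail for a token that was never actually spent, which is exactly the
legitimate-request-blocked case described above.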

------
alz
So, most captchas don't seem to pose much of a barrier to bots that need to
get round them.

A couple of months ago there was an article on the state of web scraping in
2016: [https://goo.gl/eUtkRA](https://goo.gl/eUtkRA). In it, the author easily
identified and integrated one of many captcha solvers.

Worst case scenario, there is also crowdsourced mechanical-turk-style captcha
solving as a service: e.g. [https://anti-captcha.com](https://anti-captcha.com).

I guess this raises the question as to whether captchas pose more of a barrier
to users than bots, and whether they should be used at all?

~~~
mike_hearn
The big networks don't use CAPTCHAs in the original sense they were meant to
be used anymore. They all moved on to phone verification many years ago. I was
a part of some of the discussions in Google on this issue - should we keep
using CAPTCHAs at all given that they'd been basically replaced by better
systems?

The answer was yes. CAPTCHAs are still present because they act as a throttle.
The point of a strong CAPTCHA is to limit the amount of abuse that can get
through if the other mechanisms break down, by exploiting the fact that humans
are kind of slow. Even though OCR can handle most CAPTCHAs these days, it's
still not 100% effective, so by ramping up the number of CAPTCHAs you ask
users to solve you can still put a throttle on activity. In this way it acts
as a last line of defence.

That's why I'm not sure this is going to work out. CAPTCHAs are not a way to
distinguish good users from bad, which is how CloudFlare is trying to use them
here. CAPTCHAs are a way to slow down and throttle traffic that might be
auto-generated when you _can't_ tell if it's good or bad. Building a new way
to show you solved a CAPTCHA previously doesn't help if the reason you're
being shown CAPTCHAs is specifically to slow you down regardless of whether
you're good or bad.

------
slester
Couldn't CloudFlare track a Tor user by recording the tokens it issued to a
particular user, then recognizing them when that user's client later redeemed
one of those tokens?

~~~
Ar-Curunir
I think that's what the "blind" portion of token authentication is for.

~~~
mirimir
How would users know that tokens had actually been signed blindly?

~~~
yorwba
If the tokens are never sent, only their blinded versions, it is pretty much
guaranteed that the signature you get back was made without looking at the
actual token.

~~~
mirimir
I get that. What I wonder is who would nontechnical users need to trust about
that? CloudFlare? The Tor Project?

~~~
Ar-Curunir
I'm not sure, but it can be done with just CloudFlare changes; if the plugin
is open source it should be fine. Maybe if Tor Browser integrates the plugin
it should be fine too.

~~~
mirimir
Optimal would be only needing to trust the Tor Project.

------
jimktrains2
I had a similar idea for transit cards a while ago.
[http://jimkeener.com/posts/public-transit-and-ring-signatures](http://jimkeener.com/posts/public-transit-and-ring-signatures)
Right now, you can probably narrow down someone's home and work locations
pretty well.

------
meowface
I see how this would still be enough to stop DDoS attacks or other high-volume
automated activity like web spidering, but now spammers can save way more on
captcha farms by solving just 1 captcha per N spambot submissions.

------
westdakota
Cloudflare has shitty tech that doesn't work, now everybody needs to play
along. Booooooo.

------
hellofunk
This affects other VPNs as well, not just Tor.

