Work on a client to try to mitigate the risk of timing attacks:
The University simply looked through their logs to see who was connecting to known Tor nodes, narrowed it down by time, and found the kid.
These discussions are definitely post-mortem analysis but there is also discussion on another OpSec presentation:
Protecting livestock so they can be fleeced by their proper owner, on the other hand, doesn't imbue the same sort of charisma upon the good shepherd.
I see your point, don't worry, it would just be a lot more rough than you're implying, particularly to upload many heavy PDFs. (I kind of want to lab it now that we've discussed it.)
Turns out Cloudflare deems 5 seconds of latency too much. I thought most default timeouts were something like 30 seconds, and when writing applications myself I usually limit them to 8 or 10 seconds (to be able to get back to the user quickly enough with an "unable to connect" error, but also to give it a moment). I expected that 5 seconds of latency would be slow, but not unbearable. Instead it breaks things completely.
From my testing, 3.5 seconds latency works fine. Slow, but it consistently works.
Adding 5 seconds of latency just breaks TLS connections to Cloudflare, though DNS, TCP, and HTTP still work. I was able to retrieve a webpage (via netcat) from my site, the redirect to HTTPS from http://news.ycombinator.com worked, and pinging showed a consistent 5030ms +/- 10ms latency.
So VPN-Tor-VPN might still work, though it is indeed more on the edge than I expected. Thanks for making me venture here, I learned something!
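For anyone who wants to lab this themselves: a minimal sketch (in Python, my assumption; the comment doesn't name a language) of the connect-with-timeout check described above, where a deliberately short timeout gets an "unable to connect" answer back to the user instead of hanging for the OS default.

```python
import socket

def check_reachable(host: str, port: int, timeout: float = 8.0) -> bool:
    """Return True if a TCP connection succeeds within `timeout` seconds.

    An 8-10 second timeout lets us report "unable to connect" quickly,
    instead of waiting for the much longer OS default (often 30+ seconds).
    With 5 seconds of added path latency, anything stricter than that
    starts failing, which matches the behavior described above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

To simulate the added latency itself you'd use something like `tc qdisc ... netem delay 5000ms` on Linux; the helper above only shows the client-side timeout policy.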
Relatively, sure. But I've found it very usable in recent times actually. Used it almost full-time (besides a normal Firefox instance for the company's intranet) to get around some silly firewall that wouldn't let me download "hack tools" (I was an intern in the cyber security department, security tools were part of my job). There were times where I didn't notice at all that I was using Tor, and most of the time it was comparable to mediocre wifi.
And as for monitoring, I guess it might be possible, but if someone thinks to use bridge nodes that's also defeated.
I discovered he moved to GitHub Pages after years, only to realize he hasn't published much since I first heard of him.
Maybe they would have found other evidence. But it wouldn't have been just him connecting to Tor.
I could make a website that adds random(1, 64) one-pixel images to each page. As you browse the site, you'll be broadcasting 6 bits of identifier with every click.
(You imply this in your point, but given the specificity of the accusation, I think it's worth clearly pointing out.)
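A toy sketch of the scheme described above (function names are mine, purely illustrative): the number of one-pixel images served per page encodes a 6-bit identifier, which anyone counting image requests can read back.

```python
import random

def images_for_visitor(visitor_id: int) -> int:
    """Encode a 6-bit identifier (0-63) as a count of 1x1 tracking images.

    Each image looks innocuous on its own, but an observer who can count
    the image requests per page load recovers the identifier on every click.
    """
    assert 0 <= visitor_id < 64
    return visitor_id + 1  # serve between 1 and 64 one-pixel images

def decode_from_count(image_count: int) -> int:
    """What the observer computes from the counted requests."""
    return image_count - 1

# Round trip: the identifier survives the "broadcast".
vid = random.randrange(64)
assert decode_from_count(images_for_visitor(vid)) == vid
```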
Very few small-ish entities have such a large reach and can interject themselves into so many connections on the web.
So, say I use Tor to make political comments on a foreign website; one that I have reasonable trust is outside the reach of my government. But, say that website uses CloudFlare and CloudFlare has servers that are within reach of my government. That's the difference. It is a difference of degree, rather than kind, but a difference nonetheless.
Powerful actors have always had some ability to compromise Tor by compromising the requesting side (the ISP of the target of an investigation, for example), and then the receiving side (the website where your suspect does the thing you're investigating them for...possibly a honey pot setup specifically to catch people who do this thing, or possibly a website whose owner has already been arrested, prosecuted and made a deal that allowed access to the systems). CloudFlare just adds an additional element of uncertainty for Tor users: Will this CAPTCHA take place in a way and place that allows someone to narrow down my identity?
As with a lot of the security concerns about Tor, one has to take it as weights on a scale. Who are my attackers, and what level of attack can they bring against my traffic? If your privacy concerns don't include state level actors, then this is probably a theoretical attack. If your attackers do include state level actors, then it is a concern to be aware of. State level actors have other means of compromising your identity and traffic, of course, but this is one of them, and if I understand it correctly, it is a valid concern.
Encryption does not inherently obscure the size of plaintext. Protocols may choose to pad plaintext for various reasons, and both Tor (since Tor always sends fixed-width cells) and TLS (when it uses a block cipher mode) do so. However, the amount of padding is typically small and can hardly be said to "obscure" the size of a request - it is not a defense against traffic analysis.
Whether you send 1 px of data or 500 px, Tor always sends fixed-width cells. There is no question of padding here.
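A simplified sketch of what fixed-width cells buy you (cell size and framing simplified; real Tor cells carry header fields as well): per-cell size leaks nothing, but total cell count still does.

```python
CELL_SIZE = 512  # simplified; Tor relay cells are fixed-size on the wire

def to_cells(payload: bytes, cell_size: int = CELL_SIZE) -> list[bytes]:
    """Split a payload into fixed-size cells, zero-padding the last one.

    Every cell is the same length, so a 1-byte request and a 400-byte
    request look identical per-cell -- but a 10 kB response still emits
    ~20 cells, so traffic *volume* and timing remain observable.
    """
    cells = []
    for i in range(0, max(len(payload), 1), cell_size):
        chunk = payload[i:i + cell_size]
        cells.append(chunk.ljust(cell_size, b"\x00"))
    return cells

# Tiny and mid-sized payloads are indistinguishable by cell size:
assert len(to_cells(b"x")[0]) == len(to_cells(b"y" * 400)[0]) == 512
```

This is why fixed cells alone don't defeat the correlation attack discussed in this thread: the burst pattern survives the padding.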
The claim is that an adversary who can measure traffic on Cloudflare's side (i.e. you) and at the user's ISP (i.e. your hypothetical friend Mallory) can collude by measuring and comparing the bursts of packets generated during puzzle solving on the ISP side with the receipt of said packets on CF's side.
This information is enough to figure out that Alice wanted to reach example.com via Tor.
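The core of such a confirmation attack is just timestamp matching. A minimal sketch (my own illustration, not from the article): score how well the bursts seen at the ISP line up with bursts seen at the server.

```python
def correlate(isp_bursts: list[float], server_bursts: list[float],
              tolerance: float = 0.5) -> float:
    """Fraction of ISP-side bursts that line up (within `tolerance`
    seconds) with bursts observed on the server side.

    A score near 1.0 for one particular user, sustained across several
    puzzle-solving events, is strong evidence that this user's Tor
    circuit terminates at that server.
    """
    matches = sum(
        1 for t in isp_bursts
        if any(abs(t - s) < tolerance for s in server_bursts)
    )
    return matches / len(isp_bursts)
```

Real attacks are more sophisticated (they account for network jitter and use volume as well as timing), but this is the shape of it.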
So I disagree that there's detail here. We'd need real technical detail to be able to take action.
If this were a paper or a PoC, then it would be different.
If there's a way to do that then please report it to us.
Tor users on Google Fiber, take note.
This is why I got quite scared when I first heard of Google Fiber.
It's in Google's interest to provide good, fast and cheap service: they will gain more customers and more people will be able to use more Internet services (many of which are from Google or use Google -- adwords, analytics, etc.). Thus they provide speeds for prices that are very hard to compete with for normal ISPs, since normal ISPs don't have the luxury of being the world's most popular, well, so many things (search engine, mapping service, email service, ad service, etc.).
If one company knows everything about you and controls a big enough stake in your life, that sounds very scary to me. Not because Google is bad, but because it's one company able to control many basic services.
I'm almost positive these claims are completely false, for example:
> Cloudflare can conveniently serve few more images to specific users
> Each click on one of the images in the puzzle generates a total of about 50 packets between Tor user's computer and the Cloudflare's server (about half are requests and half are real-time responses from the server.)
> The packet group has predictable sizes and patterns, so all the adversary has to do is note the easily detectable signature of the "image click" event, and correlate it with the same on the Cloudflare side.
There is no API documentation in the reCAPTCHA widget about your server having to handle real-time requests from users solving the widget or serve images, so there is no Cloudflare side. It wouldn't make sense from an API perspective; why would I have to add a bunch of code to my server to handle this stuff? Google runs that. Look here:
Do you see a "handle real time image click events" API here for Cloudflare to deploy? You do not. Google would have to build backends for their machine learning and fraud detection algorithms in every language an API user would ever run, and then they also lose obscurity by shipping them. The image click events almost certainly go only to Google, never Cloudflare, so I think whoever sent this tip didn't understand what they were looking at in Wireshark.
The possible threat vector here is Google, not Cloudflare. Cloudflare just happens to have deployed Google's reCAPTCHA widely. The article is misleading and incredibly light on important detail; how about even a screenshot of a packet capture showing traffic to Cloudflare? If you want my honest take, I read this as a Tor user annoyed they have to solve reCAPTCHAs on Cloudflare sites (the "insistence" and quoted "protects" bits are the clue) and looking for something to hit them with, and a lack of diligence on Cryptome's part before posting it.
https://www.gstatic.com/recaptcha/api2/r20160712125018/recap... is the current version of the widget if anybody is curious, but I haven't looked closely.
Actually, watching the entrance and exit nodes in this fashion is probably more expensive than simply hosting your own entrance and exit nodes. It would be within the NSA's power to, say, host or monitor 500 of the 1000ish exit nodes by now, collecting 50% of the exit traffic at almost no real cost. Entrance traffic is harder as the network is larger, but if you hosted (or, again, captured the traffic to) another 2k non-exit relays you might be able to capture 10-20% of the entrance traffic. The basic points I'm making here are: (1) that there are way fewer relay nodes to monitor than there are ISPs, if you would prefer surveillance; and (2) you are not restricted to surveillance or even to your own nation--there's literally nothing stopping the NSA from purchasing VPSes in the Netherlands and Germany and Sweden and running Tor on them, and it'll seem like a very geographically diverse set when you're looking at it with Vidalia.
Combined, the NSA can maybe deanonymize about 5-10% of the Tor traffic to the Internet right now with a much cheaper method, and this is where it gets interesting: the Tor default is to have 3 hops, which means that in addition to correlating traffic patterns you get to correlate on the IP address of the hop in the middle, even if that hop is not colluding with you. So even in the face of network jitter you have a 32-bit identifier which links together packets above and beyond simple network traffic into or out of Tor. And you only need to operate a few thousand computers to do it -- far fewer than you'd need to monitor the US ISPs in general.
You can also try to watch specific popular exits like Cloudflare, but doing this removes this awesome IP address that you get for the middle hop, and you still need either a relay node or else to be tapping a given user's IP, to try to deanonymize them.
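The back-of-the-envelope math above can be written out explicitly (these are the comment's own assumed figures, not measured ones): the adversary wins when it observes both ends of a circuit, which happens roughly p_entry * p_exit of the time.

```python
# Assumed figures from the comment above: 500 of ~1000 exit relays
# controlled, and 10-20% of entry (guard) traffic observed.
p_exit = 500 / 1000  # fraction of exit traffic seen

def both_ends(p_entry: float, p_exit: float = p_exit) -> float:
    """Probability of seeing both ends of a circuit, assuming the
    entry and exit relay choices are independent."""
    return p_entry * p_exit

for p_entry in (0.10, 0.20):
    print(f"entry {p_entry:.0%} observed -> both ends {both_ends(p_entry):.0%}")
```

That product is where the "maybe 5-10% of Tor traffic" estimate comes from.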
While Tor may be useful for evading firewalls, my general perception of the project has changed from a general anonymity tool to a tool tailored for very specific uses.
Granted, this is probably what my understanding always should have been.
edit: The major challenge would probably be continually violating various hosting companies' TOS/AUPs and getting service shut off, which would be a continual churn of provisioning new physical servers, shipping them to locations, arranging for plausibly deniable billing, etc.
The difficulties wouldn't be as great as they seem. They probably wouldn't even be shut down that often. Just a small number of high-bandwidth nodes from front companies would net them a lot of intel. They could also partner with Five Eyes and European agencies, as they all seem to want to de-anonymize Tor users. Each could have fronts doing it with their own operational techniques to muddy the situation. Again, they probably already do this in a small way.
We haven't even discussed QUANTUM-ing the Tor servers. They really, really need memory-safe machines & implementations from the CPU up if they're expecting to withstand high-strength attackers. I haven't looked at the code or supported OSes in a while, but I'm guessing the default implementation doesn't fit that bill. ;)
As far as I'm aware, control of entry & exit is required for these sorts of attacks.
Running the entry node with a consistent entry point and using it as a random-walk crawler with a real browser would seem to be enough for personal use, as long as you aren't a criminal worth an active, serious investigation targeted at revealing you.
The only "new" thing here was the rough traffic pattern analysis of CF captcha page.
If it's this easy for a side effect of a reCAPTCHA image to de-anonymize a Tor user, then this seems like a failing of the Tor protocol that they should fix. Maybe they need to introduce more jitter, repackage requests into a single stream with consistent (or randomized) packet sizes, or pad the packets with random data.
Still, if you browse the web via Tor you will see reCAPTCHAs everywhere because of Cloudflare. And that IS Cloudflare's fault, and it's really annoying, which is probably why this article is phrased this way.
I mean, using a VPN is slower than not using one, but tons of people use them.
Having to wait a few seconds for a web page to load seems like a small price to pay to avoid government agents banging on your door because you're looking at "subversive content".
Not always. High-latency mix networks with fixed message size and randomized transmission are very robust against traffic analysis.
The elephant in the room is, as usual, PEBKAC. The demand by users for low latency will always be the killer for anonymity networks.
In this case, it would have helped a bit, since an attacker would not have seen the characteristic staccato of the reCAPTCHA exchange. They would have seen a few kB in either direction, in 40-100 packets, over a period of a few seconds. If the implementation is clever, one end would even have a different signature than the other.
At least this is something I would have included in Tor. Now that I think about it, randomly introduced delays (from the outside) might actually be a technique to deanonymize users....
(You'd generate packet sizes and minimum transmission times from a known seed. First packet is 501 B; 24 ms later, a packet of 2048 B; then 15 ms later, one of 1718 B; and so on. If there is not enough data after a grace period, pad with junk. If you constantly need more time to send packets than allowed, or need to pad, then adjust the model. Also choose the model to match regular traffic if possible. Disclaimer: I'm just making this up on the spot and am no expert, but it seems plausible and obvious to me.)
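The seeded-schedule idea above can be sketched in a few lines (my own illustration of the commenter's improvised scheme, not an existing Tor mechanism): both endpoints derive the same (delay, size) schedule from a shared seed, fit real data into the slots, and pad any shortfall with junk so the wire pattern is independent of the content.

```python
import random

def traffic_schedule(seed: int, n: int) -> list[tuple[int, int]]:
    """Derive a deterministic schedule of (delay_ms, packet_bytes) pairs
    from a shared seed.

    Both endpoints regenerate the identical schedule, send each packet at
    its slot, and pad with junk bytes when there isn't enough real data,
    so an observer sees the same pattern regardless of what's inside.
    The size/delay ranges here are arbitrary placeholders.
    """
    rng = random.Random(seed)
    return [(rng.randint(5, 50), rng.randint(256, 2048)) for _ in range(n)]

# Both sides derive the identical cover-traffic pattern:
assert traffic_schedule(42, 8) == traffic_schedule(42, 8)
```

Note this trades bandwidth and latency for uniformity, which is exactly the low-latency tension mentioned elsewhere in the thread.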
It's really sad. So much brain power and this is what they come up with.
Apologies for the rant, couldn't help it. ReCAPTCHA is one of very few things I genuinely hate.
Have any examples?
Obviously it would be a very bandwidth hungry network, though if exit node bandwidth is currently the limiting factor (is it?) then maybe not entirely impractical.
I stand by Cloudflare. So much malicious traffic comes through Tor that administrators need to do a lot to protect themselves from it.
Sure, this could be attacked - but not at scale, and that's the whole point of the captcha anyway, right?
CloudFlare does have a JS-only challenge, which presumably does this type of thing, but this has a couple different problems. From a security perspective, you're executing arbitrary software, which is unwise, especially if you're looking for anonymity. The other issue is that the software is also proprietary.
Related: I have started trying to get into contact with webmasters of sites that enable JS Challenges; my template is at the bottom of this page; it'd be great if others could do the same:
Considering that they allow it without JS enabled, I wonder why they'd need any requests at all.
They have a non-JS version, too.
OK, so you could differentiate the work by User-Agent, but then one could spoof the user agent to get less work. Going this route is not going to be simple.
It's fine for my purposes, because I have no dedicated adversaries to combat, just occasional small-time spammers. But CloudFlare has a tougher time.
Why does reCAPTCHA have a distinct signature, and if it does, couldn't an attacker just make a distinct signature without reCAPTCHA?
And why does reCAPTCHA have a traffic signature that can distinguish between users?
I mean, how does a simple request/response create distinct traffic?
Or, my favorite, the binary search (assuming you control the server / the network in front of the server / some exit nodes, and can monitor the traffic of your targeted user): have sites that cause transmissions for some time (long-running JS / requests, or just a lot of content the user interacts with). Freeze 50% of the server's connections. Is the user still connecting? Then s/he is in the unfrozen 50%; if not, in the frozen half. Repeat until the user is matched to activity on the server.
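The binary search above takes only about log2(n) rounds to isolate one user among n connections. A toy sketch (names and the `still_connected` oracle are mine, modeling the attacker's observation of whether the target's traffic continues):

```python
def binary_search_user(candidates: list[str], still_connected) -> str:
    """Locate one target among `candidates` by repeatedly freezing half
    of the server's connections.

    `still_connected(frozen)` models observing the target's traffic:
    it returns True if the target's connection was NOT in the frozen set
    (i.e. the target's activity continued). Needs ~log2(n) rounds.
    """
    pool = list(candidates)
    while len(pool) > 1:
        frozen = set(pool[: len(pool) // 2])  # freeze half the connections
        if still_connected(frozen):
            pool = [c for c in pool if c not in frozen]  # target unfrozen
        else:
            pool = list(frozen)                          # target was frozen
    return pool[0]
```

With a few thousand concurrent connections, a dozen or so freeze rounds suffice, which is what makes this attack so cheap for whoever controls the server side.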
I'm not too sure what defines the "distinctive" signature of reCAPTCHA, but I expect it's the timing of when packets of certain sizes were sent (e.g. loading 9 images in quick succession generates larger packets, and then you often get the incremental couple of images with human-length delays between them).
Well, I'm not sure I'd go that far.
We do not need any more evidence; there is enough out there about gag orders, secret courts, and the worldwide compromise of network security.
US tech company employees and founders, read this: please move out of the country, build your companies in other places, and do it now. There is no time to waste. You cannot repair a system that corrupt bureaucrats have irreversibly destroyed.
It will take one or two generations to rebuild a freedom-oriented democracy in some other place. Currently Europe still seems like a good starting point, especially now that the main USA influence channel, GB, is out.
Please give up the false hope and act now. Get out of that failed state! Freedom cannot be rebuilt in a fascist system without help from the outside - you can help much better from the outside!
People who still stay in the USA will be seen as collaborators by history. The window of opportunity is closing; hurry up and get out ASAP. Help to defend freedom in other places!