I think it doesn't exactly reflect how paths are chosen based on relay bandwidth scores, if we compare to the actual path selection algorithm: http://tor.stackexchange.com/a/114
I might be missing something, but it seems that relay same-family and same-/16-subnet exclusions are ignored. This might bias the visualization to increase the apparent traffic between popular nodes, while in reality, the traffic should be slightly more evened out with less popular nodes. Hard to tell if this effect makes any visible difference without analyzing the data, though. Either way, because of the way the code is structured, it shouldn't be too hard to fix: just simulate full paths instead of single connections between nodes.
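A minimal sketch of what "simulate full paths" could look like, assuming each relay carries a bandwidth score, an IPv4 address, and an optional family label. The `Relay` class, the `family` string, and all function names here are hypothetical and not taken from the visualization's code. Real Tor also applies Guard/Exit flags and position-specific weights, and families are declared between relays rather than by a shared label, all of which this ignores.

```python
import ipaddress
import random
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relay:
    name: str
    ip: str                       # IPv4 address, assumed
    bandwidth: int                # bandwidth score used as the selection weight
    family: Optional[str] = None  # hypothetical shared family label
    is_exit: bool = False


def same_slash16(a, b):
    """True if both relays sit in the same /16 subnet."""
    net = ipaddress.ip_network(f"{a.ip}/16", strict=False)
    return ipaddress.ip_address(b.ip) in net


def conflicts(a, b):
    """True if a and b must not appear in the same path."""
    if a.name == b.name:
        return True
    if a.family is not None and a.family == b.family:
        return True
    return same_slash16(a, b)


def pick_weighted(candidates, rng):
    """Pick one relay with probability proportional to its bandwidth."""
    if not candidates:
        raise ValueError("no eligible relay left for this position")
    weights = [r.bandwidth for r in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def pick_path(relays, rng):
    """Pick exit first, then guard, then middle, excluding conflicts."""
    exit_ = pick_weighted([r for r in relays if r.is_exit], rng)
    guard = pick_weighted([r for r in relays if not conflicts(r, exit_)], rng)
    middle = pick_weighted(
        [r for r in relays
         if not conflicts(r, exit_) and not conflicts(r, guard)],
        rng,
    )
    return guard, middle, exit_


def link_counts(relays, n_paths, seed=0):
    """Count how often each relay-to-relay hop appears over n_paths paths."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_paths):
        guard, middle, exit_ = pick_path(relays, rng)
        counts[(guard.name, middle.name)] += 1
        counts[(middle.name, exit_.name)] += 1
    return counts
```

Feeding the per-hop counts from `link_counts` into the renderer, instead of sampling single node-to-node connections independently, would let the exclusions shift some weight away from pairs of popular relays that share a family or a /16.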
I think they could improve their ip2address tool, because I would expect a bright spot at the Hetzner datacenter and it's not there (while whois for the IPs of my servers there returns the proper location of the datacenter).
Nevertheless, awesome visualization.
There are fewer than 50 relays in Australia and none of them come close to IPredator. All AU relays combined have a middle probability of 0.0246% and an exit probability of 0.0060%.
IPredator alone has an exit probability of 2.1961%.
(You can see these stats with Tor Compass: https://compass.torproject.org/)
OVH and Free Telecom probably host a huge amount of the Tor traffic in FR. You can easily do 300 Mbps 24x7 on a sub-$15/month dedicated server.
OVH also has subsidiaries in other EU countries that will geolocate back to those countries (hosted in FR physically).
ovh - https://www.kimsufi.com/us/en/index.xml
Although my uptime isn't exactly spotless, I think I'm getting just what I paid for, and I'd gladly recommend them (if and when they start accepting new customers again).
Here's a link to bookmark: https://console.online.net/en/order/server_limited
Occasionally there will be deals like 2xSSD HWraid 8-16 core 32GB/64GB ram for <$50/m there (if it's blank there are no promos).
And here's the code that does the traffic estimation based on node bandwidth scores (edit: hmm, seems that code might be slightly inaccurate, added a top level comment to point this out...):
Server providers register as autonomous systems, and purchase IP space in large blocks. They often have servers at multiple data centers, with VLAN routing configured to switch packets at the ingress IP to whichever server that IP is assigned to. When a client rents a server from a provider, the provider assigns the client some number of IP addresses from its available pool. Many times, the provider does not actually SWIP (officially delegate via ARIN) these IP addresses to the client, so the registration with ARIN will not reflect the owner of the server an IP currently points to.
tl;dr When a packet goes to an IP belonging to an AS registered with a certain geolocation, the AS can switch that packet to wherever it wants.
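To illustrate why a lookup keyed on the registered block can point to the wrong place, here is a small sketch. The prefixes are reserved documentation ranges, and the `REGISTERED` table, its locations, and the `registered_location` helper are all made up for the example; no real geolocation database is implied to work exactly this way.

```python
import ipaddress

# Hypothetical registry: prefix -> location on record for the owning AS.
REGISTERED = {
    ipaddress.ip_network("203.0.113.0/24"): "Location A (AS headquarters)",
    ipaddress.ip_network("198.51.100.0/24"): "Location B (AS headquarters)",
}


def registered_location(ip):
    """Return the location on record for the longest matching prefix."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in REGISTERED if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return REGISTERED[best]


# Every address in the block maps to the same registered location,
# regardless of which data center the AS actually routes it to.
print(registered_location("203.0.113.7"))
print(registered_location("203.0.113.200"))
```

Both lookups return the same registered location, even if the provider has assigned the two addresses to servers in different data centers.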
Unnamed Rd, Charleston, SC 29492, USA
VPNs are exactly the same: traffic is encrypted to your provider and cleartext from there. The difference is that the VPN provider knows exactly who you are, because it sees the IP you're connecting from in addition to the content. A Tor exit node sees only the content and does not know the source.
No matter how you're connecting, you need to make sure you are running encrypted protocols (SSH, HTTPS, ...) to protect against whoever relays your traffic. Tor, VPNs, etc. do not change this.
Edit: Another potential concern that I hadn't thought of is traffic correlation attacks, which may be what pavki had in mind, but that's entirely speculation on my part since they haven't clarified their original comment.