
The Pirate Bay Runs on 21 “Raid-Proof” Virtual Machines - tjaerv
http://torrentfreak.com/the-pirate-bay-runs-on-21-raid-proof-virtual-machines-140921/
======
fooyc
" This saved costs, guaranteed better uptime, and made the site more portable
and thus harder to take down "

Probably not true for "This saved costs". From what I've seen, virtual
machines usually cost more than twice as much per month as renting the
equivalent "real" machine.

They could have used dedicated servers instead: there are more dedicated
server providers than VM providers, so they would have achieved the same goal
for less money.

Probably not true for "better uptime" either; VMs are still hosted on real
hardware, which fails too. (Although distributing the work across more
independent machines can improve uptime.)

~~~
drsintoma
They are more expensive, but they are usually easy to acquire and available
immediately, which makes provisioning much more efficient when traffic
fluctuates. Overall, sysadmins will also be less tempted to over-provision,
meaning getting more and beefier machines than needed "to be safe".

~~~
fooyc
Nothing prevents you from using bare dedicated servers for your usual traffic,
and VMs for anything else.

~~~
drsintoma
In an ideal world, yes. Or if your software already works seamlessly
cross-datacenter. But in the real world it is rare that your hosting provider
is good at both VMs and metal. At least that's the biggest problem I've always
encountered, especially with budget providers.

------
TomAnthony
If the load balancer is the weak point that would be first to be discovered,
then I imagine they must have some mechanism to stop it leaving evidence that
leads to the other machines if it were to get raided (it isn't on their
hardware, so they can't prevent the files being backed up).

Is there a way the codebase could be entirely encrypted and not even
accessible to the cloud provider (with some 'boot password' needed each time
the server starts up)?

~~~
waxjar
I remember reading their load-balancer shuts itself off after 2 minutes if
something strange happens. The whole thing is run from memory, so no traces
will be left.

I don't know how accurate this is, though.
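
One possible reading of "shuts itself off after 2 minutes if something strange happens" is a dead man's switch: the node expects a regular "all is well" signal and shuts down if none arrives within the timeout. The sketch below is purely illustrative; the class name `DeadMansSwitch`, the 120-second default and the `on_trigger` callback are assumptions, not anything the article describes.

```python
import time


class DeadMansSwitch:
    """Hypothetical watchdog: fires once if no heartbeat arrives in time."""

    def __init__(self, timeout_s=120.0, on_trigger=None, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.on_trigger = on_trigger or (lambda: None)
        self.clock = clock              # injectable so the logic can be tested
        self.last_beat = clock()
        self.triggered = False

    def heartbeat(self):
        """Record that the node still looks healthy."""
        self.last_beat = self.clock()

    def check(self):
        """Call periodically; triggers once when the timeout has elapsed."""
        if not self.triggered and self.clock() - self.last_beat >= self.timeout_s:
            self.triggered = True
            self.on_trigger()
        return self.triggered


# Demo with a fake clock instead of waiting two real minutes.
now = [0.0]
events = []
switch = DeadMansSwitch(120, lambda: events.append("shutdown"), lambda: now[0])
now[0] = 119.0
print(switch.check())   # still inside the two-minute window
now[0] = 120.0
print(switch.check())   # timeout reached, callback has run
```

In a real deployment `on_trigger` would power the machine off; since everything is said to run from memory, that alone would discard the state.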

~~~
tacotime
I wonder how long until our nation's SWAT teams are equipped with ram rippers
and cans of liquid nitrogen.

~~~
dlgeek
No need to get that fancy. If they're in the cloud, they're most likely on
Xen, which natively supports full memory dumps. LE goes to the cloud provider
with a warrant, they run "sudo xm dump-core", LE walks away with a memory dump
without the client ever knowing anything.

~~~
dwild
Even if they are able to do that, they still need to shut down 21 other VMs,
probably at 21 other hosting providers in different countries, all at the
same time. I'm sure they can, but I'm also pretty sure TPB can start new ones
faster than the authorities can stop them.

------
tjaerv
"At the time of writing the site uses 21 virtual machines (VMs) hosted at
different providers. [...] All virtual machines are hosted with commercial
cloud hosting providers, who have no clue that The Pirate Bay is among their
customers."

~~~
crazy1van
They may "have no clue" but it seems like that's only because they don't care
and haven't looked. I don't see anything in the article that would prevent the
providers from figuring this out unless I'm missing something.

~~~
allegory
If someone is paying the bill, do they really care?

~~~
larrys
"If someone is paying the bill, do they really care?"

So you could cross-reference the names of the people raided with payment
information of the VPS providers (the usual suspects, or the top "n"
providers, let's say). Of course that could be hidden as well.

The other issue is how anyone knows this isn't _misinformation anyway_, and
that the VPS providers play no role, or not as much of a role as is
indicated. Just because someone is writing this, or because they said it?

What advantage does it have for anyone (like this) to reveal anything about
how they are situated security-wise, if not to throw people off the trail,
even given some possible marketing advantage?

~~~
nacs
> payment information of the VPS providers

I doubt this will be useful as they're probably using Bitcoin or prepaid cards
or things like that for payments.

> What advantage does it have for anyone (like this) to reveal anything

It could teach others how to set up websites that are harder to censor or
more resilient to raids (plus it gets them free PR/traffic).

------
verroq
Why can't people find where their servers are? I understand they have their
own IP allocation, thus they can use BGP tricks. But don't they need a
sympathetic ISP or similar to help them get the routes in?

~~~
nkcmr
IIRC they have their load balancer hosted under a sovereign IP address (the IP
block belongs to a political party). So attempting to mess with it could
constitute an infringement of free speech.

------
Nanzikambe
Interesting, so I'm presuming there are several VPNs involved between the
load-balancer and all the discrete servers. I wonder if they use a VPN
provider with a static IP and no-logs policy or if it's simply yet another
VPS.

I'd love to hear a little more about the architecture.

~~~
icedchai
Why would they use a VPN provider? It's trivial to set up your own if you
control the systems at both ends.

~~~
TheLoneWolfling
Assuming at least one side isn't behind a NAT, true. Otherwise it gets...
"fun".

~~~
the_mitsuhiko
A NAT does not stop you if you do it right. Console games have been hosting
servers through NATs for the last decade.

~~~
TheLoneWolfling
As I said: it gets "fun", especially if you want to do it without a third
server to set up the connection. Still doable, just "fun".

~~~
tacotime
I would speculate that the massive advertising revenue generated by the 'bay
probably renders any such 'fun' factor trivial.

------
Theodores
> In total the VMs use 182 GB of RAM and 94 CPU cores. The total storage
> capacity is 620 GB, but that’s not all used.

That level of hardware/cores seems a bit over the top given what TPB does.

When I was a boy we had this thing called 'Alta Vista'. It was _the_ search
engine before Bing! came along. Processors did not run at gigahertz speeds
back then and a large disk was 2 GB. Nonetheless most offices had the
internet, and when people went searching, 'Alta Vista' was the first port of
call for many.

TPB has an index of a selective part of the internets, i.e. movies, software,
music, that sort of thing. Meanwhile, back in the 1990s, AltaVista indexed
everything, as in the entire known internets, with everything stored away in
less than the 620 GB used by TPB for their collection of 'stolen' material.

From
[http://en.wikipedia.org/wiki/AltaVista](http://en.wikipedia.org/wiki/AltaVista)

Alta Vista is a very large project, requiring the cooperation of at least 5
servers, configured for searching huge indices and handling a huge Internet
traffic load. The initial hardware configuration for Alta Vista is as follows:

Alta Vista -- AlphaStation 250 4/266
  4 GB disk, 196 MB memory
  Primary web server for gotcha.com
  Queries directed to WebIndexer or NewsIndexer

NewsServer -- AlphaStation 400 4/233
  24 GB of RAID disks, 160 MB memory
  News spool from which news index is generated
  Serves articles (via http) to those without news server

NewsIndexer -- AlphaStation 250 4/266
  13 GB disk, 196 MB memory
  Builds news index using articles from NewsServer
  Answers news index queries from Alta Vista

Spider -- DEC 3000 Model 900 (replacement for Model 500)
  30 GB of RAID disk, 1 GB memory
  Collects pages from the web for WebIndexer

WebIndexer -- Alpha Server 8400 5/300
  210 GB RAID disk (expandable), 4 GB memory (expandable),
  4 processors (expandable)
  Builds the web index using pages sent by Spider
  Answers web index queries from Alta Vista
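
For a rough sense of scale, the figures in the hardware list can be totalled and set against the TPB numbers quoted at the top of this comment. This is only a back-of-the-envelope sketch: it assumes 1 GB of memory equals 1024 MB, and the names `ALTAVISTA`, `TPB` and `altavista_totals` are made up for illustration.

```python
# Figures copied from the hardware list and from the TorrentFreak quote.
# Assumption: 1 GB of memory is treated as 1024 MB.
ALTAVISTA = {
    "Alta Vista":  {"disk_gb": 4,   "memory_mb": 196},
    "NewsServer":  {"disk_gb": 24,  "memory_mb": 160},
    "NewsIndexer": {"disk_gb": 13,  "memory_mb": 196},
    "Spider":      {"disk_gb": 30,  "memory_mb": 1 * 1024},
    "WebIndexer":  {"disk_gb": 210, "memory_mb": 4 * 1024},
}
TPB = {"vms": 21, "ram_gb": 182, "cores": 94, "storage_gb": 620}


def altavista_totals(servers=ALTAVISTA):
    """Return (total disk in GB, total memory in GB) across all servers."""
    disk_gb = sum(s["disk_gb"] for s in servers.values())
    memory_gb = sum(s["memory_mb"] for s in servers.values()) / 1024
    return disk_gb, memory_gb


disk_gb, memory_gb = altavista_totals()
print(f"AltaVista: {disk_gb} GB disk, {memory_gb:.2f} GB memory")
print(f"TPB / AltaVista storage ratio: {TPB['storage_gb'] / disk_gb:.1f}x")
print(f"TPB / AltaVista memory ratio:  {TPB['ram_gb'] / memory_gb:.1f}x")
```

On these numbers the five AltaVista machines add up to 281 GB of disk and about 5.5 GB of memory, so the gap is much wider for RAM than for storage.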

~~~
lmm
How many pages were there on "the entire internet" back then? How many
reqs/second did AltaVista serve? How does that compare to the numbers for TPB?

~~~
Theodores
From what I remember the whole of TPB server + data could fit onto a 90 MB USB
stick in 2012. Sure, we have had many episodes of really important reality TV
series and other great stuff that all needs pirating, yet in 2014 I doubt
that 90 MB has ballooned into petabytes. We are still in the same range -
let's say 1 GB might be a reasonable size USB stick to buy for it.

Alta Vista started out with a modest size index of 20 million pages. Let's
imagine those pages were all 1 KB in size; then 20 * 10^6 * 10^3 bytes comes
to 20 * 10^9 bytes, or 20 GB. So, in terms of stuff indexed, that is
considerably larger than TPB. Agreed?

Well, maybe not. They could have used compression to get the vastness of TPB
onto that USB stick. Around that time - 2012 - they had 1.6 million torrents.
That is some way off the gigabytes that AltaVista indexed, no matter how you
bloat the maths. Sad to say, but, in the 1990s, the internet was actually
larger than your porn collection.
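
The arithmetic above is easy to check. The sketch below uses only the figures from this comment (20 million pages at an assumed 1 KB each, and 1.6 million torrents on a remembered 90 MB stick) with decimal units; the function names are made up for illustration.

```python
# Assumptions taken from the comment: 20 million pages at 1 KB each,
# and 1.6 million torrents fitting on a 90 MB stick. Decimal units.
KB, MB, GB = 10**3, 10**6, 10**9


def altavista_index_bytes(pages=20 * 10**6, page_bytes=1 * KB):
    """Raw size of the indexed pages under the 1 KB-per-page assumption."""
    return pages * page_bytes


def tpb_bytes_per_torrent(stick_bytes=90 * MB, torrents=1_600_000):
    """Average space available per torrent if everything fits on the stick."""
    return stick_bytes / torrents


print(altavista_index_bytes() / GB)   # AltaVista estimate in GB
print(tpb_bytes_per_torrent())        # bytes per torrent on the stick
```

Under these assumptions the AltaVista estimate is 20 GB, while the stick leaves about 56 bytes per torrent, which is roughly enough for a hash and a short name but not much else.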

How useful is reqs/second anyway? By that score Google probably does very
badly as a search usually returns the answer on the first page. With old-style
search engines you might need to go through scores of pages before getting
what you want. I found TPB to be a bit like that too, wading through results
pages more than necessary.

TPB is not 'safe for work' and in a lot of jurisdictions you cannot even
access it from home. In the UK (which is a small but well populated country)
it is not that easy to get onto TPB - you have to have hacker voodoo skills to
do that or route through a VPN as none of the main ISPs will let you on. Most
of the civilised world has the same need to protect citizens from the evils of
TPB so places where it can be accessed are not that common. Even if you could
access it, would you? Probably...

Meanwhile, back in 1998 - a year or two before the dotcom crash - plenty of
people were using search engines such as AltaVista (which was the best back
then) for actual work. Maybe not everyone, but enough people knew about
computers and things like AOL disks, modems and what not. The internet was
big.

Which reminds me of my main point, the one you thought so important to
downvote rather than give kudos for being insightful. TPB uses a constellation
of computers and consumes vastly more resources than the biggest search engine
of the 1990s, yet the utility of TPB is limited to only a few fortunate enough
to live somewhere where TPB can be accessed. What can be searched for on TPB
is a mere subset of what was on AltaVista, albeit different and not so useful
stuff. I would say that with AltaVista they were doing far more with what they
had, reaching a better audience, doing something more useful for the world
(than serving weight loss adverts) and altogether performing a miracle. TPB
is a slouch in comparison.

~~~
pfg
I'm not sure what your point is. AltaVista probably had to put a lot of effort
into tuning every part of their infrastructure to keep the site running on
that hardware. Why would TPB do that when they can simply get another VM for a
fraction of the cost?

Running a top 100 site[1] on 21 VMs in 2014 is quite impressive.

[1]:
[http://www.alexa.com/siteinfo/thepiratebay.se](http://www.alexa.com/siteinfo/thepiratebay.se)

