
The FBI stole an Instapaper server in an unrelated raid - garethr
http://blog.instapaper.com/post/6830514157
======
Xk
Instapaper stores only salted SHA-1 hashes of passwords, so those are
relatively safe.

\--

Obligatory statement on _NEVER USING SHA-1 HASHES_ to make passwords "safe".

Any normal person can brute force millions of SHA-1 hashes (salted however
much you want) per second on a GPU.

If the FBI so wanted (although I don't believe they do) I'm sure they could
brute force almost every single password in that database. Granted, it's the
government and they have better ways of obtaining such information, but if
there is someone the FBI is watching on Instapaper's databases and they so
wanted, storing the SHA-1 hash of the password all but handed them over to the
FBI.

I am now glad my Instapaper password was generated randomly, 16 characters
long, and I will now change it just to be safe.

For anyone running a database which stores usernames and passwords, take a
look at bcrypt or scrypt. They're millions (no, I am not exaggerating) of
times better than SHA-1.

(Edit: Grammar)
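
As a rough illustration of the scrypt suggestion, here is a minimal sketch using `hashlib.scrypt` from the Python standard library. The function names and the cost parameters are assumptions chosen for the example, not anything Instapaper uses; real parameters should be tuned to the hardware.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative cost parameters (assumptions); raise them as hardware allows.
SCRYPT_N = 2**14  # CPU/memory cost
SCRYPT_R = 8      # block size
SCRYPT_P = 1      # parallelism


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a new password, using a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                               n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    return hmac.compare_digest(candidate, digest)
```

The point of the cost parameters is that each guess costs an attacker noticeable time and memory, which a single SHA-1 call does not.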

~~~
joevandyk
So if I'm using SHA-1 already to store passwords, what are my options for
moving to a different system? I assume there's no way to rehash the passwords?

~~~
Xk
You have two choices that I see:

1\. The next time a user logs in to your system and you verify against the
SHA-1 hash that they are who they say they are, recompute the correct hash for
bcrypt. Then, _delete the SHA-1 hash_. It does you no good to have a bcrypt
version if you keep the SHA-1 version around.

2\. Generate the bcrypt hash from the SHA-1 hash. That is, pretend that the
SHA-1 hash is the user's password. This isn't as clean (your password
authentication software will then have to do SHA-1 followed by bcrypt) but it
means you'll be able to migrate your entire database all at once if you so
choose. This also causes a very (very, very) slightly higher chance of
password collisions, although there's not much to worry about from that.
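
Option 2 might look roughly like the sketch below. It assumes the legacy scheme is SHA-1 over salt + password (the thread does not say how the salt is combined), and it uses scrypt from the standard library as a stand-in for bcrypt. All function names are hypothetical.

```python
import hashlib
import hmac
import os


def legacy_sha1(password: str, salt: bytes) -> bytes:
    """Hypothetical legacy scheme: SHA-1 over salt + password."""
    return hashlib.sha1(salt + password.encode("utf-8")).digest()


def slow_hash(data: bytes, salt: bytes) -> bytes:
    """Stand-in for bcrypt: scrypt with illustrative cost parameters."""
    return hashlib.scrypt(data, salt=salt, n=2**14, r=8, p=1, dklen=32)


def wrap_legacy_digest(sha1_digest: bytes) -> tuple:
    """Bulk migration step: needs only the stored digest, not the password."""
    wrap_salt = os.urandom(16)
    return wrap_salt, slow_hash(sha1_digest, wrap_salt)


def verify_wrapped(password: str, sha1_salt: bytes,
                   wrap_salt: bytes, wrapped: bytes) -> bool:
    """Login check: SHA-1 first, then the slow hash, then compare."""
    inner = legacy_sha1(password, sha1_salt)
    return hmac.compare_digest(slow_hash(inner, wrap_salt), wrapped)
```

After running `wrap_legacy_digest` over every row, the raw SHA-1 digests can be dropped; only the SHA-1 salt, the new salt, and the wrapped digest need to stay.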

~~~
StavrosK
Or you can do both, migrate everything to bcrypted SHA for now and then
replace these with straight up bcrypt the next time the user logs in.

You ostensibly have a hash method identifier per hash, so just create "sha+bc"
and "bc" along with your current "sha".
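
A sketch of that per-hash identifier idea, with the record held as a plain dict. The field names, the salt handling, and the use of scrypt in place of bcrypt are all assumptions made for the example.

```python
import hashlib
import hmac
import os


def sha1_hash(password: str, salt: bytes) -> bytes:
    """Hypothetical legacy scheme: SHA-1 over salt + password."""
    return hashlib.sha1(salt + password.encode("utf-8")).digest()


def slow_hash(data: bytes, salt: bytes) -> bytes:
    """Stand-in for bcrypt: scrypt with illustrative cost parameters."""
    return hashlib.scrypt(data, salt=salt, n=2**14, r=8, p=1, dklen=32)


def compute(record: dict, password: str) -> bytes:
    """Recompute the stored value according to the record's scheme tag."""
    scheme = record["scheme"]
    if scheme == "sha":
        return sha1_hash(password, record["sha_salt"])
    if scheme == "sha+bc":
        inner = sha1_hash(password, record["sha_salt"])
        return slow_hash(inner, record["bc_salt"])
    if scheme == "bc":
        return slow_hash(password.encode("utf-8"), record["bc_salt"])
    raise ValueError("unknown scheme: " + scheme)


def login(record: dict, password: str) -> bool:
    """Verify the password; on success, upgrade the record to plain "bc"."""
    if not hmac.compare_digest(compute(record, password), record["hash"]):
        return False
    if record["scheme"] != "bc":
        bc_salt = os.urandom(16)
        new_hash = slow_hash(password.encode("utf-8"), bc_salt)
        record.clear()  # drop the old hash and SHA-1 salt entirely
        record.update(scheme="bc", bc_salt=bc_salt, hash=new_hash)
    return True
```

Because the upgrade happens only after a successful check, a failed login never touches the stored record.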

Also, why is the risk of a collision greater? It seems to me that, if
anything, it should be lower, as SHA hashes always consist of a fixed number
of bits, and thus aren't as likely to collide when hashed to bcrypt, assuming
the length of a bcrypt hash is the same (or larger) number than a SHA one.

Basically, it seems to me that, if they're going to collide, they're more
likely to collide at the SHA level, which is a problem either way.

~~~
Xk
Imagine you have two hash functions F and G, both mapping from the domain of
integers to integers mod 2^128. Imagine they are perfect in that if you hash
all the integers up to some large N, each hash is expected to be recorded
exactly the same number of times (probabilistically).

Now clearly if I hash a password F(P) and another password F(Q) there is a 1
in 2^128 chance they collide.

Now imagine we do G(F(P)) and G(F(Q)). We first have the chance of 1 in 2^128
that F(P) == F(Q) which implies G(F(P)) == G(F(Q)). However, that is not all!

We now have a new 1 in 2^128 chance that G(F(P)) == G(F(Q)). So we have
(about) a 1 in 2^127 chance that the passwords will collide.
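
Written out, under the idealized assumption that G collides on two distinct inputs with probability p = 2^-128 independently of F, the arithmetic above is:

```latex
\begin{aligned}
\Pr[G(F(P)) = G(F(Q))]
  &= \Pr[F(P) = F(Q)]
   + \Pr[F(P) \neq F(Q)] \cdot \Pr[G(F(P)) = G(F(Q)) \mid F(P) \neq F(Q)] \\
  &= p + (1 - p)\,p \\
  &= 2p - p^{2} \approx 2^{-127}, \qquad p = 2^{-128}.
\end{aligned}
```

The p^2 term is negligible, which is why the result is "about" 1 in 2^127 rather than exactly that.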

But, none of this really matters. Collisions aren't what you worry about with
password hashes.

~~~
StavrosK
I see what you mean, and I agree that it doesn't matter, but it's an
interesting exercise anyway.

I disagree that there's a 1 in 2^128 chance that they will collide.
Trivially, I can show you a hash that will never collide for up to some N,
and that is F(P) = P mod 2^128. This will never collide unless P is more than
128 bits long.

My rationale, above, was that SHA constrains the space to 128 bits. Therefore,
for different SHA hashes (ones that haven't already collided), the probability
that bcrypt will collide might be _smaller_ (or zero, as in my example above).

In reality it doesn't work like that, I know, but in theory you can't be sure
that the probabilities of collision will add (or, well, multiply) up.

~~~
Xk
Point taken, it's probably true that the probability that G(F(P)) == G(F(Q))
given F(P) != F(Q) is less than 1 in 2^128. But it's probably also true that
it's greater than 0.

Clearly it's impossible to be less than zero. So no matter what you do,
defining H(X) to be G(F(X)) will have strictly more collisions than F(X).

The reason I would argue it's greater than zero is that if a function H
existed such that H(X) will never collide for X less than 2^128, it would
probably have some cryptographic weakness.
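
The argument can be checked empirically on a toy scale. The sketch below builds two stand-in hash functions by truncating SHA-256 to 12 bits (an assumption made purely so that collisions are easy to observe) and counts colliding pairs under F and under H(X) = G(F(X)). Every pair that collides under F also collides under H, so the second count can never be lower.

```python
import hashlib
from collections import Counter

BITS = 12  # deliberately tiny output space so collisions actually occur


def truncated(tag: bytes, x: int) -> int:
    """Toy hash: SHA-256 of a tag plus the input, cut down to BITS bits."""
    digest = hashlib.sha256(tag + x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:2], "big") >> (16 - BITS)


def F(x: int) -> int:
    return truncated(b"F", x)


def G(x: int) -> int:
    return truncated(b"G", x)


def H(x: int) -> int:
    return G(F(x))


def colliding_pairs(fn, inputs) -> int:
    """Count unordered pairs of distinct inputs that share an output."""
    counts = Counter(fn(x) for x in inputs)
    return sum(c * (c - 1) // 2 for c in counts.values())


inputs = range(500)
pairs_f = colliding_pairs(F, inputs)
pairs_h = colliding_pairs(H, inputs)
print(pairs_f, pairs_h)
```

With an idealized pair of functions, the count for H would be expected to come out at roughly double the count for F, matching the 2^-127 estimate earlier in the thread.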

~~~
StavrosK
Hmm, you're right indeed, since they're additive. I should have said that the
second layer is less likely to collide than the first, not than both combined.

------
jsdalton
Surely there is a legal precedent which provides at least some framework for
what can or cannot be seized during a warrant search? This can't be the first
time government agents have mistakenly seized property in an otherwise lawful
search.

Also, while I completely understand Instapaper's unwillingness to pursue this
through the courts, that _is_ the way our legal system is structured. If you
believe you have been harmed in some way by a government action, the courts
are the avenue through which you must obtain recourse.

(Not a lawyer, so if I'm wrong about any of the above please correct me.)

~~~
cheez
It's called the constitution of the United States. If the enforcers don't
follow it, your only recourse is the Supreme Court which will probably throw
out your claim for national security reasons.

~~~
danielsoneg
Not necessarily. I (also) am not a lawyer, but the question isn't whether the
FBI has the authority, constitutional or otherwise, to seize the servers owned
by the target of the warrant, the question is whether they overstepped their
bounds in seizing three whole racks of servers. If it's shown they were
careless or did not take sufficient caution in their raid to avoid seizing
unrelated servers, they could be held liable for damages. In this case in
particular, the number of unrelated companies that have been affected by this
(and the number of servers present in 3 racks) makes a case for negligence.

Again, though, the question isn't whether they had the right to seize the
servers they had warrants for - they did, and you won't get that questioned by
any court - but whether they did so properly, and it's not unheard of for a
law enforcement agency to get slapped for overstepping their bounds. It's not
common, but it's not unheard of, and it's not a 4th amendment issue either.

~~~
gte910h
It is a 4th amendment issue: That's why they can't just take the server of
people not under investigation...

~~~
danielsoneg
Right, but for the courts the question is, did they intend to violate the 4th
or did they just screw up? Intent matters - in this case, whether they
intended an unwarranted seizure of the server racks or if they just screwed
up. Because they had a valid warrant to seize some servers from that data
center, I think you'd have a really difficult time pressing the case that they
intended to violate the 4th amendment rights of the other people whose servers
were on those racks. Negligence, on the other hand, is a much, much lower bar,
much easier to prove in these circumstances, and should be adequate to secure
damages - and frankly, I've got a lot easier time believing it was negligence
in this case than an intentional violation of the other individuals' 4th
amendment protections.

~~~
cheez
"Uh sorry, I didn't mean to do that."

"Oh, phew, carry on."

You're kidding me, right?

------
mrcharles
The more I think about it, the more I think this should be treated the same as
any of the other data thefts that have happened in the past few months: Sony,
Toyota, Sega, etc. A potentially hostile group now has a ton of personal info.
People should know.

~~~
pavel_lishin
A potentially hostile group that also has much greater resources at its
disposal than LulzSec, and much more ambiguous motivations.

------
nbpoole
So, the FBI has a copy of Instapaper's complete database and a copy of their
website code. The database includes:

\- Salted SHA-1 hashed passwords for Instapaper

\- Encrypted passwords for linked Pinboard accounts (with the encryption key
stored in the website code)

\- OAuth tokens for linked Facebook/Twitter/Tumblr accounts (and presumably
also the secret keys used by Instapaper to use those tokens).

That's (potentially) a lot of personal information.

~~~
imrehg
As a real practical question out of curiosity: how would you design their
system differently so unauthorized people having only your hard drives
couldn't get any data at all?

~~~
pavel_lishin
You could always hash the e-mails, although this would make resetting your
password impossible.

How much data do Facebook's OAuth tokens contain? By looking at one, can you
tell that it's linked to Pavel Lishin's account?

~~~
discover
Would splitting the data in half work?

I mean literally cutting the data sent into two pieces and each piece entering
a different database server in a different country. Then, when requested,
pulling both pieces and sending them to users who patch them together with
client side script...?
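
Literally cutting a record in two would still leave each server holding half of the plaintext. A common variation, swapped in here for illustration and not what the comment itself describes, is a two-way XOR split: one share is pure random bytes, the other is the data XORed with those bytes, and neither share alone reveals anything about the data. The function names are hypothetical.

```python
import os


def split(data: bytes) -> tuple:
    """Split data into two shares; both are needed to recover it."""
    pad = os.urandom(len(data))                       # share for server A
    masked = bytes(a ^ b for a, b in zip(data, pad))  # share for server B
    return pad, masked


def combine(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine the two shares, e.g. on the client."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have the same length")
    return bytes(a ^ b for a, b in zip(share_a, share_b))
```

The cost is that each server stores as many bytes as the original record, and losing either share loses the data.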

~~~
pbh
There has been some academic work on this (CIDR'05), but I'm not sure if it
has been used in practice.

<http://ilpubs.stanford.edu/659/>

~~~
pavel_lishin
Thanks, printed it off, will try to read on the train.

~~~
InnocentB
Sounds like a good use case for Instapaper :P

~~~
protomyth
Might not want the FBI to know he read the article if they come back for a
"followup visit"

------
bestes
I think the OP was unreasonably harsh on DigitalOne (never heard of them let
alone have any interests). It is very possible that they are consumed with FBI
questioning, gag orders or who knows what else. I would give them a pass for a
few days until more detail comes out.

~~~
lamnk
I think so too. He says:

    I have no idea whether I’ll ever see the server again

In this case the host probably doesn't know any better than he does. According
to the NYTimes they are a Swiss company; they only rent space and connectivity
from the data center.

I see people jump up and down accusing their host of being a bad host when their
websites go down for 10 minutes. The thing is, shit like this happens all the
time. Some years ago even Rackspace was taken offline because a truck hit
their data center. Bizarre, right? Yes, but it did happen.

~~~
idlewords
The problem with DigitalOne was a complete lack of communication around this
event. It was a long time (and a lot of badgering) before any of us learned
anything about what had happened. I can sympathize with being busy during a
crisis, but total silence for 24+ hours, with no working website, email,
status page, or twitter account, is not acceptable.

~~~
lamnk
Yeah, that is what hosting companies often lack: communication with their
customers during a crisis. I totally agree that DigitalOne should have
informed their customers about the incident, and that they handled the case
poorly. But, like I said, putting all the blame on them is too harsh.

------
tritchey
"a Swiss hosting company leasing blade servers"

If they are truly blade servers, then they were possibly sharing the same
chassis, power supply and backplane. Could the FBI have pulled just the blades
in question? Possibly. But I can very easily imagine the entire blade chassis
being viewed as a monolithic component that they would want to be able to
perform whatever forensic analysis they are planning. They could also have
pulled whatever blades they were not after, and left them, but until you
replace the chassis, you are dead in the water.

------
smackfu
To be clear, the server stopped responding, and the host he is paying for the
server has not responded at all. The server could simply be unplugged, or all
the network cables were unplugged during the raid. Who knows? I guess "The FBI
stole my server" is a better headline, though.

In my experience with our leased data center cages, we are expected to fly in
to town if we ever need to physically manipulate the servers or even plug
things in. The data center employees don't even go into the locked cages.

If the FBI forced open a locked cage, and did stuff in there, I would not
expect anything to be addressed until DigitalOne showed up to fix it.

~~~
protomyth
If DigitalOne's people are out of the country, a truly evil tactic for the FBI
would be to ask customs to deny entry to any of their reps.

------
yuvadam
I'm trying to think of an analogy which can explain why this might be
reasonable from the FBI's perspective.

Suppose you were using a shared storage space (shared servers, or server farm)
with several other dudes. One of them is a drug dealer. One day the police/FBI
decide to raid the storage space since the drug dealer has been using it to
store illegal drugs.

Is it not reasonable to consider this collateral damage (which, granted, is
totally unnecessary) during law enforcement operations?

I'm not saying this is OK in any case, but might this not be a reasonable move
by the law enforcement agencies?

~~~
cheald
It is not reasonable if the FBI does not have a warrant for your
servers(/storage space). Instapaper is completely right to call this "theft".

If his servers are included in the warrant because they were suspected of
housing whatever it is the FBI was after, and the court granted the FBI the
right to seize them, then yeah, it's reasonable.

If he was sharing a physical machine with the bad guys, then yeah, sorry,
that's collateral damage. However, if he was on his own separate leased
machine, there is absolutely no reason for the FBI to seize it. It'd be like
them executing a seizure warrant on one of those self-storage spaces, and
seizing the contents of all the adjoining compartments (which the person being
investigated would have had no access to) just because.

~~~
pavel_lishin
Do we know what the warrant stated? If it authorized them to take the rack
containing the server they were after, then this is legal, if unfortunate.

If the police have a warrant for my apartment, and you happen to leave your
backpack and server, your stuff will most likely be confiscated, along with
mine, if it interests the police.

~~~
brown9-2
No, this does not seem to be public knowledge. For all we know the Instapaper
(and pinboard, etc) servers could have been included in the warrant.

~~~
idlewords
There's a FOIA request out for the warrant. I'll be curious to see it.

[http://www.muckrock.com/foi/view/united-states-of-america/wa...](http://www.muckrock.com/foi/view/united-states-of-america/warrant-for-fbi-seizure-of-coresite-servers-on-2011-06-21/646/)

------
mrcharles
All the more reason for data havens to exist. Run your server from a country
where the police can't just take it with impunity.

~~~
pavel_lishin
Of course, then you have to keep careful tabs on that country's politics. The
police can't confiscate your data in June, until a new bill is passed in July,
and suddenly it's up for grabs.

Furthermore, I'm not sure I'd want to host my data in a country where the
police cannot pursue digital criminals.

~~~
5l
Like it or not it's still the wild west, and I suspect most people here trust
their own ability to protect themselves more than they trust the sheriff who
only investigates crimes against the mayor and can't even ride a horse or
shoot straight.

------
johngalt
Why isn't Facebook having their servers seized? Google? Amazon? If the FBI is
really targeting the "badguys" I'm sure there have been more badguys using
facebook/gmail/AWS than any single colo.

Why haven't there been similar seizures of any larger corporate entities? Even
if the current FBI practices are valid, should the application of those
practices be a function of size/wealth/power? Which servers of Sony's were
seized after distributing rootkits?

~~~
maw
Good question. No solid answers here, but my guess would be some combination
of more redundancy, better and more active lawyers, and the large players not
talking about it when it does go down.

~~~
epoxyhockey
FB, Google, etc all provide a nice procedure for LEO to query all desired
info. It is not necessary to seize equipment. Example:
[https://www.eff.org/files/filenode/social_network/Facebook20...](https://www.eff.org/files/filenode/social_network/Facebook2010_SN_LEG-DOJ.PDF) (pdf)

------
justinweiss
Looks like it's back:

<http://twitter.com/instapaper/status/84106275796946944>

"As of 2 minutes ago, my DigitalOne server is back online. The logs indicate
that it was off and not booted during the time it was missing."

~~~
m0nastic
But that would mean that the FBI weren't bumbling morons who salted the earth
after tearing out everything in the datacenter with a power supply...

I'm not sure I can deal with the possibility.

------
Astrohacker
I think it may be prudent to begin encrypting all data on disk that can
reasonably be encrypted while being able to set up the server remotely so that
no one can just snatch your server and get all your data.

This could work by encrypting your database in a truecrypt volume that must be
mounted by entering the password. Thus, the data is only ever saved on disk in
encrypted form, and the key to access the data is not saved on the disk. Of
course, it is still in principle possible for anyone to access that
information if they have physical access to the computer while it's running,
but at least this makes that much harder.

~~~
pavel_lishin
How fast is Truecrypt? How much would this slow down database and file access?

~~~
ineedtosleep
Truecrypt is significantly slower, especially on the higher strength
encryption methods. The program itself has a benchmark in it, so download it
and check it out for yourself if you're that curious. (Note that it is
relative to your hard drive's speed)

------
iqster
Turns out the server was not stolen!

<https://twitter.com/#!/instapaper/status/84106275796946944>

------
teoruiz
I can't help but compare this raid with the feds' raid on the Novus Ordo
Seclorum hosting company depicted in Cryptonomicon.

~~~
lukejduncan
Every HN post can easily have a Stephenson reference.

------
jarin
Looks like the FBI is operating from the Department of Homeland Security
playbook now.

~~~
ktsmith
The FBI has been doing these kinds of raids for years and years, there just
hasn't been one in the news lately.

------
ChuckMcM
It would make for an interesting Freedom of Information (equipment) request.
"Give me my damn server back." But the damage is of course done.

If you are a voting citizen of the US I recommend you write (not email, write
a letter, put postage on it and everything) to your elected congressional
representatives and ask that Congress immediately put curbs on the police
powers of the FBI when it comes to infrastructure seizures.

------
bproper
You think it's a coincidence they nabbed Whitey Bulger this morning, after 16
years on the run?

His Instapaper account was probably full of stories about Santa Monica.

~~~
VladRussian
i think his Instapaper account was full of stories about Whitey Bulger and his
old friends/partners/etc... and this is how they "Big Data"-sifting-found him
:)

------
mmaunder
Contact the ACLU, they will probably take your case.

------
gokhan
What's the proper way of storing OAuth tokens in this situation? Given that
all the users' tokens and your private key are on the server (even if the key
is embedded in code), there's no way for Instapaper to keep those tokens
secure in case of a compromise (by FBI or Lulzdudes or anyone).

Seems like Instapaper should change its private key for, say, Facebook.

~~~
roc
I would think encrypting the third-party tokens with the user's password would
be a decent start.

When the user's password is verified, it could be used to unlock those tokens
and store them in the active session structure in RAM. There'd still be some
exposure, particularly in the case of being rooted, but an attacker couldn't
just dump the database.
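
One way the key handling for this could be sketched, assuming scrypt as the key derivation function: derive 64 bytes from the password, store the first half as the login verifier, and keep the second half only in the session as the token-encryption key. The names below are hypothetical, and the actual encryption of the tokens (which should use an authenticated cipher from a vetted library) is deliberately left out.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def derive(password: str, salt: bytes) -> Tuple[bytes, bytes]:
    """Derive (verifier, token_key) from the password in one scrypt call."""
    material = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                              n=2**14, r=8, p=1, dklen=64)
    # First half goes in the database; second half never touches disk.
    return material[:32], material[32:]


def login(password: str, salt: bytes, stored_verifier: bytes) -> Optional[bytes]:
    """Return the token key for the session on success, else None."""
    verifier, token_key = derive(password, salt)
    if not hmac.compare_digest(verifier, stored_verifier):
        return None
    return token_key
```

Someone holding only the database still has to brute force the password before they can derive the token key, so this is only as strong as the password and the cost parameters.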

------
neckbeard
Update: <http://blog.instapaper.com/post/6854208028>

------
andrewcooke
is there a better solution than encrypting data and putting the password in
the source? obviously this is for cases where you can't use a hash.

it seems to me that, at least, it would make sense to have the db and web
server physically separate in that case (although i guess someone stealing
hardware is not normally a common scenario).

------
drjoem
i am wondering why these companies weren't using EC2?

~~~
seiji
That brings up an interesting point: can the FBI seize EC2 servers?

~~~
lwat
Of course they can. If they get a judge to sign their warrant they can seize
anything they please.

------
engtech
Julian Assange stated that the feds have backdoor, no-court-order access to
gmail, yahoo, facebook, et al.

Why worry about this?

------
bhartzer
yet another reason to make regular backups of your site.

------
leon_
Hmm. I've built something similar to instapaper for myself. (Using a native OS
X app). People were making jokes at me how I was re-inventing the wheel.

Now I'm somewhat happy having done the extra work. At least the FBI doesn't
have my "read later" bookmarks. (Which often consist of the words 'hack',
'malware' and 'reverse engineering'.)

I guess I will reinvent the wheel instead of using cloud services more often
in the future.

~~~
pavel_lishin
An in between solution would be better - write an open source version of
Instapaper that people could install on their own servers, instead of everyone
rolling their own.

------
gcb
who watches the watchers?

