
Flickr Accidentally Deletes a User's 4,000 Photos and Can't Get Them Back - jamesjyu
http://www.observer.com/2011/tech/flickr-accidentally-deletes-users-4000-photos-and-cant-get-them-back
======
geuis
I keep hearing about this coming up, yet no one has talked about _why_ the
data is irretrievable.

What it implies is that the internal tool(s) that the support staff uses was
designed to do deletion instead of status changes. I imagine they have a web
interface that fires off some backend script which a) runs sql deletes on an
account ID and all tables and b) fires off scripts that physically deletes all
photos across all servers.

Taking those implications in hand, it points to an incredibly poorly designed
system for account management. Some years ago, there was an internal planning
meeting where a decision was made to do this. Some people objected and were
overruled. The ramifications of that project manager are still sticking it to
people to this day.

I have a hard time believing that Flickr would have purposely engineered their
systems in the manner I've described. Not impossible, but unlikely.

So I think it's more likely that the data is still there, but that customer
service has no method to resurrect it. It's probably a mix of internal policies
and the lack of devoting a couple of engineers to fix the problem that is
causing all of this.

~~~
potatolicious
> _"some backend script which a) runs sql deletes on an account ID and all
> tables and b) fires off scripts that physically deletes all photos across
> all servers."_

Neither should happen. You never _actually_ delete rows from your tables. You
_mark_ them as deleted, sure, but the data needs to be able to come back.
Space is cheap, there are simply very few cases where outright deletion from a
DB table is warranted.

I have a hard time believing Flickr would make a mistake like this. I agree
with you - odds are if they actually let an engineer loose on this problem
it'll be licked with a DB backfill in 20 minutes.
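
A minimal sketch of the mark-as-deleted approach, assuming a hypothetical `photos` table with a nullable `deleted_at` column in SQLite. The schema and function names are illustrative only and say nothing about how Flickr's systems are actually built:

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical photos table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE photos ("
        " id INTEGER PRIMARY KEY,"
        " account_id INTEGER NOT NULL,"
        " title TEXT NOT NULL,"
        " deleted_at TEXT"  # NULL means the row is live
        ")"
    )
    return conn


def soft_delete_account(conn, account_id):
    """Mark every live photo of the account as deleted; no row is removed."""
    cur = conn.execute(
        "UPDATE photos SET deleted_at = datetime('now') "
        "WHERE account_id = ? AND deleted_at IS NULL",
        (account_id,),
    )
    conn.commit()
    return cur.rowcount


def restore_account(conn, account_id):
    """Undo a soft delete by clearing the marker."""
    cur = conn.execute(
        "UPDATE photos SET deleted_at = NULL "
        "WHERE account_id = ? AND deleted_at IS NOT NULL",
        (account_id,),
    )
    conn.commit()
    return cur.rowcount


def live_titles(conn, account_id):
    """Return titles of photos that are not marked as deleted."""
    rows = conn.execute(
        "SELECT title FROM photos "
        "WHERE account_id = ? AND deleted_at IS NULL ORDER BY id",
        (account_id,),
    )
    return [r[0] for r in rows]
```

Every read path has to filter on `deleted_at IS NULL`, which is the main cost of this design; in exchange, a mistaken deletion is undone with a single UPDATE.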

~~~
abscondment
I bet that in cases of copyright infringement a logical delete isn't legally
sufficient. Sure, you can leave the database pointers in place... but you're
going to need to actually remove the infringing content. In that case, at
least, a _true_ delete tool would be needed. Like any Internet discussion
about copyrights or other people's backend scripts, this is all purely
speculative.

~~~
InclinedPlane
I don't buy it. Simply making infringing data unavailable has to be legally
sufficient, except in cases involving removing secret or confidential data
(which would likely require a court case) there's no harm in having the data
merely inaccessible. Moreover, it simply _must_ be the case that this is how
it works in practice.

Consider how deep this rabbit hole goes. If expunging any copies of
copyrighted data were strictly necessary, then:

What of data existing on backup media? Should those be re-hydrated, the target
data expunged, and then recreated in abridged form?

What of data existing in various content caches (such as memcached)? Should
all of those systems be flushed of any possible contaminant?

What of the actual data on physical media? Deleting a file on any modern form
of media does not expunge the data, it merely unlinks the location of the data
on disk from the file system directory. It might be necessary, depending on
filesystem and drive type, to scan all of the unused sectors on an entire disk
to find out if the target data or any part of it existed.

As you see, quite quickly you get into absurdities. Nobody goes to that much
trouble to delete merely copyright infringing data.

------
lwhi
I think this is a huge problem with all cloud services.

As we start to rely on cloud providers to look after our data, we need to
either become more educated (and proficient) by creating regular backups
ourselves (which kind of makes the idea of managed cloud services defunct
imo), or be in a position where we can individually sue for damages incurred
by negligence on the part of the service company in question.

So many of these companies have liability clauses which negate all
responsibility in situations where negligence leads to loss.

I think the current situation is crazy, considering the amount of
responsibility and trust that's involved in using web-services which store and
manage irreplaceable data.

~~~
iwwr
A single service, even a cloud, is still a single point of failure. Diversify
your backup strategies.

Indemnification is likely going to push prices up further, making cloud
non-competitive compared to traditional storage.

Still, I feel a lot of old technology is merely being relabeled. Today
everything that stores data remotely is called a 'cloud'. We may yet see small
NAS units relabeled as 'private micro-clouds'.

~~~
lwhi
I agree - I've recently learned of the importance of diversification first
hand.

However, I'd argue the ability to take legal action would actually encourage
service providers to diversify their own backup strategies and procedures. If
a mistake becomes too costly to consider, I think fewer mistakes would be
likely to occur.

~~~
gst
Which would also make the products much more expensive.

And I have a feeling that this money does not necessarily go into making the
product more reliable, but that it is used to buy a better insurance.

~~~
space-monkey
You can't continue to get affordable insurance if you are continually making
claims.

------
gst
What a coincidence. My Flickr pro account expires on Feb. 15 and today I've
decided to move away from Flickr and not renew my account.

What I've done in the short term is to use FlickrTouchr
(<https://github.com/tominsam/flickrtouchr>) to retrieve my ~7 GB of photos
from Flickr and upload them to Dropbox. The Dropbox gallery feature works
quite well and has the advantage that you can easily backup/modify the
underlying files.

In the long term I'd like to move the pictures to my own server (allowing me
to use my own domain, etc.) and hack together a simple frontend similar to
Dropbox' gallery.

If you decide to move away from a Flickr pro account (and if you also have
non-public photos on your account) don't forget that after your account is
switched back to a normal account you can only access the 200 most recent
photos. So if you want to delete your photos from Flickr it's best to do so
before the pro account expires.

~~~
bigiain
This is fine if you're just using Flickr as photo storage and publishing, but
(especially for pro photographers) a big part of the reason to be on Flickr is
the social networking features. If you've got people linking to your photos
and photo stream, you can't easily duplicate that outside the Flickr ecosystem
with your dropbox hosted photos.

------
relix
The article pointed out that a lot of services "deactivate" an account rather
than delete it. I would suspect this common sense methodology is as obvious as
hashing a password, but apparently, and unfortunately for the artist, it is
not.

That's the biggest point I took from this. Removing the wrong account can
always happen, but actually wiping data instantly without an invisible "grace
period", or retrieval from backup, or anything else to get it back, that's
just poor. Especially if it's a pro account.
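
A rough sketch of what an invisible grace period could look like: deletions are only queued, and nothing is purged until the window has passed. The 30-day window, the class name, and its methods are all assumptions made for illustration, not anything Flickr is known to use:

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=30)  # assumed value, purely illustrative


class DeletionQueue:
    """Hypothetical queue that delays account purges by a grace period."""

    def __init__(self, grace=GRACE_PERIOD):
        self.grace = grace
        self.pending = {}  # account_id -> earliest time a purge is allowed

    def request(self, account_id, now):
        """Schedule an account for purging once the grace period has passed."""
        self.pending[account_id] = now + self.grace

    def cancel(self, account_id):
        """Undo a pending deletion; returns False if nothing was pending."""
        return self.pending.pop(account_id, None) is not None

    def due(self, now):
        """Remove and return the accounts whose grace period has expired."""
        ready = sorted(a for a, t in self.pending.items() if t <= now)
        for account_id in ready:
            del self.pending[account_id]
        return ready
```

A nightly job would call `due(datetime.now())` and purge only what it returns, so a support agent who deletes the wrong account has the whole window to call `cancel`.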

It sounds to me like they didn't invest much time in the internal admin
section. They probably delete a lot of accounts every day, and they've never
had an issue with the wrong account being deleted? Or an after-the-fact
clearing up of a misunderstanding? They don't start a dialog, at the very
least only with pro-users, to get the other side of the story?

I hope something else went wrong here as well, and it's not the way they work
normally.

~~~
jasonlotito
> I would suspect this common sense methodology is as obvious as hashing a
> password, but apparently, and unfortunately for the artist, it is not.

A few days ago we had a big thread on Undo in software development, and a lot
of people were suggesting Undo is very very difficult to implement with little
benefit to most users.

Then you have articles like this where people are shocked that undos weren't
in place.

------
jamieb
The guy has his backups. His problem is not the data. His problem is the
human-networking side of things: lots of places linked to his photos
(including the flickr blog) and those links now point to dead space. That will
reflect badly on him, I'm sure, and not flickr.

If I were Yahoo, I wouldn't care how much it would cost; I would have someone, or
someones, recreate his account as it was, even if that meant calling in a data
forensics team.

------
WestCoastJustin
Seems like a funny policy to not back up paying customers' production photos &
metadata. Think of all the photos, comments, followers, etc that just got
nuked! Seems like they rely on RAID or having multiple copies of the data
spread around for disaster recovery. Sure it protects you from a
server/raid/hardware failure but when a user/admin does something dumb it can
really hurt!

Doing a quick google search turns up this gem:

Staff - heather says: "When an account is deleted, photos and any metadata
associated with account is queued and deleted. If not instantaneous, it's
removed within minutes."

[1] <http://www.flickr.com/help/forum/47365/>

~~~
cosmicray
How about ... RAID isn't backup.

RAID is there for system interruption protection. RAID is there to keep flickr
cruising along without interruption.

If they delete content from a volume covered by RAID, then all the mirrored
copies go poof simultaneously.

What you are suggesting is a higher level content management system.
Apparently (from the drift of the comments) that is something flickr doesn't
want to get involved with.

------
ck2
Someday Gmail is going to have a catastrophic loss that's beyond parity bit
recovery.

------
pilif
The real dilemma is that as a service, you can't do it right. If you keep the
data around and just mark it as deleted, the EFF will post articles about you
violating users' rights to privacy and control of their data.

And if you delete the data, then $publication will write articles about you
being unable to restore it.

Take the Facebook "account disable" feature as an example. In the context of
this article here, it was sold as a "good" feature (so as something that
Flickr should have too).

But this exact same feature was all over the press a year ago as something
really bad and just showing how Facebook just doesn't want to let you go.

So. What is it now?

Personally, just flagging for deleted and living with the privacy concerns
might be the better option because there are potentially more people deleting
data by accident (or have it accidentally deleted) than there are people too
concerned about privacy to use the service.

Also, users concerned about privacy just won't be users whereas users who just
lost their data were users and now aren't, so that's a net loss compared to
the users not joining.

------
fedup2
Thing is, this is what happens to you if you complain at Flickr. They don't
like having to actually do work. That's pretty obvious from the fact that
after over five years of losing people's photos like this, they still fail to
back up anything. The founder even lost all his stuff at that cluster-f of a
social network gone wrong. What's funny about those slackers at Yahoo's Flickr
is that they think they can pull off hosting a porn site, while pretending
it's a family friendly place for your kids' photos. That may be fine for the
average user that wacks off at work, but not so good for an advertising
platform, as it turns out. You see, major corporations and small businesses
alike have an issue with their ads being surreptitiously placed onto hardcore
pornographic web pages without being told about that. Yahoo's Flickr can play
this game of hide-the-porn while tricking the general public into trusting
them. But those lies don't really fly in the advertising world on which Yahoo
depends. Every day, several people get deleted from Yahoo's Flickr as that
company desperately attempts to make their porn site not really appear to be
one on the surface. The copyright infringement is the same thing on a smaller
scale. They count on people stealing your content, that's why they tricked you
into placing it all online in an easily accessible catalog of stock images
from trusting idiots. Make any kind of complaint about the way they are doing
anything, and you're booted out mercilessly. That's just the way it goes and
Yahoo doesn't care one bit how you feel, because they obviously do whatever
they want to. They have the government in their pocket and free rein to push
porn into grade schools unlabeled, give your photos away for free to anyone
that wants them without liability, and harbor countless sexual predators,
pedophiles, and registered sex offenders, whom they cloak so they can be right
next to your children without anyone being suspicious. Can't really see
anything worthwhile about that website, or Yahoo in general. It's all lies
from them, and everyone eats it up with a clueless smile.

------
wmboy
I highly doubt the files are irretrievable. It's more likely it'd cost Yahoo a
whole lot of time (or a large cost in getting specialist contractors) and they
would rather just risk facing bad PR than coughing up the money.

Maybe they'll do the latter after some rethinking!

------
gnaffle
Although this is a terrible mistake on Flickr's part, and it shouldn't have
been possible to do this by accident, am I the only one that is happy that
deleting a Flickr account really means _deleting_ it?

Most other services like Facebook, GMail are explicitly designed to keep your
data forever, and if you don't want that, well, you should have thought of
that before using the service. I would actually be happier and feel safer if
we heard more complaints from Facebook users that accidentally deleted their
account and couldn't get it back. It would be a clear sign that their system
was designed with privacy and "data hygiene" in mind.

------
bl4k
This is why I never have a DELETE call anywhere in my app or admin code. I
always mark as deleted (there are performance improvements in this for some
databases as well)

~~~
danudey
I've worked at several places where the strategy is 'never delete, just mark
as deleted'. Your database tends to grow pretty fast, but you never have to
defragment your index or tablespace.

~~~
bl4k
I usually set up purge tasks (copy to same table in another db) every week or
fortnight in low-traffic periods
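
A sketch of such a purge task, assuming SQLite, a hypothetical `photos` table with a `deleted` flag, and a second database attached under the name `archive`. All of these names are invented for the example:

```python
import sqlite3


def make_db():
    """Open a main database and attach a second one to act as the archive."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS archive")
    for schema in ("main", "archive"):
        conn.execute(
            f"CREATE TABLE {schema}.photos ("
            " id INTEGER PRIMARY KEY,"
            " title TEXT NOT NULL,"
            " deleted INTEGER NOT NULL DEFAULT 0)"
        )
    return conn


def purge_deleted(conn):
    """Copy rows marked as deleted into the archive, then drop them from main."""
    with conn:  # one transaction: rolled back if either statement raises
        conn.execute(
            "INSERT INTO archive.photos "
            "SELECT * FROM main.photos WHERE deleted = 1"
        )
        cur = conn.execute("DELETE FROM main.photos WHERE deleted = 1")
    return cur.rowcount
```

The only DELETE in the system lives in this task, and it runs strictly after the copy, so the live table stays small while the purged rows remain recoverable from the archive.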

------
zachallaun
One can only hope that this is not quite over yet.

There have been quite a few well-constructed and thoughtful apologies circulating
on HN lately, one of which was Andrew Mason's apology for Groupon's Japan
blunder. Hopefully, Flickr will pick up the torch and issue a more official,
and more well-thought-out, apology. After all, it's been less than a
day since the incident.

Of course, the deletion in itself is inexcusable. I just hope Flickr steps up.

------
marcusEting
Yahoo seems to be having trouble in general. First there are rumors that they
are shutting down Delicious. Now their photo sharing service is starting to
show real problems.

(Not to mention the Flickr API has a KNOWN ISSUE with the search that has a
fix ETA of several WEEKS!
<http://www.flickr.com/help/forum/en-us/72157625560721827/>)

------
delackner
This whole situation points at a market opportunity for a competing service to
offer users the ability to back up a real DB dump of their entire user account, not
just "everything you uploaded", but actually a file that, provided later to
the service (or a competing service) contains everything needed to recreate
that account.

------
RuadhanMc
4 year subscription to Flickr Pro? That's not even a year for every year of
photos he has lost. I would think that if you deleted 5 years of a user's
photos the least you could do would be to give them a lifetime subscription
to Pro and buy them the best damned camera available so that they can take new
photos.

------
jcromartie
<http://ascii.textfiles.com/archives/1717>

------
duke_sam
This is pretty awful. I have some sympathy for the support staff though; if
their tooling allows them to irreversibly delete someone's account (and its
history) they need better tooling. The realisation that they just deleted
someone's 5 year old Pro account and 4,000 photos... oh shiiiiii

------
catshirt
"This wasn't much compensation considering Flickr Pro costs $24.95 per year,
and Mr. Wilhelm has already received a year's worth of Pro through his
participation in some events and competitions.*"

what else could flickr have offered to compensate?

~~~
kgermino
I would have asked for something closer to a full refund and possibly a
longer, if not lifetime, term of free usage.

It's obviously expensive for Flickr to offer something like this but I mean
they accidentally deleted 5 years of a paying customer's work. I think a refund
is the bare minimum.

~~~
vacri
They have done that bare minimum. Flickr has already provided the guy with
five years of hosting and linking - that doesn't just disappear because it's
in the past. They provided the service that was paid for over those years.

Now they've made a mistake, and have refunded this year's payment and added
future access for free. They have done more than 'the bare minimum'. It might
be good PR to refund previous years' fees, but they were at the time provided
in good faith and in a competent manner; there is no 'fair exchange' reason to
refund them.

------
Bitmobrich
No off-site daily backup? I guess we can assume they have no disaster
recovery plan?

------
jamesjyu
Time for backupify.com?

~~~
there
anyone know of anything like that for downloading the data locally? backing up
data in the cloud to another place in the cloud just seems stupid.

i've seen some flickr-specific backup scripts that just dump all of the images
to a directory, but they don't preserve titles, dates, sets, collections,
tags, etc.

~~~
mechanical_fish
_backing up data in the cloud to another place in the cloud just seems stupid_

Why?

The difference between backing up your Flickr data to an encrypted drive at,
say, Amazon and backing it up to an encrypted drive under your bed is: Amazon
has better physical security, and far better redundancy, because (assuming you
use S3) the "drive" you save it on there is actually a replicated data store
that spans multiple regions and datacenters.

The point of backups is to provide redundant storage that is _relatively
uncorrelated_ to the original. Once you move the data to different disks at a
different company in (if you like) a different country, you've done a lot to
solve that problem.

~~~
there
i don't want to pay someone else to host a copy of my data when i have
perfectly good encrypted backup mechanisms in place. i want offline copies of
it to store, look at, modify, copy, etc. whenever i want, and the fewer
companies that have access to my private data, the better.

also, many of the reasons backupify.com cites for needing backups would also
apply to their own s3-backed service. "hackers", storage failure, human error,
TOS violations, accidental account lapse/termination, etc.

------
leppie
For $25 a year, I would expect them to have some backup policy!

~~~
rm-rf
I wouldn't. Replication yes. A logical backup? no.

------
privacyguru
Would someone really trust Yahoo/Flickr with all their images and not have
backups? Hopefully they didn't really care much about the images, else they
should have copies elsewhere.

~~~
pavel_lishin
"Someone" would, yes - we're not all as smart and paranoid as you.

Anyway, the artist has originals backed up, but the article points out that
all of the links to those images, and embeds, are now broken.

~~~
va1en0k
that's a good example of URL being an important part of the data and not just
a piece of metadata

If URL is constructed from the data somehow (auto-slug from the title, hash or
something), such restoration becomes way simpler
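
A minimal sketch of that idea: derive the URL from an auto-slug of the title plus a hash of the image bytes, so re-uploading the same originals reproduces the same links. The host name, URL layout, and function names are hypothetical and are not Flickr's actual scheme:

```python
import hashlib
import re

BASE = "https://photos.example.com"  # hypothetical host


def slugify(title):
    """Lowercase the title and collapse runs of non-alphanumerics into '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def photo_url(owner, title, image_bytes):
    """Build a URL that depends only on the owner, the title and the image data."""
    digest = hashlib.sha256(image_bytes).hexdigest()[:12]
    return f"{BASE}/{owner}/{slugify(title)}-{digest}"
```

Because nothing in the URL comes from a database-assigned ID, restoring an account from the user's own backups would bring the old inbound links back to life, at the cost of URLs changing whenever the title or the file is edited.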

