
Microsoft's Sidekick/Pink problems blamed on dogfooding and sabotage - Flemlord
http://www.appleinsider.com/articles/09/10/12/microsofts_sidekick_pink_problems_blamed_on_dogfooding_and_sabotage.html
======
mrkurt
I think that's the most twisted definition of "dogfooding" I've ever heard.

Also, this article seems to be about 98% speculation from a site that has no
previous history of "inside sources" at Microsoft. Boo on that.

~~~
jay_kyburz
Yeah, I came in here to say that in my world "dogfooding" means using your own
application during development. To eat your own dog food.

~~~
Timothee
I agree with that definition as well, which is what got me to start reading
the article: I was really curious to know how Microsoft engineers using
Sidekicks could have caused any issue.

------
tlb
I don't think it's fair to rule out the possibility of a simple screw-up. I've
been responsible for important web services and I took a lot of precautions,
but I can't say it was inconceivable that something could have gone wrong
causing major permanent data loss. There are so many ways in which failures
can cascade in a complex system that there can be no absolute guarantees.

Can anyone quantify how much address book data there was? As an outside guess,
1,000,000 customers times 1000 addresses times 300 bytes each is only 300 GB.
They could have kept an emergency backup on rsync.net for $300 / month. I keep
secondary, well-encrypted backups of things I particularly care about (not
nearly 300 GB worth) there, as well as on a USB disk under my bed, in addition
to the regular complete backups.
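
A quick sanity check on that guess (every input below is just the estimate
from above, not real Danger data):

    # Back-of-envelope check; all figures are outside guesses, not real numbers.
    customers = 1_000_000
    addresses_per_customer = 1_000
    bytes_per_address = 300

    total_bytes = customers * addresses_per_customer * bytes_per_address
    print(total_bytes / 10**9)  # -> 300.0 (GB)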

~~~
pyre
> _There are so many ways in which failures can cascade in a complex system
> there can be no absolute guarantees._

What about off-site backup? How many failures can cascade in a way to wipe out
off-site backups?

~~~
tlb
Many backup systems copy the data automatically each night, but if the data is
damaged on the main system and nobody notices right away, it can overwrite the
backup. Or, in the stress of late-night recovery sessions, someone can copy
things in the wrong direction and overwrite the backup. Sometimes the backups
are encrypted with a highly secure key, and the key is lost with the original
failure, or stored within the encrypted backup like keys locked inside a car.

Because SANs allow multiple machines to write to the same disk, they have
Byzantine failure modes where data can be overwritten on disk but cached for
long periods of time by the machines that need it, so there's no visible
problem until after a power failure.

When data is migrated to new database hardware, sometimes the backups haven't
actually been tested. Sometimes during the cutover, someone who didn't fully
understand the architecture backs up the new (empty) machine on top of the old
machine's backups, so there is no backup for a while. All these mistakes can
happen pretty easily when sysadmins are woken up in the night to fix a service
outage while people are yelling at them. I have made some of these mistakes
myself, though so far I've been lucky.

The computer industry has never learned to admit that there is always some
level of risk. You won't hear an oil company exec say "Inconceivable!" when an
iceberg crashes into a North Sea oil rig, or a refinery catches on fire.
Systems fail, and all you can do is try your best to avoid it and move forward
after it does happen.

~~~
pyre
A lot of those possibilities are preventable, though. If you back up through
an automated rsync, why not use versioning (a la rdiff)? Why not periodically
swap out the disks on the backup machine, so that if a disk gets hosed you
have _some_ data backed up, even if it is old data?
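
Something like this is all it takes (a rough sketch of rsync with hardlinked
snapshots in the rdiff spirit; the host, paths, and layout are made up):

    # Sketch: rdiff-style versioned backups via rsync --link-dest.
    # Unchanged files are hardlinked into the previous snapshot, so a
    # corrupted source only taints today's copy, never the history.
    import datetime, pathlib, subprocess

    backup_root = pathlib.Path("/backups/addressbook")  # hypothetical
    snapshot = backup_root / datetime.date.today().isoformat()
    latest = backup_root / "latest"  # symlink to the newest snapshot

    # On the very first run "latest" won't exist yet; rsync just warns
    # and makes a full copy.
    subprocess.run([
        "rsync", "-a", "--delete",
        f"--link-dest={latest}",            # hardlink unchanged files
        "user@primary:/data/addressbook/",  # hypothetical source
        str(snapshot),
    ], check=True)

    # Repoint "latest" at the new snapshot atomically.
    tmp = backup_root / "latest.tmp"
    tmp.unlink(missing_ok=True)
    tmp.symlink_to(snapshot)
    tmp.replace(latest)

Each snapshot looks like a full copy but only costs the delta, so getting
back last week's data is just a directory read.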

Some of the things, like 'locking the keys in the car', can be mitigated by
making sure that you _test_ your backup system.
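
For instance, a cron job that actually restores something and checks it
proves both the backup and the key still work (a minimal sketch, assuming a
gpg-encrypted backup; all paths are hypothetical):

    # Restore test: pull one known file back out of the encrypted backup
    # and verify it against the checksum recorded at backup time. If the
    # decryption key is lost or wrong, this fails loudly long before a
    # disaster does.
    import hashlib, subprocess

    def sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    subprocess.run(
        ["gpg", "--batch", "--yes",
         "--output", "/tmp/restore-test.db",
         "--decrypt", "/backups/addressbook.db.gpg"],  # hypothetical paths
        check=True,
    )

    expected = open("/backups/addressbook.db.sha256").read().split()[0]
    assert sha256("/tmp/restore-test.db") == expected, "restore test failed"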

Obviously you can't prevent everything, but to say that some of these issues
are 'unavoidable risk' is like saying that an oil company exec would say,
"shit happens," to reports that one of their oil tanker captains crashed a
ship while drunk.

The _real_ problem is that backup is usually an afterthought.

------
rit
Good article.

Once upon a time, I wrote a series of open-source iSync plugins for the
Sidekick/hiptop platform, back when Danger first opened up their XML-RPC
service.

Then it came out that in order to run in "production" (i.e. on a phone without
developer provisioning loaded), your app had to have T-Mobile's permission. We
repeatedly got told "no" on the iSync plugin: they had no interest in
supporting the Mac. Then of course a commercial plugin appeared, and I got fed
up and stopped trying, having gotten the clear impression that commercial was
what T-Mobile wanted.

Mostly, however, it was the shock of realizing that you had no access to your
own data as a standard user. You were completely and utterly locked in to that
device, with no alternative.

~~~
ajg1977
Are you kidding? It's a terrible article that consists of nothing but
speculation and conflicting segments that attribute the data loss to
"dogfooding", and/or sabotage, and/or aggressive non-beneficial firmware
updates, and/or incompetence.

Well done AppleInsider, you managed to nail the problem simply by covering
every possible base.

~~~
rit
Fair enough. In hindsight:

- I'm at the point where my brain mostly parses out all the speculation crap,
I come across it so often. There were good tidbits in there covering some of
the contract issues, etc.

- I think the "Good article" was almost a (can't think of the word[s] I'm
looking for, but something you say out of habit/reflex without thinking). It
kind of came out without my actually considering anything other than "I found
interesting things in it."

I apologize; you are in fact correct that it was incredibly speculative.

------
jsz0
Microsoft's insistence on rebuilding with their own technologies must put them
at a big competitive disadvantage. Google can buy up just about any small web
company and have 100% code compatibility from day one.

~~~
blasdel
That's not true in the slightest, unless that "small web company" is using App
Engine.

Microsoft has a hard-on for dev-managed, directly-addressed OS instances running
on x86 machines, so you could at the very least migrate to their extant
hosting infrastructure. Even if you have to rewrite your app in C# + SQL
Server, it's still going to be a direct gloss for most traditional webapps.

Google does no such thing -- everything is massively distributed at every
level, where blocks of infrastructure are managed as ideal services
independently from any application, and addressed at the datacenter level. You
are not going to get to manage your own machines, access a traditional
filesystem, use relational databases, open a direct socket to the client, or
have direct access to internet hosts: _basically anything you take for
granted_.

