

The Cloud's Hidden Lock-in: Latency - aristus
http://blog.archivd.com/1/post/2009/04/the-clouds-hidden-lock-in-latency.html

======
wmf
I would call this locality rather than latency, and it's hardly hidden. In the
pre-cloud world you probably wouldn't build an app that straddles two random
data centers, so don't try to do the same thing in the cloud. There is an
opportunity to make cloud migration easier and faster, though. For example,
incremental migration (rather than bulk copying) would be a great feature.

As for "Amazon charging less to put data in than to take it out", that's a
legitimate consequence of symmetric pipes with asymmetric demand.

I think the first instance of cloud peering was the link between Joyent and
Facebook; hopefully we'll see more of this in the future (especially between
specialized clouds).

~~~
aristus
Good example of peering! I'd missed that.

There's a difference with the pre-cloud world. When you ran your own
datacenter and some new tech came along, you could buy it from many vendors
and make it work (compatibility issues aside).

If all of your operations are in cloud A but only cloud B has some new thing,
you cannot use it until your vendor decides to build their own. The time and
cost to send data around is too high.

Amazon's pricing is justifiable but they are very aware of the implications.

~~~
wmf
_When you ran your own datacenter and some new tech came along, you could buy
it from many vendors and make it work. If all of your operations are in cloud
A but only cloud B has some new thing, you cannot use it_

Ah yes, this bundling is a big pain and will probably get bigger as different
cloud platforms diverge. That's why I prefer as-low-level-as-possible IaaS
where most of the interesting functionality is in software; this will probably
never be as granular and easy to use as PaaS though.

------
ShabbyDoo
I'd consider the issue of latency to be a switching cost vs. lock-in. You
could move all your stuff over to the new, shiny cloud. Of course, if Cloud A
is good for some stuff and Cloud B is good for other stuff, then you're stuck
in the middle (without low-latency peering as some commenters have mentioned).

In general, I don't see why people are scared of Amazon EC2 lock-in. You're
still writing a regular ol' app that runs on Red Hat (or whatever). How does
switching over to your own datacenter require any re-writes (other than config
changes, etc.)?

I understand the S3 issue, but you ought to be abstracting away the S3 details
behind a code layer anyway. Then you could implement some sort of hybrid
scheme (S3 + the new way, abstracted away) easily enough if you must avoid
downtime. Or, if you don't have that much data, just write a script to move it
all over.
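The abstraction-plus-hybrid idea above can be sketched roughly like this. This is a minimal illustration, not anyone's real code: the `BlobStore` interface and the in-memory backend are hypothetical stand-ins (a real backend would wrap S3 or the new provider's API), and the hybrid store migrates data lazily by backfilling on read.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Minimal storage interface the app codes against, instead of S3 directly."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStore(BlobStore):
    """Stand-in backend for illustration; a real one would call a provider's API."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]  # raises KeyError when the key is missing


class HybridStore(BlobStore):
    """Writes go to the new backend; reads fall back to the old one and
    backfill, so data migrates gradually with no downtime."""

    def __init__(self, old: BlobStore, new: BlobStore):
        self.old, self.new = old, new

    def put(self, key, data):
        self.new.put(key, data)

    def get(self, key):
        try:
            return self.new.get(key)
        except KeyError:
            data = self.old.get(key)
            self.new.put(key, data)  # backfill on read
            return data
```

The point is only that the app sees `BlobStore`, so swapping S3 for something else is a backend change, not a rewrite.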

Queuing: If you're relying on some specific feature of SQS, you're doing it
wrong.

Payments: Ok, this gets interesting. But, it's low volume compared to most
computing activity, so you can afford the latency. It's not like your DB is at
Amazon and your app servers are somewhere else.

I'm more sympathetic to Google AppEngine lock-in. However, I don't think it
would be too hard for someone to implement the limited set of interfaces the
cloud provides (at least in the Java case -- I haven't examined the Python
one), so that lock-in is easily broken.

As Clouds become more abstracted away from hardware and the current ways of
doing things, I think lock-in will become a real concern. But, it doesn't
matter right now.

------
mrkurt
I suspect a number of the Cloud providers already peer with each other.
Amazon, Google and Facebook are already peering with some providers, and
possibly each other. Of those, I believe Google has way more peering partners
than anyone.

Mosso's tough because Rackspace does everything over purchased transit; they
don't really peer with anyone.

Simple peering isn't going to fix the latency problems, though. The solution
is latency-tolerant applications, and there are numerous ways to make apps
resilient to latency: datacenter-local write-through caches, the multi-master
replication that's easier with newer datastores (CouchDB, for instance) than
with traditional databases, etc.
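A datacenter-local write-through cache, as mentioned above, is a small pattern: writes go to both the local copy and the remote store, so reads are served locally and only a cold miss pays the cross-datacenter round trip. This is a hedged sketch with a dict standing in for the remote store, not a production implementation (no eviction, no invalidation between datacenters).

```python
class WriteThroughCache:
    """Datacenter-local cache in front of a remote store: every write goes
    to both copies; reads hit the local copy and fall through on a miss."""

    def __init__(self, remote):
        self.remote = remote  # dict-like stand-in for the remote datastore
        self.local = {}       # in-datacenter copy, cheap to read

    def put(self, key, value):
        self.local[key] = value   # local copy keeps subsequent reads fast
        self.remote[key] = value  # synchronous write-through to the remote

    def get(self, key):
        if key in self.local:
            return self.local[key]
        value = self.remote[key]  # miss: pay the cross-DC round trip once
        self.local[key] = value
        return value
```

A real version would also need expiry or invalidation, which is exactly where multi-master replication starts looking attractive.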

The real latency problems happen between your assembled bits and your users.
Creative CDN use can help with this, and I expect that we'll see much smarter
CDNs in the future that offer some of what Akamai does without requiring
payments in live kidneys.

------
jerf
Solution: Locally-hosted clouds.

Few of the advantages of clouds are actually bound to them being over there
instead of over here. If you want to outsource management, go ahead; a cloud
management company can reach into your servers orders of magnitude more easily
than you can move your data out of the cloud. Remote backups to another cloud,
yours or somebody else's, aren't that hard either. (The remote cloud would
still be, well, remote, but it would be better than nothing while you're
waiting for your local replacement servers to arrive.)

The only reasons clouds currently must be hosted remotely are that A: it's
slightly easier, and B: the cloud owners _want_ you to be locked in. A would
be easy to overcome with some software if people wanted it. Everything else
about the entire cloud idea works just as well if the cloud lives in your own
server room or your choice of colo: easy server replacement, task migration,
easy upgrades, whatever. It's all just software, not magic.

But the cloud owners don't want you to realize that.

(Once again, I think RMS is way ahead of the curve, and I think the next five
to ten years will prove he was right.)

~~~
ShabbyDoo
I worked for a large web company where the busiest traffic day of the year was
20x a normal day. And, the busiest hour of that busy day was 100x normal. So,
they had a whole bunch of excess data center capacity hanging around to handle
just a handful of days a year. For them, the value of not owning the cloud
(along with on-demand provisioning) is that they can significantly reduce
hardware usage.

~~~
jerf
Locally-run clouds can rent their capacity back out to somebody else who
doesn't care about these issues, just as easily as they can migrate services
and push data around (again, all depending on software that hasn't been
written, by people who don't want to see it written). Definitely another
advantage to outsourcing your cloud operations.

If you're getting a gut reaction against the idea of people running around on
your hardware, that exact same gut reaction should be applied to your stuff
running around in somebody else's cloud.

~~~
ShabbyDoo
We outsourced the physical data center but owned and operated the servers.
There would have been no issue with other people's stuff running, and the
company had even considered it (for simulations, etc.). Our costs were much
higher than Amazon's, I suspect. While we might have bought a couple hundred
servers at a time, Amazon probably buys thousands. They can buy bandwidth
cheaply and probably even power.

Open standards are nice and all, but worrying about running a datacenter
stinks. It's a huge management overhead on top of the zillions of other
problems. Even signing the contracts to have it managed is a pain.

"Locally-run clouds can rent their capacity back out"

I'll buy capacity from Amazon, but probably not from Joe Blow. Selling
capacity back means more management overhead. There would have to be an
intermediary, and even then it would be a pain.

Amazon's lock-in/switching costs are mild problems compared to the issues of
dealing with one's own datacenter. Everybody I worked with would have much
rather spent their time making the apps better.

------
pj
It's always fun to find ways to make things cheaper, more efficient, more
automated. That includes reducing latency and bandwidth costs.

Look at JavaScript minifiers. YSlow...

~~~
ShabbyDoo
A cloud user can't reduce latency. He can reduce the number of round trips,
but the network latency is basically out of his control.

~~~
pj
Content Delivery Networks reduce latency.

~~~
ShabbyDoo
They do, but they basically cache (although I'm told Akamai is allowing
servlet-ish stuff to run at the edges now). If you want to make back-end RPC
calls between data centers, you're eating at least one round trip.

