
When App Engine Went Wrong - neilk
http://www.agmweb.ca/blog/andy/2286/
======
wkornewald
DeadlineExceededError is a really stupid problem in App Engine and the
simplest fix would be to just kill the instance after such an error and start
a new one. Google doesn't want to do that because it wastes resources. So, the
current solution is to write Python modules which, when they're imported, will
work correctly even if the execution stops in the middle of the module and
then is restarted from the beginning (i.e., when a new request is received).
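
To make the "safe to restart from the beginning" idea concrete, here is a minimal sketch in modern Python, not code from the helper or from Django-nonrel. The names `LazyInit` and `SimulatedDeadline` are hypothetical, and `SimulatedDeadline` only stands in for App Engine's DeadlineExceededError. The point is to build everything in local variables and publish the result in one final step, so an interrupted run leaves nothing half-initialized behind.

```python
import threading


class SimulatedDeadline(Exception):
    """Hypothetical stand-in for a deadline error raised mid-initialization."""


class LazyInit:
    """Runs `builder` on first use and publishes the result only on success."""

    def __init__(self, builder):
        self._builder = builder
        self._lock = threading.Lock()
        self._value = None
        self._ready = False

    def get(self):
        if not self._ready:
            with self._lock:
                if not self._ready:
                    # If the builder raises, nothing has been published yet,
                    # so the next call simply starts again from the beginning.
                    value = self._builder()
                    self._value = value
                    self._ready = True
        return self._value


def build_registry():
    """Hypothetical setup work; all state stays local until it is returned."""
    registry = {}
    registry["models"] = ["User", "Post"]
    registry["ready"] = True
    return registry


registry = LazyInit(build_registry)
```

A request handler would call `registry.get()` instead of relying on module-level side effects. If the first request dies partway through the builder, the second request retries it rather than seeing partial state.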

It's possible that with the upcoming pre-warming requests this bug will
disappear in practice because your whole project gets a chance to import and
initialize all modules before the first request is received. If these
pre-warming requests don't have a 30s timeout, your instance will always have
enough time to finish the initialization in a non-broken state.

Anyway, it's your own mistake if you stay with the Django helper. The bug can
be worked around. It's fixed in Django-nonrel. I still see people who try to
use the helper for mission-critical projects. Don't do that. The helper is
buggy. It uses monkey-patches all over the place. This makes it extremely
vulnerable to bugs caused by DeadlineExceededError. Just use Django-nonrel,
even if you only want to use App Engine's models.

------
dasil003
I don't have anything running on App Engine, in large part because Google is
not a service provider I can trust. There's a really tough dichotomy between
selling their magic scaling beans and the fact that anyone who needs said
scaling beans also needs real support—something which is simply not in
Google's DNA.

~~~
derefr
I'm surprised that Google isn't just selling a raw, white-box cloud-services
layer (like AWS) that other companies can then resell with convenience and
support on top.

~~~
snprbob86
Why are you surprised by this?

AWS grew out of Amazon's internal systems. They productized what they
use every day because they thought it would probably be useful to others.

Google doesn't offer anything like a virtual server internally. Everything is
built on their various abstraction layers. They are productizing what they
use every day because they thought it would probably be useful to others.

------
vosper
I had a very similar experience. Unfortunately, it wasn't feasible for me to
migrate the site to another vendor, so we just had to sit tight. It almost ruined
the launch, which had a hard deadline. Fortunately we too had an understanding
client, but being hamstrung was no fun at all.

The issue of the App Engine status page not reflecting reality is a very real
one - we had contact with a Google engineer who told me that "this is
affecting all our stuff" and I know that it went on for several days, but this
wasn't reflected in the status chart; the severity of the issue was definitely
not conveyed well enough.

I like the promise of App Engine, particularly the "instant" scalability which
is great for sites that you know are going to get hammered as soon as they go
live. But it really highlights the risk you take when you allow yourself to be
locked into such an all-encompassing platform.

------
endlessvoid94
Shameless plug: Djangy.com fixes all of these problems. We're focusing on
customer service + all of the other things that suck about App Engine.

In private beta now, but if you're in need of something better than App Engine
right NOW, email me and we can get you in.

~~~
StavrosK
Can you explain your infrastructure a bit (or is it a trade secret)? Even
though I sorely need a service like that, I don't like black boxes...

That said, I'd love to join the beta.

~~~
endlessvoid94
I'm reluctant to give up too much detail, but our infrastructure is very, very
scalable and secure. It's pretty solid, and allows for very fine-grained
control over many instances of any running application. Load balancing is
taken care of, as well as redundancy.

Once we go public, it's possible we'll be more open about the architecture,
but right now we're still iterating pretty rapidly, trying to make the user
experience dead-simple without sacrificing stability or scalability.

~~~
StavrosK
I see, thank you.

------
SriniK
I suspect the Django + App Engine combination is the problem. Personally, I've
had good luck sticking with Tornado + App Engine. Tornado is pretty raw and at
least keeps open the option of porting the code elsewhere if I have to.
Datastore and DNS are still the weak spots for performance.

------
grandalf
I'm looking forward to the current crop of ongoing errors being resolved. I
strongly prefer App Engine for a variety of reasons (built in versioning, pay
only for what you use, easy deployment).

Right now I'm just happy that some of the features that would have caused me
to put all my eggs in the app engine basket weren't available, because if they
had been I'd have felt a lot more pain from the outages.

------
danenania
Has Google posted anywhere about these issues and what they are doing to resolve
them? I'm working on a site for App Engine and this has me a bit concerned.

~~~
f7u12
There's a good amount of fear mongering going on here. I've been using App
Engine+Python exclusively for side projects since its release and have never run
into these problems.

------
code_duck
It's never been clear to me why people use App Engine for anything other
than learning and playing around.

Why not just get a $40/mo. VPS?

~~~
wkornewald
It's because a little VPS won't give you a fail-safe system. For a SaaS
startup you'll need to have several machines in a redundant cluster and
ideally you'll also have machines in at least two data centers. The people who
use App Engine for their business all have the same dream: Build the app,
click deploy, and have Google handle all the annoying server stuff. Obviously,
App Engine isn't quite there yet. It's not fast enough, it's not stable
enough, and it has a limited feature set. Clearly, we're early adopters.
However, App Engine is going in the right direction and once the problems are
gone people will ask "Why not just use App Engine?" (or whatever other PaaS)
instead of "Why not just use a VPS?".

~~~
pjscott
Because I'm feeling pedantic today: you meant to say "a fault-tolerant
system". A fail-safe system is one which fails in a safe way. An example would
be a CPU that shuts itself down if it's overheating, rather than risk
permanent damage. If anything, a fail-safe system may be _more_ prone to
failure because of its cautious approach to dealing with faults.

There are some pretty cool examples on the Wikipedia page:

<http://en.wikipedia.org/wiki/Fail-safe>
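
For the programmers here, the distinction can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the function names, the 90-degree limit, and the "replicas are plain callables" model are all hypothetical. The fail-safe function stops as soon as it sees a fault, while the fault-tolerant one keeps serving by moving on to the next replica.

```python
class Overheated(Exception):
    """Raised when the fail-safe check decides to shut down."""


def run_fail_safe(temperature, limit=90):
    """Fail-safe: stop working rather than risk permanent damage."""
    if temperature > limit:
        raise Overheated("shutting down at %d degrees" % temperature)
    return "running"


def call_fault_tolerant(replicas, request):
    """Fault-tolerant: keep serving by trying each replica in turn."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            # This replica is down; remember why and try the next one.
            last_error = exc
    raise RuntimeError("all replicas failed") from last_error
```

Note that the fail-safe version fails more often by design, which matches the point above: being cautious about faults is not the same as staying available.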

~~~
wkornewald
OK, thanks. Will try to remember that. :)

