
Moving from Google Appengine to NodeJS on Amazon EC2 - vpj
http://vpj.svbtle.com/moving-from-google-appengine-to-nodejs-on-amazon-ec2
======
biot
Without knowing more details, this reads like "our square peg didn't fit into
App Engine's round hole". Pre-computing massive amounts of data on instance
startup sounds like they don't have the correct mindset. Since every new
instance starts from the same state, it must be true that whatever is being
computed is, in fact, the same for each instance. This is a bunch of work
which should be offloaded to a task once (or on a schedule) and saved
somewhere like blob storage. Then each new instance can just load that
precomputed data without having to duplicate the work that has already been
done. I'm guessing that this lengthy computation is the search index which
they reference elsewhere. They also complain that datastore reads are quite
slow. Presumably if new instances are building a search index, they will be
churning through lots of datastore queries which degrades performance and
raises costs.

Bubblesort is slow on App Engine too, and you'd do much better switching to a
different sort algorithm on EC2 as well. Every platform has a way to shoot
yourself in the foot. Perhaps the solution isn't to move off, but to just _not
do that_. Or, this blog post could be titled "How we didn't do our due
diligence and chose a platform which was unsuitable for our problem space".

~~~
snrip
People make choices and when the choice turns out wrong you could always claim
they did not do their research well. Maybe so, maybe these guys can be blamed
for that too.

That is not of much consequence. The important part is that they evaluate
their choice and are ready to move onto a better platform. Without destroying
the business.

If you check out the blog, you can see they did it for jQuery as well. Good
for them. You could also argue they should have made other decisions from the
start. But hey, they learned and it works for them.

What is more, they tell the world about it and try to prevent others from
making the wrong choices. Even providing pointers to resources that were
useful for them.

What I'm trying to say is that they deserve a bit more credit than an offhand remark
like: "our square peg didn't fit into App Engine's round hole".

~~~
biot
I don't think my remarks were particularly offhand given that their post
indicates a lack of understanding of how App Engine works. Based on that
misunderstanding and poor architectural / implementation choices, they then go
on to say that the data store is slow, App Engine creates too many instances,
requests were timing out, and memcache is too limited. These criticisms of App
Engine might be valid for certain uses, if it were not for the fact that the
way they built the application is likely the very cause of their problems.

For instance, I'm 95% confident that the work they did upon instance startup
was building the search index, based on reading:

    
    
      "We HAD to use the search API for the last few months
       because we couldn’t keep our indexes in memory, because
       of the start up time (discussed earlier)."
    

I'll make some assumptions here because I don't have access to more
information, but their search index is likely being built off of what is
stored in the datastore. First off, calculating this all at instance startup
is inappropriate. What if App Engine switches to a ZeroVM implementation where
every request is effectively a new instance? They would be building an index
just to throw it away right after. Or if instance lifetimes were limited to
five minutes, it's the same problem. What if the amount of data in their
datastore scales up by three orders of magnitude? Were they planning to burn
through hours of instance startup time to build this index?

Not to mention that this index is most likely a deterministic function. In
other words, given input data X, the result is always going to be output data
X'. The fact that they recalculate the results of a deterministic function
upon each instance startup is 100% wasteful. This is the direct cause of their
request timeouts. Additionally, it's also the likely cause of the datastore
slowness. As they state:

    
    
      "Pre-computations at start up kept the new instances 
       busy, and app engine was creating more and more instances
       to handle this."
    

So startup was taking such a long time because it had to (1) pull data from
the datastore and (2) perform lengthy calculations on that data. App Engine
saw the instance as being extremely busy, and therefore it spawned a new instance.
Rinse and repeat. The datastore is going to get hammered with overlapping
requests. No wonder it's slow. They may have also used a sub-optimal schema.
It's been about three or four years since I've worked with App Engine, but I
recall that their default datastore performed much better based on your choice
of entity groups, etc. They should have used a background task for index
building (if absolutely required), which is able to run for hours if needed.
Store the results of the index build into the blob store. Retrieve that as
needed without any recalculation.
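A minimal sketch of that build-once, load-many idea, under loud assumptions: a local JSON file stands in for the blob store, and `build_index`, `publish_index` and `load_index` are hypothetical names, not App Engine APIs. I have not seen their code, so this only illustrates the shape of the fix.

```python
import json
import os
import tempfile
from collections import defaultdict


def build_index(records):
    """Map each lowercase word to the sorted ids of the records containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word].add(record_id)
    return {word: sorted(ids) for word, ids in index.items()}


def publish_index(records, path):
    """Background task: build the index once and store it.

    The file at `path` is a stand-in for blob storage (an assumption).
    Writing to a temp file and renaming keeps readers from seeing a
    half-written index.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(build_index(records), handle)
    os.replace(tmp_path, path)


def load_index(path):
    """Instance startup: read the precomputed index, with no datastore queries."""
    with open(path) as handle:
        return json.load(handle)
```

The background task calls `publish_index` on a schedule or whenever listings change; every new instance only calls `load_index`, so startup cost no longer grows with the number of datastore reads.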

Another thing they criticize is search API performance as being too slow. The
current live application appears to do realtime search as you type. If they
were implementing realtime search on top of App Engine's search API, I can see
how that would be unsuitable as well. The search API is designed to take a
single query and fetch its results. If a search operation takes
500ms, that's not very fast but overall isn't a big deal. If that's 500ms per
keystroke, then that will be massively slow. Searching upon hitting ENTER or
clicking a button would have solved that. Or they could have re-thought their
implementation if realtime is required.
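If realtime is required, one common rethink is debouncing: only send a query once the user has paused typing. This is my own illustration, not something their post describes. The sketch below models it deterministically over timestamped keystrokes; `debounce` and the 300 ms quiet period are hypothetical choices.

```python
def debounce(events, quiet_ms):
    """Return the queries that would actually be sent to the search backend.

    `events` is a list of (timestamp_ms, query_text) keystrokes in time order.
    A keystroke triggers a search only if no later keystroke arrives within
    `quiet_ms` of it; the final keystroke always triggers one.
    """
    sent = []
    for i, (timestamp, text) in enumerate(events):
        is_last = i == len(events) - 1
        if is_last or events[i + 1][0] - timestamp >= quiet_ms:
            sent.append(text)
    return sent


# Five keystrokes with one long pause after "piz".
keystrokes = [(0, "p"), (80, "pi"), (150, "piz"), (900, "pizz"), (960, "pizza")]
queries = debounce(keystrokes, quiet_ms=300)  # ["piz", "pizza"]
```

Five keystrokes become two backend calls, so a 500ms search is paid twice instead of five times.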

App Engine is definitely not suitable for all problems, but it too deserves
more credit than their blog entry gives it.

~~~
vpj
We started using app engine because it was easy to start with. We did some
basic math on how much it'll scale with our architecture (which I too agree
wasn't that good), and thought it would work reasonably until we grow quite
big (at least half a million locations). I was/am new to app engine and was
learning along the way.

The problem was that it started giving trouble before we even reached 10K
records, at no more than 60 requests per second. Even a very low-resource PC
would handle that load without a problem, since each request wouldn't take
even a few milliseconds to compute (probably even without an index, just a
sequential search). And we had to make changes to fix this.
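To show what I mean by a sequential search, here is a rough sketch. The record layout, the `sequential_search` name and the generated sample data are made up for illustration; our real data and code look different.

```python
def sequential_search(records, query):
    """Return the ids of records whose name contains `query`, ignoring case.

    `records` is a list of (record_id, name) pairs scanned front to back,
    with no index of any kind.
    """
    needle = query.lower()
    return [record_id for record_id, name in records if needle in name.lower()]


# 10,000 made-up listings, roughly the size at which we hit trouble.
records = [(i, f"Location {i}") for i in range(10_000)]
matches = sequential_search(records, "location 9999")
```

Each query here is one pass over an in-memory list of 10K short strings.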

It went on and on; every few weeks, users and content would grow and our app
would fail. We didn't want to move away; just like you said, we thought the
solution wasn't to avoid it, but to solve it.

Finally, after changing/improving the design a number of times, we considered
using app engine backends to do central stuff such as maintaining the main
index. At the same time, looking back at what we've been doing so far, it was
quite clear that we were spending our time, which for sure we should have
spent on building something that adds value to users, on learning some
platform and trying to alter our architecture to fit into it. And we were
going deeper and deeper in the hole, and we knew it would be hard to move.

Our vision is simple, and it has nothing to do with picking up some technology
and figuring out how to make use of it. Instead, we try to start from the
customer and work backwards. While on app engine, we once stopped taking new
low paying customers (listings), until we fixed issues - I think this was
terrible.

The decision to move away from app engine wasn't easy. We had to literally
rewrite everything, and the fear of similar problems coming up was there.

Also, I never recommended anyone not to use it, I was just telling our story.
In fact, I still use app engine for some work, and we would switch back to app
engine if we are convinced that it's the way to give a better user experience.

About data store being slow, they charge us per data store read. Saying it
slows down as the number of reads increases sounds, to me, like saying it's
your fault if your calls drop because you are making a lot of calls.

About search API, we didn't use it for auto completion; we maintain a small
dictionary for that.

Just to clarify, we were building the index at start up.

About scaling, we have given some thought to it. But not so much since it's
not something we will require in the near future. We can create multiple
instances and balance the load as long as the index is small enough to fit in
memory.

I'm sure there is a way to get this working on app engine. But we are glad
that we moved, and it runs smoothly. And more importantly, we have been able
to give the users a lot more benefits during the past couple of months than
during a year on app engine, because we had more time to focus on users. And
if we had moved earlier, we would have been able to do more.

~~~
biot
It would be interesting to see how you could have resolved the issues. I think
eliminating the search index generation on instance startup would have solved
almost all the issues you were experiencing. For comparison, Khan Academy
serves up 6 million active users a month:

http://highscalability.com/blog/2013/4/1/khan-academy-checkbook-scaling-to-6-million-users-a-month-on.html

That's likely many orders of magnitude more than the traffic levels you're
experiencing. Clearly the platform works, but you need to architect your
application to work with it rather than trying to shoehorn App Engine to work
with your architecture.

If you don't have the ability to do that (due to time pressures or other
factors), then you made the right call to move to a platform you are familiar
with.

------
davidjgraph
I'd think that using back-end instance(s) to off-load the computationally
intensive work on GAE is pretty much the same thing as switching to EC2.

I'm guessing that Google Compute Engine wasn't generally available at the time
of the decision, but you'd be able to solve most of the issues described with
a well thought out front/back-end GAE/GCE architecture now that GCE is
formally launched.

------
GilbertErik
I'm glad the title of the article isn't some sensationalist junk like, 'App
Engine should be shot in the face,' or 'Why I left the barren wasteland of
useless App Engine and moved onto the greener pastures of my homerolled
Accumulo/Scalatra stack on Rackspace'.

I think enough people who're reading this have had a good to fair experience
with GAE and, like me, are wondering what hiccups other people are finding
with apps on app engine. Subsequently, I also think that we like feeling smart
because we've read enough docs and articles to be able to identify ways that
their app's design may not mesh with the App Engine way of doing things.

The thing I need to keep reminding myself is that there are many right ways to
do it. If vpj and his cohorts are spending less time trying to figure out App
Engine and more time on their app, then more power to them. At some point you
have to just figure out when the effort expended outweighs the potential
benefits and cut your losses, and I think this post is written in a clear and
fair way.

The thing that makes me upset though is that there isn't a central place that
clearly explains the pitfalls that he fell into, what the best way to do it
on App Engine is, and why. Those docs are kind of a mess, amirite?

------
markdown
Sounds like they had no problem moving, which proves _yet again_ that GAE
doesn't lock you in.

~~~
coldtea
Actually it proves nothing of the sort, since in the process of moving they
also re-implemented their app in Node.js.

------
penland
This article confuses me . . . a better title would have been, "Why we changed
languages, went from PaaS to IaaS, and changed our datastore - or why we
changed everything". I find it interesting that the author didn't go from
AppEngine to Heroku, which has robust NodeJS support, Memcache and Mongo
support as well.

I applaud the author for making changes to support their business more
effectively, I'm just not sure what I'm supposed to take away from this other
than someone successfully changed some stuff.

------
tbarbugli
imo it does not look like you needed to change infrastructure, but that you
needed to build a better application and tackle problems with a better approach.

