Moving from Google Appengine to NodeJS on Amazon EC2 (vpj.svbtle.com)
40 points by vpj on Dec 12, 2013 | 13 comments

Without knowing more details, this reads like "our square peg didn't fit into App Engine's round hole". Pre-computing massive amounts of data on instance startup sounds like they don't have the correct mindset. Since every new instance starts from the same state, it must be true that whatever is being computed is, in fact, the same for each instance. This is a bunch of work which should be offloaded to a task once (or on a schedule) and saved somewhere like blob storage. Then each new instance can just load that precomputed data without having to duplicate the work that has already been done. I'm guessing that this lengthy computation is the search index which they reference elsewhere. They also complain that datastore reads are quite slow. Presumably if new instances are building a search index, they will be churning through lots of datastore queries which degrades performance and raises costs.
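To make the "compute once, load everywhere" idea concrete, here is a minimal sketch. It assumes the index is a simple word-to-ids map and uses a local JSON file as a stand-in for blob storage; `build_index`, `run_build_task`, `load_index` and the sample records are all hypothetical names and data, not anything from the original post.

```python
import json
import os
import tempfile


def build_index(records):
    """Expensive step: map each lowercase word to the ids of records containing it."""
    index = {}
    for record_id, text in records.items():
        for word in set(text.lower().split()):
            index.setdefault(word, []).append(record_id)
    # Sort ids so identical input always yields identical output.
    return {word: sorted(ids) for word, ids in index.items()}


def run_build_task(records, path):
    """Run once (or on a schedule), outside the request path, and persist the result."""
    index = build_index(records)
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(index, f)
    os.replace(tmp_path, path)  # atomic swap so readers never see a partial file


def load_index(path):
    """What a new instance does on startup: read the stored result, not recompute it."""
    with open(path) as f:
        return json.load(f)


# Made-up sample data; a local temp file stands in for blob storage.
records = {"1": "coffee shop downtown", "2": "book shop uptown"}
index_path = os.path.join(tempfile.mkdtemp(), "index.json")
run_build_task(records, index_path)
index = load_index(index_path)
print(index["shop"])
```

The point of the split is that only `run_build_task` touches the source data; every instance startup is a single read of the stored result.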

Bubblesort is slow on App Engine too, and you'd do much better switching to a different sort algorithm on EC2 as well. Every platform has a way to shoot yourself in the foot. Perhaps the solution isn't to move off, but to just not do that. Or, this blog post could be titled "How we didn't do our due diligence and chose a platform which was unsuitable for our problem space".

People make choices and when the choice turns out wrong you could always claim they did not do their research well. Maybe so, maybe these guys can be blamed for that too.

That is not of much consequence. The important part is that they evaluated their choice and were ready to move on to a better platform without destroying the business.

If you check out the blog, you can see they did it for jQuery as well. Good for them. You could also argue they should have made other decisions from the start. But hey, they learned and it works for them.

What is more, they tell the world about it and try to prevent others from making the wrong choices. Even providing pointers to resources that were useful for them.

What I'm trying to say is that they deserve a bit more credit than an offhand remark like: "our square peg didn't fit into App Engine's round hole".

The App Engine platform would definitely benefit from more written material about how to implement advanced scenarios.

The comment about bulk uploads/downloads is definitely a problem where GAE needs more work.

On the other hand, GAE forces you to write software that doesn't depend on long-running instances, because real-world servers do die. The limits on request sizes are also important because they help guarantee upper bounds on latencies.

Sometimes it's appealing to just roll your own solution, and you might even get better results when everything works smoothly, until you get stuck.

Of course, not everyone needs a system that is resilient to machine, power, network, and datacenter outages.

Just consider that once you need it, and you design your own solution to accomplish it, you might end up enforcing the same constraints on your application and probably won't get it right for the first couple of iterations.

I don't think my remarks were particularly offhand given that their post indicates a lack of understanding of how App Engine works. Based on that misunderstanding and poor architectural / implementation choices, they then go on to say that the data store is slow, App Engine creates too many instances, requests were timing out, and memcache is too limited. These criticisms of App Engine might be valid for certain uses, if it were not for the fact that the way they built the application is likely the very cause of their problems.

For instance, I'm 95% confident that the work they did upon instance startup was to build the search index, based on reading:

  "We HAD to use the search API for the last few months
   because we couldn’t keep our indexes in memory, because
   of the start up time (discussed earlier)."
I'll make some assumptions here because I don't have access to more information, but their search index is likely being built off of what is stored in the datastore. First off, calculating this all at instance startup is inappropriate. What if App Engine switches to a ZeroVM implementation where every request is effectively a new instance? They would be building an index just to throw it away right after. Or if instance lifetimes were limited to five minutes, it's the same problem. What if the amount of data in their datastore scales up by three orders of magnitude? Were they planning to burn through hours of instance startup time to build this index?

Not to mention that this index is most likely a deterministic function. In other words, given input data X, the result is always going to be output data X'. The fact that they recalculate the results of a deterministic function upon each instance startup is 100% wasteful. This is the direct cause of their request timeouts. Additionally, it's also the likely cause of the datastore slowness. As they state:

  "Pre-computations at start up kept the new instances 
   busy, and app engine was creating more and more instances
   to handle this."
So startup was taking such a long time because it had to (1) pull data from the datastore and (2) perform lengthy calculations on that data. App Engine saw the instance as being extremely busy, so it spawned a new instance. Rinse and repeat. The datastore is going to get hammered with overlapping requests. No wonder it's slow. They may have also used a sub-optimal schema. It's been about three or four years since I've worked with App Engine, but I recall that the default datastore performed much better or worse depending on your choice of entity groups, etc. They should have used a background task for index building (if absolutely required), which is able to run for hours if needed. Store the results of the index build in the blob store. Retrieve that as needed without any recalculation.
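A back-of-the-envelope sketch of why the rebuild-on-startup pattern hammers the datastore. The numbers are assumptions for illustration only: 10,000 records (roughly the scale mentioned elsewhere in this thread) and 20 instance startups; the function names are hypothetical.

```python
def startup_reads(instances_started, records):
    """Datastore reads if every new instance rebuilds the index itself."""
    return instances_started * records


def shared_build_reads(builds, records):
    """Datastore reads if a background task builds the index and instances load the result."""
    return builds * records


records = 10_000        # assumed record count
instances_started = 20  # assumed number of instance startups

per_instance = startup_reads(instances_started, records)
shared = shared_build_reads(1, records)
print(per_instance, shared, per_instance // shared)  # 200000 10000 20
```

Under these assumptions the read volume grows linearly with the number of instances spawned, which is exactly what the scheduler keeps increasing while startups look busy.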

Another thing they criticize is search API performance as being too slow. The current live application appears to do realtime search as you type. If they were implementing realtime search on top of App Engine's search API, I can see how that would be unsuitable as well. The search API is designed to take a single query and fetch its results. If a search operation takes 500ms, that's not very fast but overall isn't a big deal. If that's 500ms per keystroke, then that will be massively slow. Searching upon hitting ENTER or clicking a button would have solved that. Or they could have re-thought their implementation if realtime is required.
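If realtime search were required, one common way to avoid paying the query cost on every keystroke is to debounce: only issue the search once typing has paused. A minimal sketch, with the clock passed in explicitly so it is easy to follow; the `Debouncer` class and the 300ms quiet period are assumptions for illustration, not anything the app is known to use.

```python
class Debouncer:
    """Collapses a burst of keystrokes into one search once typing pauses."""

    def __init__(self, quiet_ms):
        self.quiet_ms = quiet_ms
        self.pending = None  # (latest text, time of last keystroke in ms)

    def keystroke(self, text, now_ms):
        # Each keystroke replaces the pending query and restarts the quiet period.
        self.pending = (text, now_ms)

    def poll(self, now_ms):
        """Return the query to run if the user has paused long enough, else None."""
        if self.pending is None:
            return None
        text, last_ms = self.pending
        if now_ms - last_ms >= self.quiet_ms:
            self.pending = None
            return text
        return None


debouncer = Debouncer(quiet_ms=300)
debouncer.keystroke("c", 0)
debouncer.keystroke("co", 100)
debouncer.keystroke("cof", 200)
too_early = debouncer.poll(250)  # None: only 50ms since the last keystroke
query = debouncer.poll(500)      # "cof": 300ms of quiet, so one search is issued
print(too_early, query)
```

Three keystrokes result in a single backend query instead of three.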

App Engine is definitely not suitable for all problems, but it too deserves more credit than their blog entry gives it.

We started using app engine because it was easy to start with. We did some basic math on how much it'll scale with our architecture (which I too agree wasn't that good), and thought it would work reasonably until we grow quite big (at least half a million locations). I was/am new to app engine and was learning along the way.

The problem was when it started giving trouble even before 10K records, and not even more than 60 requests per second, which even a very low resource PC would handle without a problem since it won't even take a few milliseconds to compute (probably even without an index, just a sequential search). And we had to make changes to fix this.
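For a sense of what "just a sequential search" over about 10K records looks like, here is a minimal sketch. The record shape (id, text) and the generated sample data are made up for illustration; no timing figure is claimed here since that depends on the machine.

```python
def sequential_search(records, term):
    """Scan every record and return the ids whose text contains the term."""
    term = term.lower()
    return [record_id for record_id, text in records if term in text.lower()]


# 10,000 made-up records; every 100th one mentions "shop".
records = [
    (i, f"listing {i} shop" if i % 100 == 0 else f"listing {i}")
    for i in range(10_000)
]

matches = sequential_search(records, "shop")
print(len(matches), matches[:3])  # 100 [0, 100, 200]
```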

It went on and on; every few weeks, users and content would grow and our app would fail. We didn't want to move away; just like you said, we thought the solution wasn't to avoid it, but to solve it.

Finally, after changing/improving the design a number of times, we considered using app engine backends to do central stuff such as maintaining the main index. At the same time, looking back at what we've been doing so far, it was quite clear that we were spending our time, which for sure we should have spent on building something that adds value to users, on learning some platform and trying to alter our architecture to fit into it. And we were going deeper and deeper in the hole, and we knew it would be hard to move.

Our vision is simple, and it has nothing to do with picking up some technology and figuring out how to make use of it. Instead, we try to start from the customer and work backwards. While on app engine, we once stopped taking new low-paying customers (listings) until we fixed issues. I think this was terrible.

The decision to move away from app engine wasn't easy. We had to literally rewrite everything, and the fear of similar problems coming up was there.

Also, I never recommended anyone not to use it, I was just telling our story. In fact, I still use app engine for some work, and we would switch back to app engine if we are convinced that it's the way to give a better user experience.

About the datastore being slow: they charge us per datastore read. Saying it slows down as the number of reads increases sounds, to me, like saying it's your fault if your calls drop because you are making a lot of calls.

About search API, we didn't use it for auto completion; we maintain a small dictionary for that.

Just to clarify, we were building the index at startup.

About scaling, we have given some thought to it. But not so much since it's not something we will require in the near future. We can create multiple instances and balance the load as long as the index is small enough to fit in memory.

I'm sure there is a way to get this working on app engine. But we are glad that we moved, and it runs smoothly. More importantly, we have been able to give the users a lot more benefits during the past couple of months than during a year on app engine, because we had more time to focus on users. And if we had moved earlier, we would have been able to do more.

It would be interesting to see how you could have resolved the issues. I think eliminating the search index generation on instance startup would have solved almost all the issues you were experiencing. For comparison, Khan Academy serves up 6 million active users a month:


That's likely many orders of magnitude more than the traffic levels you're experiencing. Clearly the platform works, but you need to architect your application to work with it rather than trying to shoehorn App Engine to work with your architecture.

If you don't have the ability to do that (due to time pressures or other factors), then you made the right call to move to a platform you are familiar with.

While you're probably right about that, I always hear this about AppEngine - like there's some silver bullet of coding discipline that will make it work. I remember when AppEngine did their big revision of their pricing structure - I think it was a "coming out of beta" thing and all these apps were getting stomped with cost problems... the recommendation from Google was to use concurrency constructs that didn't exist in Python, one of App Engine's blessed languages.

This occurred at a time when they had Guido van freaking Rossum on payroll.

I've never worked with App Engine, but the constant discussion of working around its problems reminds me a lot of MongoDB.

I'd think that using back-end instance(s) to off-load the computational intensive work on GAE is pretty much the same thing as switching to EC2.

I'm guessing that Google Compute Engine wasn't generally available at the time of the decision, but you'd be able to solve most of the issues described with a well thought out front/back-end GAE/GCE architecture since GCE is formally launched now.

I'm glad the title of the article isn't some sensationalist junk like, 'App Engine should be shot in the face,' or 'Why I left the barren wasteland of useless App Engine and moved onto the greener pastures of my homerolled Accumulo/Scalatra stack on Rackspace'.

I think enough people who're reading this have had a good to fair experience with GAE and, like me, are wondering what hiccups other people are finding with apps on app engine. Subsequently, I also think that we like feeling smart because we've read enough docs and articles to be able to identify ways that their app's design may not mesh with the App Engine way of doing things.

The thing I need to keep reminding myself is that there are many right ways to do it. If vpj and his cohorts are spending less time trying to figure out App Engine and more time on their app, then more power to them. At some point you have to just figure out when the effort expended outweighs the potential benefits and cut your losses, and I think this post is written in a clear and fair way.

The thing that makes me upset, though, is that there isn't a central place that clearly explains the pitfalls he fell into, what the best way to do it on App Engine is, and why. Those docs are kind of a mess, amirite?

Sounds like they had no problem moving, which proves yet again that GAE doesn't lock you in.

Actually it proves nothing of the sort, since in the process of moving they also re-implemented their app in Node.js.

This article confuses me . . . a better title would have been, "Why we changed languages, went from PaaS to IaaS, and changed our datastore - or why we changed everything". I find it interesting that the author didn't go from AppEngine to Heroku, which has robust NodeJS support, Memcache and Mongo support as well.

I applaud the author for making changes to support their business more effectively, I'm just not sure what I'm supposed to take away from this other than someone successfully changed some stuff.

IMO it does not look like you needed to change infrastructure, but rather that you needed to build a better application and tackle problems with a better approach.
