Switched away from App Engine, couldn't be happier (war-worlds.com)
133 points by codeka on June 3, 2013 | 64 comments

I'm the tech lead for GCE. I'm sorry to hear that App Engine didn't work out for the poster. Perhaps someone from that team may have some suggestions. In addition, I'm happy that he was able to stay with the Google Cloud Platform.

With that being said, I'd really like to encourage the OP to store his database on (and boot his instance from) persistent disk. Running any database on scratch disk (without replication) is probably not a good idea. Even with hourly backups (make sure you are testing restores!) you still stand to lose up to an hour of data and face the pain of doing the restore if your instance should fail.

In addition, when using PD for all block storage you can start with a smaller instance. If you need more horsepower you can terminate that instance and boot a larger one from the same disk with a minimal amount of downtime.

Can you give your colleagues on the App Engine team the number of the Dell sales rep you buy your servers from, because a quick comparison of App Engine vs. Compute Engine prices shows that App Engine is at best 10x more expensive per unit of RAM:

  | Instance Type              | MB-mem | $/hour | $/month | $/hour/GB |
  | GAE F1/B1                  |    128 |  0.080 |   57.60 |      0.64 |
  | GAE F1/B1       (reserved) |    128 |  0.050 |   36.00 |      0.40 |
  | GAE F2/B2                  |    256 |  0.160 |  115.20 |      0.64 |
  | GAE F2/B2       (reserved) |    256 |  0.100 |   72.00 |      0.40 |
  | GAE F4/B4                  |    512 |  0.320 |  230.40 |      0.64 |
  | GAE F4/B4       (reserved) |    512 |  0.200 |  144.00 |      0.40 |
  | GAE F4_1G/B4_1G            |   1024 |  0.480 |  345.60 |      0.48 |
  | GAE F4_1G/B4_1G (reserved) |   1024 |  0.300 |  216.00 |      0.30 |
  | GAE B8                     |   1024 |  0.640 |  460.80 |      0.64 |
  | GAE B8          (reserved) |   1024 |  0.400 |  288.00 |      0.40 |
  | GCE f1-micro               |    629 |  0.019 |   13.68 |      0.03 |
  | GCE g1-small               |   1783 |  0.054 |   38.88 |      0.03 |
  | GCE n1-standard-1          |   3932 |  0.115 |   82.80 |      0.03 |
  | GCE n1-standard-2          |   7864 |  0.253 |  182.16 |      0.03 |
  #+TBLFM: @I$5..@>$5=($3/$2)*1024; %.2f::@I$4..@>$4=$3*24*30; %.2f
No matter how good or bad the OP's GAE code was, that's a heck of a handicap to overcome.
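The last two columns of the table follow directly from the hourly price and memory size, as the `#+TBLFM` line states. A minimal Python sketch of the same arithmetic, using three rows copied from the table; the function names are illustrative, and a 30-day month is assumed as in the formula:

```python
# Recompute the derived columns of the pricing table above.
# Rows are (instance type, memory in MB, $/hour), copied from the table.
ROWS = [
    ("GAE F1/B1", 128, 0.080),
    ("GAE F4_1G/B4_1G (reserved)", 1024, 0.300),
    ("GCE n1-standard-1", 3932, 0.115),
]


def dollars_per_month(per_hour):
    """$/month assuming a 30-day month, as in the table formula."""
    return per_hour * 24 * 30


def dollars_per_hour_per_gb(per_hour, mem_mb):
    """$/hour normalized to 1 GB (1024 MB) of RAM."""
    return per_hour / mem_mb * 1024


def price_table(rows):
    """Return (name, $/month, $/hour/GB) with both values rounded to cents."""
    return [
        (name,
         round(dollars_per_month(rate), 2),
         round(dollars_per_hour_per_gb(rate, mem), 2))
        for name, mem, rate in rows
    ]


if __name__ == "__main__":
    for name, monthly, per_gb in price_table(ROWS):
        print(f"{name:28s} {monthly:8.2f} {per_gb:6.2f}")
```

The cheapest App Engine row (reserved F4_1G/B4_1G at 0.30 per GB-hour) against 0.03 for the Compute Engine rows is where the "at best 10x" figure comes from.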

I was hoping that eventually App Engine would be implemented on top of GCE, with all the services App Engine now provides available for GCE, and would move to a container-based sandbox, like OpenShift/Heroku/Elastic Beanstalk, instead of an application-based sandbox, especially now that App Engine has switched to instance-based pricing.

For example there’s this old thing: https://docs.google.com/spreadsheet/viewform?formkey=dDRKNzd... (Google App Engine VM Runtime TT Sign‐up)[1]

Unfortunately, since all of this effort went into securing yet another runtime (PHP), I don’t think it will happen.

It would be great if it were at least possible to use Appscale over GCE, or just the Datastore – which is by far the most attractive feature of App Engine, other than Task Queue – with a custom deployment solution.

Google Cloud Datastore could deliver just that if you don’t mind paying twice:

    You should be aware that Cloud Datastore has a serving component 
    that runs on Google App Engine, so there will be instance hour costs.

[1] “If you are interested in using VMs with your App Engine applications in the future, let us know by signing up here.” http://googleappengine.blogspot.com/2012/06/google-compute-e...

I imagine Compute Engine prices will go up in the same way App Engine prices did.

That's the famous App Engine Golden RAM.

> Finally, App Engine has a hard 30 second limit on frontend requests. For the most part, this was fine. But certain requests started to take longer than 30 seconds, particularly when empires started getting larger

This is a very sensible limit. As your game is growing, more and more of these requests will pile up and will take resources away from quicker requests, making the game slower for everybody and crashing your architecture.

Adding limits like this is one of the very basic things you do when you need to scale. Changing platforms in order to get rid of limits like this will only mean that you'll have to re-add them at some point.

The correct fix for not running into such limits is to fix the code, not remove the limits.

Agreed. You don't want frontend requests tied up for even a fraction of that. If you have long-running tasks, this is what task queues are for.

Or run partial computations throughout the day when possible.

Sounds like they're prohibiting worst practices. A fine idea.

If your requests need more than 30s, something is wrong. Most apps strive for <200ms responses. If you need a long time, rearchitect it. Run a background thread. Poll for completion or email the user when complete. Again, annoying, but your app will be better for it.

Saying that a legacy product sucks, but that you still need it to run is not an effective mindset. These platforms have constraints. Work within them and the app will scale and perform very well.

That seems like a rather arrogant attitude. You cannot force (what you consider) good practices on users/customers and expect them to be happy about it.

Additionally, the 30 second limit was increased to 60 seconds in October 2011. [1]

[1] http://googleappengine.blogspot.com/2011/10/app-engince-155-...

That is a very narrow-minded attitude. Not all apps have to scale and not all parts of an app have to scale. It may not make sense or be worthwhile to spend the effort to make a long running request shorter if it's used infrequently or by a small number of users or if the system as a whole does not require a high level of concurrency. Not every company is Google.

No. The developers are clever enough to decide for themselves whether the code needs to be fixed. There may be some situations where it actually makes sense to have requests that take 30+ seconds.

No, when utilizing a REST protocol it is not desirable to have long requests. The correct way to do something that could potentially take a long time is to queue it for background processing and then either have the client poll the server until it is complete or push the result to the client when it is done.

Having an open connection for 30 seconds + is bad practice, and App Engine is a shared resource. Arbitrary limits are sometimes bad but in this case it will deter poorly designed APIs from starting in the first place.
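The queue-then-poll pattern can be sketched in a few lines of Python. This is only an in-process illustration under stated assumptions: the job registry is a plain dict, the worker is a thread rather than a real task queue, and the names `submit`, `status`, and `poll_until_done` are hypothetical, not part of any App Engine or GCE API.

```python
import threading
import time
import uuid

# In-memory job registry; a real service would use a task queue and a datastore.
_jobs = {}
_lock = threading.Lock()


def submit(work, *args):
    """Start `work` in the background and return a job id immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "PENDING", "result": None}

    def run():
        try:
            update = {"status": "DONE", "result": work(*args)}
        except Exception as exc:  # report failures instead of losing them
            update = {"status": "FAILED", "result": repr(exc)}
        with _lock:
            _jobs[job_id] = update

    threading.Thread(target=run, daemon=True).start()
    return job_id


def status(job_id):
    """What a cheap status endpoint for the job would return."""
    with _lock:
        return dict(_jobs[job_id])


def poll_until_done(job_id, interval=0.05, timeout=5.0):
    """Client side: many short requests instead of one long one."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = status(job_id)
        if state["status"] != "PENDING":
            return state
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still pending after {timeout}s")


def slow_sum(n):
    time.sleep(0.2)  # stand-in for a long computation
    return sum(range(n))


if __name__ == "__main__":
    job = submit(slow_sum, 1000)
    print(poll_until_done(job))
```

Each individual request returns quickly, so a dropped connection costs one poll rather than the whole computation.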

You can see how we implemented this in the GCE API with our operations resource.

We had earlier versions that required long HTTP calls and it was generally less reliable if something should happen during the call.

But the developer is in the best position to decide whether he should fix the code. Perhaps he's just testing something that will be rewritten later. 30+ second requests are not always bad.

It has a cost (it takes time to fix it) and value (users won't have to wait). The developer has more information than Google to decide whether cost or value is higher.

Out of curiosity, can you describe a situation where 30+ second requests on a large-scale, high-traffic system are acceptable (in the sense that it is better to hold the connection waiting rather than erroring out)?

EDIT: long-polling aside, and even then, 30 seconds is a perfectly acceptable limit.

Streaming music and video? Realtime updates?

While I agree that a 30 second limit is to be expected in order to conserve resources, saying long requests aren't needed shows a lack of imagination.

Realtime updates (if you're not using Websockets) are usually done with long-polling, for which 30 seconds is perfectly acceptable. Socket.io, for example, uses a default of 20 seconds [1].

Streaming music and certainly video are generally delivered as chunks, with a few chunks per request that don't require long connections [2].

[1] https://github.com/LearnBoost/Socket.IO/wiki/Configuring-Soc...

[2] http://en.wikipedia.org/wiki/HTTP_Live_Streaming

I don't have experience with large-scale systems, so no. 30+ secs can be useful for smaller projects though.

I have a game running on GAE (www.runesketch.com); our initial design was really inefficient. Using Khan Academy's mini profiler (https://github.com/kamens/gae_mini_profiler) we slashed our gets, puts and general overheads by two orders of magnitude. For our main cost, running a live game, we managed to completely remove gets and puts and run in memcache. We hardly scratch our daily usage now.

One overhead I did not realise took so long was using channels. We found them about 3x slower than a get or put. By parallelising those with another thread we reduced our latency hugely. Our game went from taking 8 seconds to service a move to 300ms.
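As a rough illustration of moving slow, I/O-bound sends onto other threads, here is a standard-library sketch. `send_update` is a hypothetical stand-in for a channel send, not the Channel API itself, and the worker count is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def send_update(client_id, message):
    """Hypothetical stand-in for a slow, I/O-bound channel send."""
    time.sleep(0.1)
    return (client_id, len(message))


def notify_all(client_ids, message, max_workers=8):
    """Send to every client concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda cid: send_update(cid, message), client_ids))


if __name__ == "__main__":
    start = time.monotonic()
    notify_all(["alice", "bob", "carol", "dave"], "your move")
    print(f"notified 4 clients in {time.monotonic() - start:.2f}s")
```

Because the sends spend their time waiting rather than computing, running them side by side costs roughly the time of the slowest one instead of the sum.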

Just my experience. Did you work hard with a profiler before moving all the code?

This is basically a fundamental observation that applies to nearly all service-oriented-computing: unless you've collected the data of how your system operates on a platform, you have no basis to complain about the platform, except for subjective issues. And subjective issues are boring.

> Our main cost, running a live game, we managed to completely remove gets and puts and run in memcache. We hardly scratch our daily usage now.

Using memcache to store gameplay data is not the best idea. Memcache entries can be evicted at any time without warning. Using a backend instance and storing the data in RAM is an order of magnitude faster, and thanks to the runtime environment API the service can shut down gracefully.

I agree losing all games in progress is a pain, but our card game is meant to have a time investment on the order of 5 minutes, so it's not a big deal that users occasionally have their games wiped. Annoying 2% of our users is tolerable at the moment.

I totally bought into GAE when it was first released - evangelized it, wrote and adapted a bunch of libraries for it and ended up implementing a dozen or so projects on it.

The price hike completely killed me. I optimized what I could but there wasn't much to gain. Because of the way GAE is architected it forces you into efficient design.

I switched to AWS, and now with a lot more traffic across all the apps I am still paying ~20% of what my GAE bill was. I invested a lot of time into the platform, only some of which can be applied to other platforms, and fell for the bait and switch.

I really don't like what Larry has done with squeezing profit out of each business unit. The pricing just doesn't make sense: for example, implementing a simple spam filter for blog and forum comments using the Prediction API cost $30 per month alone for a site with thousands of visitors.

Completely killed a product that could be the center of making Google the best cloud platform for developers.

Hm. $230 a month can pay for a lot of dedicated servers. I am sure that if you only need one instance for all of this, it is far cheaper to just get a dedicated server at a hosting provider with decent bandwidth and reliability and run it from there. It should come in at around $50 max. With that you could even do a distributed setup with two or three boxen at different geos.

$230 seems like way too much. There are much cheaper cloud services: https://www.digitalocean.com/pricing

That's the first thing I thought as well. Do the features of GCE really justify such a large difference in costs?

Any pointers on where you can get a reliable dedicated server for $50 a month? Rackspace (if I'm reading it right at https://www.rackspace.com/managed_hosting/configurations/) starts at $789 a month?!

Webhosting talk is a great place to look for info on hosting.

Personally I'm happy with Hetzner:

But I think you could find cheaper (but good) alternatives if you want hosting in the US.

I used Honelive for a while and I think you should be reasonably happy with them at that price range.

However, I eventually needed an upgraded server, most importantly the ability to do an out-of-band remote reboot. I switched to Hivelocity for that.

To get the best deals on hosting, look carefully at the hosting company's website, search for announcements of sales on Webhostingtalk, and check their Twitter and other social media.

For mid-range dedicated hosting, I've been using OVH and have generally been happy with them.

It's important to note that their North American data centers are actually located in Canada although I do believe they are planning on eventually opening a data center in the US.

Webhosting talk makes my brain hurt with so much noise vs signal to sort through :-)

But I did stumble across http://serverbear.com/ which seems to have some useful information, though many of the plans they show aren't actually available (out of stock) at the providers listed...

As the author concedes, it's difficult to draw any conclusions when he rewrote the whole thing in Java rather than Python.

I am a bit surprised that 300 users cost him $10 a day though. There must surely be some giant inefficiencies in the code for that to happen.

I would be interested to know if he started using the traffic splitting functionality. That feature is a sure way to multiply the number of instance hours an app engine app is using. It doesn't seem right that the cost would just increase on its own the way the author described.

For the datastore ops, it may be insufficient use of memcache.

For the front-end instances, I have no idea what it could be. I suspect the numbers are high because of the "nice" global distribution of users.

If some requests are taking 30s to service, front ends will be spinning up very often, because during those 30s other requests will be queueing up.

This is doubly true for an app that hasn't enabled python27 and multithreading.
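A back-of-the-envelope way to see this is Little's law: the average number of requests in flight equals the arrival rate times the time each request takes, and every in-flight request needs a slot on some instance. The sketch below uses hypothetical traffic numbers (not the OP's), and the concurrency of 8 per instance is an assumption for illustration; a real scheduler also adds headroom.

```python
import math


def instances_needed(requests_per_sec, service_time_sec, concurrency_per_instance=1):
    """Little's law: requests in flight = arrival rate * time per request."""
    in_flight = requests_per_sec * service_time_sec
    return max(1, math.ceil(in_flight / concurrency_per_instance))


if __name__ == "__main__":
    # Hypothetical load of 2 requests per second.
    print(instances_needed(2, 0.2))    # fast requests, one request per instance
    print(instances_needed(2, 30))     # 30 s requests, one request per instance
    print(instances_needed(2, 30, 8))  # 30 s requests, 8 concurrent per instance
```

With these assumed numbers the same traffic needs 1, 60, and 8 instances respectively, which is why both request duration and per-instance concurrency dominate the bill.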

Talk is cheap. You write a game with the same rules on his old platform, and then come back and tell us about the giant inefficiencies in the other guy's code.

Claiming to be better doesn't make you better.

Some people believe that past experiences can be useful in helping you judge similar, but not identical, situations you encounter in the future.

I see you are not one of those people.

Can someone comment on data processing and statistics on GAE? Any good experiences? Any bad ones? I'm not an expert on GAE by any means, but I ported my app away from GAE to Django + MySQL because the app was data and statistics oriented (fantasy football) and I found myself wasting a lot of time on data processing, data cleaning, import, and export. Switching to a SQL database in this case felt like getting out of handcuffs (not that I know how that feels).

I'm using GAE for QueryTree (http://querytreeapp.com) which is a data analysis tool.

The GAE data store is really nice for persisting objects that come from users filling in HTML forms, or equivalent. One user can only type in data so fast and that kind of volume is fine on GAE. Plus you get great read scalability.

However, if you're creating data in some sort of simulation or loading it in from elsewhere, use a relational database (you can connect to external servers from your GAE app). Loading a few million rows into the GAE data store would cost you a fortune and the lack of joins or proper SQL probably means you have to get them all back out into memory again before you can do anything useful.

QueryTree uses the GAE data store as a central hub for user account info and settings, then shards lots of MySQL instances for actual data work, which seems to work well.

An SQL query can do just about anything, and it'll do it right next to the data; a lot of time and effort has gone into making that fast. For an app with data up to a certain size and usage, I'd totally go with SQL for all of the reasons you mentioned.

What worries me with SQL is that a query can do just about anything, and it'll do it right next to the data. That means that you want your database machine(s) to be hulking monsters, and sharding / replication gets complicated.

In my personal experience, the App Engine datastore exposes fewer and simpler operations which scale horizontally more or less perfectly. It's harder to write for initially, but it scales up incredibly smoothly.

I host a game on App Engine as well. I have about 3k DAU for which I pay about $3 a day. I can _almost_ pay for this in ads alone. Then the premium players I convert are profit!

The secret of affordable App Engine is the pending latency slider. The default is CRAZY FAST. I have it set at min 5 seconds and max automatic. This is about a 10x price reduction over the default settings. It prevents new instances from spinning up until a request has been waiting for 5 seconds.

The client replicates all the game rules, so as far as the players are concerned there is no latency at all.

The game is called Neptune's Pride 2: Triton if any of you would like to have a look.

As a single developer, more interested in the game rules and player interaction, I would not switch to trying to manage my own servers for all the money in the world.

For those interested, right now I have the entire game's save file stored in a blob property which I unpickle for each request.
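A minimal sketch of that load-modify-save cycle, using only the standard-library `pickle` module. The dict standing in for the datastore, the shape of the game state, and the function names are all hypothetical, not the commenter's actual code.

```python
import pickle

# Stand-in for a datastore entity with a single blob property per game.
fake_datastore = {}


def new_game():
    """Hypothetical save file: the whole game lives in one object."""
    return {"turn": 0, "fleets": {}}


def save_game(game_id, state):
    fake_datastore[game_id] = pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)


def load_game(game_id):
    # Only unpickle data your own app wrote; pickle can execute code on load.
    return pickle.loads(fake_datastore[game_id])


def handle_move(game_id, player, fleet_size):
    """One request: load the blob, apply the move, write the blob back."""
    state = load_game(game_id)
    state["fleets"][player] = fleet_size
    state["turn"] += 1
    save_game(game_id, state)
    return state["turn"]


if __name__ == "__main__":
    save_game("g1", new_game())
    handle_move("g1", "alice", 12)
    print(load_game("g1"))
```

The appeal is one read and one write per request; the trade-off is that every request deserializes the whole save, so the cost grows with the size of the game.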

I thought I would also jump in and say I have an old game that's still making me $300 a month and is running entirely on free quota.

Are they web games or mobile?

I'd stay away from GAE, personally. The lockin is not worth the benefits, though you will certainly find lots of happy GAE customers.

The big thing is that GAE is only suitable for certain types of apps, so make sure yours is one of those before even starting.

The one small hobby app I wrote for GAE was unsuited and it was a painful process to port away. My company, with a real GAE app, has a cost threshold where we know it's cost-effective to move from GAE; write-heavy apps are expensive in the GAE environment. Each day we inch closer to that threshold.

At this point the headaches of using GAE are more than the headaches of not using GAE, but I won't miss it when we pull the trigger on moving away.

> The lockin is not worth the benefits, though you will certainly find lots of happy GAE customers.

> My company, with a real GAE app, has a cost threshold where we know it's cost effective to move from GAE

These guys may help: http://www.appscale.com/

If you can afford that deployment. Otherwise you have to rewrite most of your code to get off the platform.

I don't think the cost of deployment is that relevant when compared to the cost (and risks) of a rewrite.

If your app is written in Java, I believe http://www.jboss.org/capedwarf is another option.

For getting around the limitations of the GAE task queue, you may want to consider using PiCloud (http://www.picloud.com) to handle background processing. We have a number of users who have successfully used GAE for their front-end and PiCloud for handling background jobs.

For example: http://fwenvi-idl.blogspot.com/2012/07/cross-validating-neur... and http://neuroscience.telenczuk.pl/?p=435

> Similarly with backups: on App Engine, data integrity is gauranteed so backups are really not required.

How would you have dealt with a data corrupting code bug? If you don't have real backups, wouldn't those screw you?

> I ended up coming up with a simple cron task that takes a full dump of the database every hour and copies it to a "permanent" storage device. In the event of catastrophic failure, I should be able to spin up a new instance, copy my image to it and get the last hour's backup copied over in fairly short order. The main issue is the fact that I sleep for ~8 hours a day and work for ~8 hours a day, meaning it could be some time between when a problem occurs and I'm aware of it.

If you are just keeping the latest backup, you are still vulnerable to data corrupting code bugs. Generally for this kind of application you'll want to keep several backups, from several different times. Say, hourly backups going back 24 hours, then daily backups going back a week, and so on. The details depend on what kind of data you are storing and how important it is to your users that you can recover from data corruption.
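The rotation described here can be sketched as a function that decides which backup timestamps to keep. The windows (24 hours of hourly backups, a week of dailies) come from the example above; the function name and the choice to keep the earliest backup of each calendar day are assumptions.

```python
from datetime import datetime, timedelta


def backups_to_keep(timestamps, now,
                    hourly_window=timedelta(hours=24),
                    daily_window=timedelta(days=7)):
    """Keep every backup from the last 24 h, plus one per day for the last week."""
    keep = set()
    first_of_day = {}
    for ts in sorted(timestamps):
        age = now - ts
        if age < timedelta(0):
            continue  # ignore timestamps from the future
        if age <= hourly_window:
            keep.add(ts)
        elif age <= daily_window:
            # Keep only the earliest backup of each calendar day.
            first_of_day.setdefault(ts.date(), ts)
    keep.update(first_of_day.values())
    return sorted(keep)


if __name__ == "__main__":
    now = datetime(2013, 6, 10, 12, 0)
    stamps = [now - timedelta(hours=h) for h in range(240)]  # 10 days of hourlies
    kept = backups_to_keep(stamps, now)
    print(f"{len(stamps)} backups on disk, {len(kept)} kept")
```

Anything not in the returned list is a candidate for deletion; extending the scheme with weekly or monthly tiers follows the same pattern.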

If your requests take more than 30 seconds, maybe it's time to set up an SSE or websockets interface. Or maybe do background jobs with polling.

I'd start with a profiler.

The eyeball profiler, preferably:

foreach for while if for if return


You did not find this company "over the weekend" because you "asked" about them 13 days ago:


Nice try.

Looks like someone attempting to get referral signups rather than a paid shill for Webfaction themselves.

Good catch, though. I personally will vouch for Webfaction (look no referral URL ;-)

I've been using WF for a long time for a small site. Their resource use limits can be a little stringent at times though.

If you keep running into the limits, I heartily suggest looking at switching away from their default stacks (especially when they use Apache -- uggghh) to uWSGI, for instance, if you're running a Django app. (`--http-socket :99999` or whatever the public port they assign your app will get you going easily).

EDIT: Comparing WF to GAE is slightly apples-to-oranges, though, as WF gives you a slice of a dedicated server -- GAE just runs your app without letting you touch the infrastructure, much like Heroku.

In this context, what is WF? (First thing to come to my mind was Windows Workflow Foundation in .Net)

What does this have to do with the article? Also: You found, became fond of, and signed up for their affiliate program all this weekend? Amazing.

Get your ad out of this thread.
