
Until they speed up the datastore by a factor of 10 the app engine just isn't a viable platform. They don't even use it internally for that very reason (this is coming from colleagues who work there.)



AppEngine is used quite extensively internally.


How do you reconcile that statement with the fact that tons of developers do actually use it? Myself included. ;)


We use it too. That's why I can make that statement.

A DB request takes over 100ms on the datastore. That's insane and unusable for about 90% of applications. Even their implementation of memcached is several orders of magnitude slower than actual memcached.


You claim that about 90% of applications require sub-100ms latencies on datastore writes. Can you back this up? On the contrary, it seems to me that very few applications need to be blocked by writing to the datastore, let alone cases where 100 milliseconds make them unusable.

Secondly, 3 millisecond memcached access is several orders of magnitude too slow? 3 nanosecond memcache access would be quite a physical achievement. ;)


If you're getting 3ms memcached access then you must be using a different app engine than I am.

And it's 100ms reads not writes. Writes are slower. Much slower.


According to the independently verifiable health dashboard, today's average memcache latency is about 7ms[1]. The average datastore get() time is around 50ms[2], and the average put() time is around 100ms[3].

Are you accusing Google of falsifying this trivially reproducible and abundant data? That would be quite an interesting story. ;)

[1] http://code.google.com/status/appengine/detail/memcache/2010...

[2] http://code.google.com/status/appengine/detail/datastore/201...

[3] http://code.google.com/status/appengine/detail/datastore/201...


Look, if you're running a basically static site that serves a few 100k page views per day, then the app engine is fine and a great choice. Hell, a $20/month linode instance would do just as well. Get two $20 instances and you can handle all the burst traffic you'll ever need.

If on the other hand you're running a web app where you can't cache a decent chunk of your data, and you need to do 4-5 db reads per request, then you have a painfully slow system that your users will bitch about.

To take just one example, we've had to push all our db writes into task queues because we'd consistently hit the 30-simultaneous-dynamic-requests limit whenever we had more than 100 users on the site at once. To take another, we couldn't fetch more than 2000 db entries in a request without hitting the 30 second http request time limit (we were using that request to generate a csv dump of a chunk of a user's data.)
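The write-deferral workaround described above can be sketched in pure Python, using a plain queue and worker thread in place of App Engine's task queue API (`handle_request` and the queue names here are illustrative, not the GAE API):

```python
import queue
import threading

write_queue = queue.Queue()
results = []  # stand-in for the datastore

def worker():
    # Drain queued writes in the background instead of blocking the
    # request handler on a slow datastore put().
    while True:
        item = write_queue.get()
        if item is None:  # sentinel: shut down
            break
        results.append(item)  # stand-in for the actual datastore write
        write_queue.task_done()

t = threading.Thread(target=worker)
t.start()

def handle_request(payload):
    # The handler just enqueues the write and returns immediately,
    # freeing up one of the limited simultaneous request slots.
    write_queue.put(payload)
    return "accepted"

for i in range(5):
    handle_request({"entity": i})

write_queue.join()   # wait for the deferred writes to flush
write_queue.put(None)
t.join()
print(len(results))  # 5
```

On App Engine itself the enqueue step would be a task queue call rather than a thread, but the shape of the trade-off is the same: the user-facing request stays fast and the slow write happens out of band.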

Over the last ~8 months of using the app engine we've found that we're just spending way too much time dealing with its various limitations (no SSL for third-party domains, 3000 file limit, slow-ass datastore, no threads, no comet push...) and the benefits of easy deployment + sort-of-automatic scaling just aren't worth it. So now we're moving over to linode + EC2 over the next few months.

p.s. we've found memcached to be around 10ms latency on average... who knows, maybe our app is on a special slow instance? And yes, that's 10x slower than normal memcached.


I think you have many valid concerns about App Engine which are definitely worth discussing, but your over-the-top rhetoric and exaggerated numbers make it hard to do that!

P.S. have you considered AppScale to migrate off of Google's infrastructure? Google has been funding a project that is a complete clone of the App Engine API, but can run on arbitrary hardware using MongoDB or a plethora of other middleware.

I think we've hit the max comment depth, and I think it's time to turn on the noprocast settings again. ;)


I don't think anyone would accuse Google of falsifying data, but stats can be misleading. I might have missed it, but I didn't see a standard deviation for those numbers -- if you're doing several reads and they happen to be "above average", then you could see the random appearance of sluggishness.

Just checking now (http://imgur.com/8Jb82) there are some really long requests (800ms!) which might get smoothed out in an average.
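The point about averages hiding outliers is easy to see with made-up numbers (these are illustrative, not measured): a single 800ms request among 99 fast ones barely moves the mean.

```python
import statistics

# 99 fast requests at 7ms plus one 800ms outlier, like the slow
# requests visible in the screenshot above.
latencies_ms = [7] * 99 + [800]

mean = statistics.mean(latencies_ms)    # the dashboard-style number
worst = max(latencies_ms)               # what an unlucky user saw

print(round(mean, 2), worst)  # 14.93 800
```

A "~15ms average" sounds fine, yet one request in a hundred took over half a second; with 4-5 datastore reads per page, the odds of a request hitting at least one slow read add up quickly.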


The time for a DB request depends a lot on the particular request. If you're querying by key then it's usually ~30 ms.

Their memcached is only "much" slower than actual memcached if you run your "actual" memcached on the same machine or the same rack as your client servers. Google's memcached and an external memcached have almost identical performance; it all depends on where your servers are.

If you run your memcached servers in the same datacenter but on a different rack, you'll find you're only slightly faster than, or the same speed as, Google's memcached service.

I want AppEngine to improve its speed as much as you do, trust me, but within the current limits it's already possible to build a web application. We did it with Panoramio, and others have too.


What do the error rates on write operations look like these days?

This ticket has held me back from giving appengine a try so far: http://code.google.com/p/googleappengine/issues/detail?id=76...


This was largely solved in 1.3.1. I quote:

Reduced error rate with Automatic Datastore Retries - We've heard a lot of feedback that you don't want to deal with the Datastore's sporadic errors. In response, App Engine now automatically retries all datastore calls (with the exception of transaction commits) when your applications encounters a datastore error caused by being unable to reach Bigtable. Datastore retries automatically builds in what many of you have been doing in your code already, and our tests have shown it drastically reduces the number of errors your application experiences (by up to 3-4x error reduction for puts, 10-30x for gets).
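The automatic behavior described in that release note is essentially the retry loop many developers had hand-rolled. A rough pure-Python sketch of that pattern (`TransientError` and `flaky_put` are stand-ins for illustration, not the GAE API):

```python
import time

class TransientError(Exception):
    """Stand-in for the sporadic 'unable to reach Bigtable' errors."""

def retrying(op, attempts=3, base_delay=0.01):
    # Retry a datastore-style call with exponential backoff,
    # re-raising only after the final attempt fails.
    for attempt in range(attempts):
        try:
            return op()
        except TransientError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff

# Hypothetical flaky call: fails twice, then succeeds.
calls = {"n": 0}
def flaky_put():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("unable to reach Bigtable")
    return "ok"

print(retrying(flaky_put))  # ok
```

As of 1.3.1 the datastore layer does this for you (except on transaction commits), which is why the reported error rates dropped so sharply.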


It depends a lot on each particular project. Panoramio and PubSubHubbub are two of the best-known Google projects built on App Engine.


We're constantly working on improving datastore performance, but asking for a factor-of-10 improvement is outside the bounds of reasonableness: that would bring latencies well under that of a disk seek, so unless you're prepared to pay to have all your data stored in RAM, it's effectively impossible for reads.



