I don't know much about GAE, but a datastore-as-a-service that takes 2 weeks to delete your data and charges $300 a day to do so just seems... absurd.
Contrast with AWS:
We pay under a hundred dollars a month for extremely reliable and scalable infrastructure. I'm happy that our platform is designed and supported by the best sysadmins in the world - we've had less than five minutes of degraded service (high latency but still accessible) in the last six months. I'm soooooooooo glad we didn't choose AWS!
Appengine is perfect for developer-centric startups. You can concentrate on your app instead of the infrastructure until you've got traction, and if you do hit the big time you'll be popping champagne corks instead of server cores. At that stage you'll be able to make an informed decision about hiring some kick-ass sysadmins to build your own platform, and migration really isn't as big a deal as some people make out. Python apps are WSGI compliant so will run with minimal modification on any WSGI server, and you can access all your data without difficulty. You would need to write some wrappers around API calls, but this isn't a huge deal and realistically you're going to have to do this for any platform migration.
This is like saying, "Famous actors frequent my restaurant" and point to a picture of Ralph Macchio. OK, somewhat known, but De Niro, Brad Pitt, Clooney and co eat at the joint across the street.
Once you've worked out you have a product that people want you can start thinking about managing your own servers.
People complain about migration, but it's really not that big a deal.
I learnt how denormalized data in the datastore can help speed things up for your app. I learnt sharding thanks to app engine. The pricing has made me make use of memcache more often and trying to avoid hitting the datastore. When I look at it, I feel I've brought in more discipline in my code because of app engine. I am now in the habit of building APIs that would run instantly. If a http call takes too long, my instinct is to make it run as a background task and then return the result.
I've learnt all of these because I've been on App Engine. I can see a significant improvement in what I'm building now. The apps we build now, all our users say it's "fast and responsive". A lot of that credit goes to App Engine and the things it has taught me.
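For anyone who hasn't used the "check memcache first, only then hit the datastore" habit described above, here is a minimal sketch of it. It uses a plain in-process dictionary as a stand-in for memcache, and `TinyCache`, `slow_datastore_fetch` and `get_profile` are made-up names for illustration, not App Engine APIs.

```python
import time


class TinyCache:
    """In-process stand-in for memcache (hypothetical, illustration only)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = time.time() + ttl if ttl else None
        self._items[key] = (value, expires_at)


# Counts pretend datastore reads so the saving is visible.
datastore_reads = {"count": 0}


def slow_datastore_fetch(key):
    """Pretend datastore read (hypothetical)."""
    datastore_reads["count"] += 1
    return {"key": key, "name": "profile for " + key}


def get_profile(cache, key):
    """Cache-aside read: try the cache, fall back to the datastore."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = slow_datastore_fetch(key)
    cache.set(key, value, ttl=60)
    return value
```

Calling `get_profile` twice with the same key only touches the pretend datastore once; the second call is served from the cache.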
Nothing you learn in life is really a waste of time.
This is akin to saying "Rails made me a better developer", which in and of itself is not bad or wrong, but isn't really terribly useful as a datapoint when deciding which framework to choose. Frameworks by their very nature (almost always) force concepts on you that improve your code. So too with GAE - by limiting some of what you can do, they can provide a more focused service.
Learning how to program a loop is a total waste of time which brings zero value to the end user. If I can hand that drudgery off to someone else, I'll be adding features and stealing your customers while you're giddy about nesting your loops.
Learning how to use photoshop is a total waste of time which brings zero value to the end user. If I can hand that drudgery off to someone else, I'll be adding features and stealing your customers while you're giddy about reducing your PNGs.
We set up on Rackspace Cloud in a couple of hours, and we're running just fine. We're lucky in that our core market doesn't give us traffic spikes.
If and when we run into scaling problems, we'll have a learning experience but that's the same whether you're on App Engine, Amazon or just a cheap web-host running PHP/cPanel/MySQL for £20/year.
GAE is not perfect but it's a much faster environment to develop on than IaaS services like EC2. I spend 100% of my time developing features and 0% of my time developing infrastructure. "DevOps" isn't even a job description in my present company.
It is true that there are occasionally some components (eg large memory indexes) that need to be "outsourced" to other parts of the cloud. It's trivial to do, but we judge each very carefully because of the added operational load.
It's also a great queuing service for running background tasks on an infrastructure that needs zero maintenance.
The cloud is the future and GAE greatly increases your power as a developer, so I'd say learning App Engine will make you a commodity rather than someone with unwanted skills.
If you really think appengine is shitty or overpriced, then it should be pretty easy for others to replicate and eat their lunch. The fact that no one else has managed to yet says a lot.
As an anecdote, reddit frequently has outages, they run on aws. If they switched to gae they wouldn't, it's as simple as that.
Yep. Everything I've ever taken up in life I'm still doing X years later. I've never ever transferred a job to someone else, outsourced a task I learned, or hired someone to do what I do. Nope. 100% of the time I'm doing everything I ever used to. Still. WTF?
I think most cell phone service is shitty and overpriced, but it doesn't mean anyone can just set up a competitor - it's extremely capital intensive to compete at that scale.
Yeah, I forgot that GAE never goes down.
Reddit is one of the top 1% of the 1% in terms of traffic. I suspect GAE would probably have trouble with Reddit on a global 24/7 basis. Perhaps not as much as Reddit's having themselves at times, but nothing's perfect. And I suspect Reddit's pocketbook wouldn't be able to afford the marginal benefit GAE would provide.
* reddit on GAE would have no options if they ran into a feature which is too expensive on the GAE architecture. Reddit on EC2 has been able to change their backend architecture dramatically to deal with growth and new features — that's a really powerful thing to give up since it means you aren't limited to GAE's lowest-common-denominator functionality.
* Even assuming reddit could afford to spend kilobucks per day on GAE, how many good sysadmins and developers could they hire for that same amount of money? At their scale, the answer is a LOT of people - and that gives a lot more flexibility since money going to GAE is a sunk cost which scales linearly with traffic whereas people generally give better than linear returns.
This is not true at all. It's very easy to run a hybrid with parts of your application in other parts of the cloud. I do this. There's even a remote API stub so you have direct access to the full suite of GAE APIs from servers in EC2 or whatnot.
There are certainly limitations to what you can run inside of GAE, but there's almost always a good way to work around it while preserving the benefits of scalability, multi-datacenter reliability, etc. I'm not saying it's for everyone, but it's great for most common web applications.
The only real difference is that gae forces you to code for scalability while aws leaves you with possibilities to shoot yourself in the foot, which reddit apparently does.
I'll say they're getting better now--for a while they were a major PITA for AWS folks.
Does Heroku have any particularly high-traffic sites?
Urban Dictionary, as an example.
Here's another one I haven't heard of before now:
Urban Dictionary is bigger (30MM page views/month), but Pulse is an order of magnitude bigger ("100Ms of requests per day"). Pulse ships on a lot of Android devices by default now.
1. Migrate away from Google-hosted App Engine to your own AppScale instances.
2. Negotiate a better rate with Google.
In my experience App Engine has been fine for apps that only get a few hundred or thousand visits per day. The convenience of not worrying about the environment or deployment too much is a great enabler.
It definitely depends on the type of app. You probably wouldn't want to run anything that is particularly heavy on datastore use. Using AJAX rather than full pageloads is a good way to keep bandwidth and response time (and therefore instance hours) down. For example, one of my apps is able to pre-load the most important AJAX calls, so it can handle running with high latency and I don't exceed the free instance hours.
This, of course, is on top of the now rather ludicrous cost associated with maintaining a write-heavy application that must be both highly concurrent and consistent, with a large number of composite indexes and relational schemes that the datastore was simply not designed to cater to. That isn't AppEngine's fault, but it is by no means a generic hosting platform and I truly believe that their goal of creating an "infinitely scalable" and robust system is directly at odds with providing a product that is a good choice for the majority of applications.
I've worked with two clients that have extremely successful websites running on top of AppEngine and are happy with it, but the costs mentioned in the article are absolutely true and something that they chose to bite the bullet on for a one time transition. I can't imagine what it would cost to get all of the data out to do a migration.
Without seeing the code, it's hard to tell what's going on. I would suspect cascading ungrouped datastore puts and their index updates, but, from what's on the list, I can't even make an informed guess.
The point is, terming an infra stack as great as App engine as made for non serious applications is a gross misunderstanding of what the app engine is capable of. You should first give it a try, develop an app and then make a statement.
I am about to shut down one application that declined in popularity, because it costs me $20 / week to run it and revenue just dropped under $20 / week. The cost is not from instance hours, but purely from the stored data. Deleting the data from the data store would cost more than I could recoup, so that is not an option either.
Also I really feel frustrated giving hours of thought to something that should be a really simple operation. Perhaps .delete() should be free? After all, when I shut down the app, Google does delete everything for free.
In my experience, App Engine is mainly useful for in-house infrastructure apps for companies using the Google Apps platform. That or cheapskate developers throwing together a proof of concept / toy app in their spare time.
For those unfamiliar with GAE, a ListProperty is really a collection of properties. The author is using the property as a geohash with a significant number of values, plus he has additional multiproperty indexes defined, plus he's doing a rewrite (delete + write). All combined it appears to be ~460 writes per entity.
So what we're talking about is $6500 for 6.5 billion writes... exactly what is printed on the sales brochure. Is that a lot? Most datastores don't charge by the operation so I don't have a lot to compare it to. It seems expensive but not crazy, especially considering that the data is replicated via PAXOS to 3+ datacenters with automatic loadbalancing and failover.
So their implementation is a compromise on account of GAE's limitations, and they have to pay through the nose to use it. This is when I'd be looking at hosting some features outside of GAE, which is what we do with Full-text Search.
Geohashing is a reasonable solution for some spatial problem domains; it's one solution along the spectrum of "precalculate a lot up front and make queries cheap" vs "write in a cheap & easy format but make queries more expensive". Pre-calculation strategies are usually more scalable when you have large query loads, but they suck bigtime if you need to fully recalculate a large body of data (as the original blog author is doing).
Maybe the blogger would be better off using PostGIS; but then, scaling and synchronizing a large cluster of PostGIS systems is nontrivial. The issues here are too application-specific to draw any positive or negative conclusions about appengine.
But this bit stood out: $0.10 per 100k writes. That price seems to be far too high. The poster is doing (something like) a reindex of 10M entries (that kind of data is pretty small really: it's the kind of database you might use as a test set on your laptop interactively). Figure each modification is atomic, and that the b-tree height of the storage is ~4. So that's 40M writes to create an index, or $400!
Seriously? Again, this is the kind of task you'd expect to do quickly and interactively on your development box, and it costs a price of the same order as your day's salary (!) to execute in the cloud?
Looking at this from the perspective of the underlying I/O device: this index consumes just a tiny, tiny fraction of a hard disk drive's capacity. Yet creating it costs enough to buy the device several times over?
Something is wrong. Is that a misquote or have I misunderstood?
App Engine pricing might seem expensive if you try to do a simple table comparison with alternatives, but when you get more deeply into it you'll find that a lot of stuff that is included in the service with GAE will cost you extra when you use the alternatives.
The only problem here is that "delete" is considered a write and when you want to delete data you just cannot accept the fact that you need to pay for something you do not want to keep. I think GAE should definitely look into this aspect and try to get some cheaper alternatives for data deletion.
Disclaimer: I work at Microsoft and am required by the terms of my employment to believe in "the cloud".
Correct. Of course, those are object writes -- Elastic Block Store disk I/O is 10x cheaper.
At $0.10 per 100k writes, 40M would be $40.
Also, reads, writes and small operations (which are the ones billed) are low level operations. An API operation actually translates into several low-level operations. And the way it is described, I think the poster is doing more writes to reindex 10M entries.
Considering that reindexing takes 1 write for the entity itself (existing put) + 4 writes for each element in the list property and considering that the poster has on average 18 elements in that list for each entity, then he's probably doing on average 73 writes per entity (I'm taking the "Existing Entity Put" scenario into account, otherwise for new entities it would be 2 + 2 per list element == 38 writes).
So by these numbers, that's 730,000,000 writes, or a cost of $730 -- if you go over them sequentially, only one time. But considering that he's doing manual full-text indexing, maybe he had to go over those items several times for the reindexing being done.
Maybe I'm missing something here. I don't know.
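To make the arithmetic in this subthread easy to re-run, here is a rough sketch. The price ($0.10 per 100k writes) and the per-entity formulas are taken straight from the comments above; they are the posters' estimates, not an official billing formula, and the function names are made up.

```python
from decimal import Decimal

# Price quoted in the thread: $0.10 per 100k datastore writes.
PRICE_PER_100K_WRITES = Decimal("0.10")


def write_cost(writes):
    """Dollar cost of a number of low-level writes at the quoted price."""
    return Decimal(writes) / Decimal(100_000) * PRICE_PER_100K_WRITES


def writes_per_existing_put(list_elements):
    """Estimate from the comment above: 1 write + 4 per list element."""
    return 1 + 4 * list_elements


def writes_per_new_put(list_elements):
    """Estimate from the comment above: 2 writes + 2 per list element."""
    return 2 + 2 * list_elements


if __name__ == "__main__":
    entities = 10_000_000
    per_entity = writes_per_existing_put(18)
    print(per_entity, write_cost(entities * per_entity))
```

With 18 list elements this gives 73 writes per existing entity and 38 per new one. One sequential pass over 10M entities is 730M writes, so $730; 40M writes is $40 and 6.5 billion writes is $6500.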
Unfortunately, AppEngine isn't forgiving of that and there is a real monetary value associated with questionable engineering design. Or, design that wasn't thought through enough in the context of a service like AppEngine.
This leads to a few people getting upset and making a lot of noise when the reality is that AppEngine is actually an amazing service.
So, to boil down the operator error from a quote in the thread:
"We're running a mapreduce to change the geobox sizes/precision for a large number of entities."
That is the real source of the problem. Instead of using geoboxes, they should be using geohashes, which allow arbitrary precision.
Instead of an indexed property that looks like this (what they currently have):
[u'37.3411|-121.8940|37.3395|-121.8926', u'37.3411|-121.8929|37.3395|-121.8916', ...]
They would have an indexed List<String> property that looks like this:
[8, 8f, 8f1, 8f12, 8f12a, 8f12ac, 8f12ac6, 8f12ac60, 8f12ac605, 8f12ac605f, 8f12ac605fb, 8f12ac605fb3, 8f12ac605fb34]
Finding if the location is in a box would be computing the hash from the lat/lng (there is free code out there to do that) and then doing an indexed 'in' query. The indexes would only need to be updated if the location of the entity changes, not when they want varying levels of precision.
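A rough sketch of the prefix-list idea, for illustration only. It takes an already-computed hash string (the one from the example above) and does not show how to derive it from lat/lng. The membership test is an in-memory stand-in for the indexed query, and `hash_prefixes` and `entity_in_box` are hypothetical names.

```python
def hash_prefixes(location_hash):
    """Return every prefix of the hash, shortest first.

    Each prefix stands for a coarser box containing the location, so
    storing all of them lets one entity match at any precision.
    """
    return [location_hash[:i] for i in range(1, len(location_hash) + 1)]


def entity_in_box(entity_prefixes, box_hash):
    """In-memory stand-in for the indexed lookup on the list property."""
    return box_hash in entity_prefixes


if __name__ == "__main__":
    prefixes = hash_prefixes("8f12ac605fb34")
    print(prefixes)
    print(entity_in_box(prefixes, "8f12"))   # coarse box containing the point
    print(entity_in_box(prefixes, "8f13"))   # neighbouring box, no match
```

The point of the design is that changing the query precision only changes which prefix you look up; the stored list is rewritten only when the entity itself moves.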
First off, they mention that when the initial design decision was made, a similar operation cost ~$160, which is tenable for an operation that only happens once in a while. This is in fact a case of them getting bitten by the pricing structure changing after a reasonable design decision (at the time) was implemented.
Secondly, they mention that this is part of a larger issue:
"In our most common case we might have to add and delete
a couple items to the list property every once in a while. That would
still cost us well over $1,000 each time.
Most of the reasons for this type of data in our product is to
compensate for the fact that there isn't full text search yet. I know
they are beta testing full text, but I'm still worried that that also
might be too expensive per write."
This is a real problem that GAE needs to solve.
Finally, their problem doesn't seem to be that they need arbitrary precision, it's that they seem to need fast location centric queries of a large database.
Geoboxes allow you to solve this problem correctly (and quickly), returning the results in the database that are closest to you. Matching on a geohash can end up serving the incorrect data unless you resort to hacks involving a number of queries.
2) They seem to have an extreme use case. No one is going to argue that maybe AppEngine doesn't fit the bill for them. Or, one could argue that doing 6.5 billion writes times a large number of customers, across multiple datacenters is something that a lot of databases would choke on.
3) Running more queries, while admittedly hacky is less expensive than doing more writes.
Before anyone gets the idea that the links in the parent are worth trying, they're not: the performance is absolutely atrocious – on our data set, 13 seconds per lookup.
We wrote our own approach on App Engine and now get stable performance on our datasets at ~300ms per lookup.
(We're doing Foursquare-type lookups.)
'We wrote our own approach on App Engine'
Like Foursquare, mobile clients send their current location, plus a search radius, and App Engine code returns a result set ordered by distance.
We were getting query times with realistic data set sizes in the ~13 second range from the code you linked to. With the code above, we're ~40x faster. YMMV.
I got burnt by AppEng, too. Picking AppEng as a platform is one of my worst technical decisions.
"Google App Engine is free to use during the preview release, but the amount of computing resources any app can use is limited. In the future, developers will be able to purchase additional computing resources as needed, but Google App Engine will always be free to get started."
You got 2+ years of use for 'free' and now they decided to turn it into a supported business model and are asking you to pay for what you use. Seems reasonable to me.
I don't disagree that they fubar'd their original pricing release announcement and should have had multithreading Python 2.7 for those folks.
But they did listen to the (loud) feedback, made adjustments and even apologized (were you there at the ThirstyBear meetup where they bought us all beers?).
Not having to hire an IT staff or be woken up in the middle of the night when AWS decides to reboot the host and your servers go down is worth its weight in gold.
Software is architected and designed with constraints in mind. Features are feasible or unfeasible because of these constraints. Cost is one of the big constraints. With the new cost structure, the apps have to be re-architected and redesigned, and lots of things aren't possible.
I didn't go to the ThirstyBear meetup because I have given up on AppEng and moved on. I simply do not have trust for them to be a platform vendor.
I'm not saying it is easy, it isn't. It takes a skilled engineer to learn this stuff and make it work. But, when it does work, it really does work.
It sounds like you failed to create an application that took all of that into account and of course you are going to look 'real stupid' when your clients figure out that you took shortcuts. There is no way that could be the fault of AppEngine.
I'm curious where you went after AppEngine. Are you hosting on AWS now? Heroku? EngineYard? Do you have the same reliability and scalability as what AppEngine provides? Maybe that isn't something that is beneficial to you and in that case, I can see how AppEngine is not your cup of tea.
For me, AppEngine is amazing. I love the fact that I'll never have to hire a sysadmin. I love not carrying a pager. I love knowing that when my site gets an asston of traffic, I won't ever have to think about making it scale. I love not having to worry if my database is on big enough hardware, replicated across data centers, backed up. I'll never have to think about whether or not my OS needs an upgrade to plug a security hole or ssh'ing into a server at 2am to figure some esoteric problem out. To me, all of these things are worth the 'cost' of AppEngine. I'd rather spend my time adding features than doing sysadmin.
A skilled engineer would know not all problems are the same and one platform can't solve all the problems. You don't know what product requirements I had, and you assert it's my failing since I couldn't beat my product into AppEng's square, even though AppEng's square turned into a circle.
Just because your niche app happens to fit into AppEng's mold doesn't mean everyone else can do the same.
Google said that they adjusted their prices because people were getting wrongly incentivized to do things that were expensive at their end. And the way the free market is set up, pricing is one of the ways that signals are sent to tell customers that they should make different optimization decisions --- including, perhaps, switching to a different technology which is a better match for their requirements.
Yes, that can be painful --- but it's the free market. You might as well complain that some silly folks moved to exurbs hours away from their work, and bought gas-guzzling SUVs, and then got upset when the price of gasoline went up to 3-4 dollars/gallon. Whose fault was that? OPEC? Or the consumer for choosing to live far away from work and to buy a car that had horrible mileage?
I don't think your gas example is relevant, but if you want to stretch it, it just means Google, like OPEC, is not a trusted platform vendor.
I heartily agree with you about GAE being a niche. Not all apps (or developers) belong on GAE.
I ask again, if GAE didn't work for you, where are you hosted now?
"ww520: You were not trying to have a conversation. You were trying to have a bragging session. I don't find it constructive to continue the 'conversation.'"
Then you come back and respond again? I'm so confused. Anyway, I'll consider this thread over, unless you actually want to have a real grown up conversation.
How is "must live in the Bay Area in order to receive friendly customer service" a reasonable hidden addition to the T&C of a supposedly global service?
I don't have the time or resources to move the site, so I'm forced to shut it down. It really, really sucks.
Personally I'm more disappointed by the lack of notice (1 month is nowhere near enough time) than the actual increase. I totally understand the need to charge.
I spent some time tuning SharedCount's API, which would have cost me $30-$50/day, and it's now at about $1-$2/day.
- Move to Python 2.7 and enable multithreading
- Setup Cloudflare (this swallows about half of all my requests)
- Increase minimum latency and reduce the maximum number of idle instances. (I have 5-8s and 1-2 set, respectively)
- Setup the semi-undocumented Google edge cache (basically, just a Cache-Control: public, max-age=[seconds] header).
- Take advantage of memcache.
With this setup, I'm doing 3 million API calls per day at $2.
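For the edge-cache item in the list above, a minimal sketch of what setting that header looks like in a bare WSGI app. The 300-second lifetime, the JSON body and the `make_cached_app` name are arbitrary choices for illustration; whether any cache in front of the app honours the header is up to that cache.

```python
def make_cached_app(max_age_seconds):
    """Build a tiny WSGI app whose responses are marked publicly cacheable."""

    def app(environ, start_response):
        body = b'{"status": "ok"}'
        headers = [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
            # The header from the list above: any shared cache may keep
            # this response for max_age_seconds.
            ("Cache-Control", "public, max-age=%d" % max_age_seconds),
        ]
        start_response("200 OK", headers)
        return [body]

    return app


# Example: let caches hold API responses for five minutes.
cached_api_app = make_cached_app(300)
```

Every request answered by a cache in front of the app is one that never reaches an instance, which is where the savings in instance hours would come from.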
Also, high replication queries can return stale results unless you use ancestor queries. Ancestor queries require putting entities in groups by giving them all the same parent (which can never be changed). Basically it's a very inflexible semaphore and kind of sucks IMO.
Your suggestions in general are very good though. Thanks, I'm switching my DNS to CloudFlare now.
It's true that eventual consistency of queries on the HRD can be tricky to program around. On the other hand... your data is replicated to 3+ datacenters and failed over in realtime. Pretty rad.
(In fact, when I had to migrate my app from MS to HRD, it kept failing because I didn't have any datastore entities. The workaround was to just create a single entity)
Honestly it has more to do with how much time I want to spend on the site, how much it returns, and whether or not I should spend my nights and weekends transferring it to another host.
The last time I backed up all the data from GAE it took 4 days to download all of it to a VPS. 4 days to download all of it. Migrating means 4 days of downtime, or alternatively some complex solution involving posting all new data to BOTH places while the migration takes place.
That takes time and energy, and quite frankly I'd rather see someone with the resources do it right instead of trying to hack it.
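The "post new data to BOTH places" approach can be sketched roughly like this. The stores are plain dictionaries standing in for the old and new backends, and all class and function names are hypothetical. It ignores deletes, failures halfway through a write, and writes racing with the copy, which is where the real complexity lives.

```python
class DictStore:
    """Toy key-value store standing in for a real backend (hypothetical)."""

    def __init__(self):
        self.rows = {}

    def put(self, key, value):
        self.rows[key] = value

    def get(self, key):
        return self.rows.get(key)

    def keys(self):
        return list(self.rows)


class DualWriteStore:
    """Writes go to both stores; reads stay on the old one until cutover."""

    def __init__(self, old_store, new_store):
        self.old_store = old_store
        self.new_store = new_store

    def put(self, key, value):
        self.old_store.put(key, value)
        self.new_store.put(key, value)

    def get(self, key):
        return self.old_store.get(key)


def backfill(old_store, new_store):
    """Copy rows the new store is missing; returns how many were copied."""
    copied = 0
    for key in old_store.keys():
        if new_store.get(key) is None:
            new_store.put(key, old_store.get(key))
            copied += 1
    return copied
```

The idea is that the slow bulk copy (the 4 days) runs in the background while the site keeps taking writes, and you switch reads over once both stores agree.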
So, yes you can build serious applications on GAE but like everything else it boils down to, it depends on what you really need.
It also casts the other users as the opponent, instead of google.
Sometimes it's still cheaper to have your own managed / self-managed gear... and from the looks of this pricing, even hire someone fulltime/freelancing to manage it all for you.
It wouldn't cost you any money (unless you have metered electricity), but rather just opportunity cost of being able to do other work with your resources.
Unless you only need a short-term lease on the equipment, cloud servers will be more expensive than dedicated/colocated servers.
Hardening the server is something that can be outsourced for a lot less than thousands. Services like linode seem to be a nice middle ground. While I don't see myself going back to my own hardware in a rack I run in a datacenter, I do still see the benefit of knowing your stack a bit beyond coding. Knowing how the stack works helps when building software quite often.
Anyhow, those are just my experiences. VPSes with a very strong toolkit to take the edge off self-administering, like Linode, etc, seem to be a very nice option. Heroku has caught my eye too but they have completely different measurements.
I honestly do not get why people are so fascinated with the cloud. It's a very expensive way to avoid having to know what you're doing.
It's all about time, and lack of it. If you can spend 1/10th of the time and still make a good profit, you could spend the other 9/10ths doing other profitable things.
People requested custom SSL support in 2008, and today is 2012. If you still believe in App Engine, good luck!
>> the "trusted tester program" is a joke. They never respond so it's just a waste of time.
Even if they launch this feature TODAY, that's 4 years for a basic requirement, so what can you expect from them?
If you really wanted onto the trusted tester program, you'd bring it up on the app engine mailing list or contact someone at google directly (their emails are all over the place, Ikai is a great guy, and they are very responsive). I'm sure they'd be happy to have enthusiastic beta testers.
I find great irony in your quote on your G+ profile:
"Do you create anything, or just criticize others work and belittle their motivations? -- Steve Jobs"
But as SSL support is not public yet, my statement above is still valid:
If your startup can survive without SSL support on your own domain, go for App Engine!
Sadly, this is kind of "typical Google" -- great product, decent execution, but a bad identity problem -- it really feels like they're not sure yet what they want to do with this.
Contrast this with running a stand-alone application server for each site, which is what GAE does. Here, even if your code is not serving any requests it's still waiting to get them. Now, GAE has powerful magic in it to retire request handlers which aren't frequently used. This way if site foo.com is getting 1 request/minute, it only really needs one process/thread/handler abstraction at a time. However, it is expensive to start/stop these "processes", so instead GAE is forced to keep this "process" around for a while after a request has been served hoping that the cost of keeping it alive would be justified by a second request. Thus these stateful, slow-to-start processes are always taking up resources that could be used to serve other requests.
Disclaimer: all my knowledge of GAE has been from reading their docs/blog, not from deploying projects to it.
Disclaimer 2: I am not saying that PHP is better/worse than GAE in any way. However, I am saying that the model that GAE uses is more costly for a typical application. This can be easily seen by comparing the cost of running a basic site on GAE vs $2/month shared hosting.
GAE has problems, but I think the root is just how unique everything is. That manifests itself in people using a datastore that they don't understand, with Google expecting them to know how many writes an action will take and whether that feels like the right number of writes or two orders of magnitude more than if they made a different decision about how to store their data and solve their problems.
It also manifests itself in the lockin that Heroku mostly avoids (which is a huge problem if some subset of users get to a point where they realize "whoops, this would be much easier if I could do things Google won't let me do, time to leave").
I think a good counterexample is Engine Yard and GitHub. Engine Yard had a somewhat limited offering (especially for what GitHub was willing to pay) that didn't really fit with GitHub's heavy direct disk I/O. (Most Rails apps almost entirely read and write from the db, but GitHub does a lot of direct operations on the git repositories.) But GitHub was still just a Rails app, not an app for some specially-designed Engine Yard framework. So it was fairly painless for them to decide to solve the problem in a way that didn't fit with what Engine Yard would offer them and migrate to their own hardware. It wasn't easy, especially since they weren't solving an easy problem, but at least they didn't have to replace their database.
I am not familiar with the internals of Heroku and don't know how they solve the problems I outlined. Maybe someone else can elaborate.
But they run normal applications. It started out being any Rack (Ruby web standard--Rails, Sinatra, etc) app; they've expanded into other languages now, but it's always some open framework that they're running for you, not something they own and keep proprietary. They give you a normal Postgres database. There are a few restrictions that you might not have on your own hosting (like a read-only filesystem). But you could basically take an app running on Heroku, install a webserver and Rails, install Postgres, make sure any config you had was the same, and run it.
Another huge advantage Heroku has is the ability to give other people access to their datacenters, since they're just running on EC2. So there's lots of Addon services that can add various pieces of functionality, like hosting a different database, many of which are only tractable because they're also hosted on EC2 and thus have very good latency to Heroku servers.
This means that you're not stuck with Postgres. If you think a part of your data, or all of your data, would be better stored in Mongo or Couch or Redis or flat files on S3, there are hosted services for that, and you can even deploy your own solution on EC2 if you'd rather. This leads to nice halfway solutions where you use Heroku to have super-scalable application servers, and maybe to manage your main relational db, but then you can tack on other things where that doesn't fit with your problem. Now, if you're running some of your own EC2 instances, you're losing some of the "never have to worry about hosting again" value of Heroku, but at least it's possible. It could be a temporary solution that keeps you above water while you migrate off of Heroku, or maybe you decide it really is the best long-term solution.
It's interesting, since GAE charges $0.08/hour per front end instance, with 28 front end instance hours free per day. However, I once helped someone debug an application that on average gets one hit a minute, and yet manages to max the 28 free hours and then some. Closer examination showed that GAE was having these instances hang around for much longer than seemed necessary after they fulfilled the initial request.
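A back-of-the-envelope sketch of why a site with one hit a minute can still burn instance hours. The $0.08/hour price and the 28 free hours are the figures above. The "linger" time an instance stays alive after a request is a made-up parameter; the real scheduler's behaviour isn't described here.

```python
from decimal import Decimal

FREE_INSTANCE_HOURS_PER_DAY = Decimal(28)   # figure quoted above
PRICE_PER_INSTANCE_HOUR = Decimal("0.08")   # figure quoted above


def alive_seconds(request_times, linger_seconds):
    """Seconds one instance stays alive, given request arrival times.

    Assumes the instance is kept for linger_seconds after each request,
    and that each new request restarts that timer.
    """
    total = 0
    window_start = None
    window_end = None
    for t in sorted(request_times):
        if window_end is None or t > window_end:
            # Instance had idled out (or never started): close old window.
            if window_end is not None:
                total += window_end - window_start
            window_start = t
        window_end = t + linger_seconds
    if window_end is not None:
        total += window_end - window_start
    return total


def daily_cost(instance_hours):
    """Dollar cost for a day's instance hours after the free quota."""
    billable = max(Decimal(0), Decimal(instance_hours) - FREE_INSTANCE_HOURS_PER_DAY)
    return billable * PRICE_PER_INSTANCE_HOUR
```

With one request per minute and a linger of a minute or more, a single instance never idles out and bills a full 24 hours. Going past 28 hours would then take more than one instance alive at a time; two for the whole day is 48 hours, or $1.60 after the free quota.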
So either Heroku has more magical magic than GAE, or there is some other kind of efficiency that they are tapping into. One thing I can think of is that possibly Heroku is more conservative with spinning up extra processes, preferring longer response times.
Also, there are services out there that will monitor your queue depth and increase your dyno count for you. Or, you could use the heroku gem to do that yourself.
The value proposition of App Engine is that with no systems administration expertise you can rent an extremely reliable, massively scalable web platform that is managed around the clock by a world class devops team. Unsurprisingly this costs money. If you don't need the reliability or scalability of App Engine, no one is forcing you to pay for it. But it's absurd to suggest that you can get anything remotely comparable in PHP for $2/month.
Second, while I understand the value of high availability and not having to worry about Ops, I am talking about cost per HTTP request. Here, the $2 PHP host is a clear winner. As I said, that does not mean it is better. Your analogy with the oil tanker and a row boat is applicable: one is more appropriate if you just want to cross the pond. The other is better for going from Alaska to California, with the caveat that GAE exists in the world where there are vanishingly few cost-effective ways to use it.
At any kind of scale, PHP's model of "start the interpreter, load the code, start executing" is abysmal. To make efficient use of CPU resources you need compiled code - JIT or otherwise. You often need to make use of instance RAM and pooled database connections. Long-running processes are vastly more efficient when running flat-out.
There is a reason Facebook invented hiphop.
start the interpreter, load the code, start executing
Hiphop is of course much faster.
By way of comparison, I've done quite a bit of work w/ the AWS components (e.g. EC2, S3, SQS) that in theory allow you to build a highly reliable, highly scalable site but in practice there's still a lot of assembly required, whereas w/ App Engine that's provided out of the box.
And with Heroku you can have it taken care for you, following a few simple rules.
So, why exactly would one use the crippled GAE platform, that constantly breaks its promises (re: reliability), forces you to code with very little flexibility (and, no, not every app that needs to automatically and massively scale "has to be coded exactly like a GAE app anyway"), costs a fortune (and sometimes an unexpected fortune), and breaks for you as soon as you need a technology not on offer?