I, and many others, spent a lot of time figuring out how to write apps that do it the "app engine way":
* Fast completions (30-second timeout)
* Offloading to task queues when you can't complete fast
* Blobstore two-phase upload urls
* Mail eccentricities
We did so because we believed Google when they told us: "If you write your apps in this really weird way, then we will be able to give you scale and cost benefits that you won't be able to get elsewhere."
We believed them, because it seemed reasonable. We laughed at those who complained that Django would hit the 30-second limit: "It's not general-purpose hosting! Figure out the App Engine way!" And we educated people on how to do it right, and many were happy.
Well, it turns out that it is general purpose hosting, with all of the costs, and yet also with all of the (once rational, now bullshit) idiosyncrasies.
But that's not the biggest complaint. The biggest complaint is that when my friends and peers objected to App Engine, its strange requirements and its potential lock-in, they were right and I am a fucking naive idiot. And I really don't like being proven a naive idiot. I put my faith in Google's engineers and they have utterly destroyed my credibility. THIS, more than anything, is the cost to me.
Amen, bro. The biggest frustration is that we followed Google's preaching and spent lots of time fitting our apps to their restricted model, only to have them turn around and destroy it. I don't have the faith to do another round of optimization to fit their new restricted model. What is stopping them from changing the ground rules again? I'd rather spend the time developing against a generic model that I have more control over.
I'd recommend GAE to people who are prototyping - it's easy to do simple stuff in.
But mostly, GAE doesn't make sense for larger apps. You can't buy your way out of trouble, by putting your db on a dedicated server with fast drives and tonnes of RAM. You can't really use relational data without performance and reliability issues.
It's not just about the "app engine way". It's not like learning C or Haskell, and having to find a new way to write the code. You fundamentally cannot do big ad-hoc database operations.
And consider this: it was July last year that they introduced the Mapper API. Before then, I don't think you could do MapReduce without manually re-implementing it yourself (on top of the cantankerous App Engine Datastore). Just think about that for a minute: how were you meant to do stuff the App Engine way without MapReduce?
Anyway, I don't think your credibility was "utterly destroyed". It was really hard to know whether or not the learning curve was worth climbing until you had tried. You just had to judge the book by its cover, and the "Google" brand is pretty compelling to an engineer. It's not the first time someone has been fooled into buying something because the provider has a good reputation.
I have several ideas that would scale well on GAE and neither want nor need a relational db. I'm not making do without SQL, I'm actively not using it, and it's very successful. Even when I move to EC2, I still won't be using SQL. In fact I have only one idea that needs any kind of relational data, and that is so relational that SQL is a bad fit too. EDIT: Actually, sorry, all my data is relational. It's just that I put the (small) effort into figuring out how to make it work without joins. In the last 15 years, I've not done a single project that could not have been done with a NoSQL database.
In fact, if you look at the recent comments of certain GAE engineers, they seem to believe that GAE is precisely for scaling, and that's why it now costs so much: it's only for the big boys.
The problem is that I can never become one of the "big boys" on their system, because pretty much as soon as I get any traction, I have to move to EC2 or Heroku or go broke. Their newfound belief in the scalability of their system is just arrogance. Anyone can claim to handle lots of traffic when you require that your customers run 20 times as many frontends as they should reasonably need.
Denormalize. Match your data rows to your access pattern (i.e. your UI). Naive example: if you have a webpage that displays a list of employees, and it must show each employee's department name and boss in that list, you put that data in the employee row. What is the probability that a boss will change his or her name, causing you to have to update a ton of records? Very low (not zero, mind you, so you have to be able to do it). So why pay for the join on every query?
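A minimal sketch of that denormalization, with hypothetical record and field names (plain dicts here, not GAE's actual datastore API):

```python
# Hypothetical denormalized "employee" records: the department name and
# boss name are copied into each row, so rendering the list needs no join.
employees = [
    {"id": 1, "name": "Alice", "dept_name": "Sales", "boss_name": "Carol"},
    {"id": 2, "name": "Bob",   "dept_name": "Sales", "boss_name": "Carol"},
]

def render_employee_list(rows):
    # One pass over one table: every field the UI needs is already here.
    return [f"{r['name']} ({r['dept_name']}, reports to {r['boss_name']})"
            for r in rows]

def rename_boss(rows, old, new):
    # The rare write-side cost of denormalizing: when a boss's name does
    # change, every row that cached it must be rewritten.
    for r in rows:
        if r["boss_name"] == old:
            r["boss_name"] = new

print(render_employee_list(employees)[0])  # prints "Alice (Sales, reports to Carol)"
```

The read path is a single scan; the write path for the rare rename touches many rows, which is exactly the trade being argued for.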
There are no longer, in my view, any situations where a SQL db is the best idea. You either want a giant NoSQL database, or you want a massive in-memory object-graph using pointers. Or you want something for $20m from Oracle or IBM.
The problem is not having to update tons of records, the problem is seeing one day, after 2 years of having the app in production, that the listing shows that employee X works in department Y and her boss is Z, but Z is not the head of Y. Bugs happen and referential constraints go a long way towards keeping your data clean.
Yeah, I was worried about all that. Hell, I was worried NoSQL couldn't possibly work at all, given my experience of SQL and the joyful things that happen there.
Not saying shit can't happen. Now look me in the eye and tell me you've never had some noob drop a constraint and forget to put it back.
You could periodically run a script that checks all the records for errors (especially embedded records that might have drifted from their current value, and not been properly changed by the app-level constraints), and automatically correct them (plus log the error).
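One way to sketch that periodic checker (all names hypothetical): walk the rows, compare each embedded copy against its canonical source, log any drift, and repair it in place.

```python
import logging

# Canonical source of truth for boss names, keyed by boss id (hypothetical).
bosses = {10: {"name": "Carol"}}

# Employee rows that embed (cache) the boss's name for cheap reads.
employees = [
    {"id": 1, "boss_id": 10, "boss_name": "Carol"},
    {"id": 2, "boss_id": 10, "boss_name": "Caroline"},  # drifted copy
]

def repair_drift(rows, canonical):
    """Log and fix any embedded boss_name that no longer matches the source."""
    fixed = 0
    for r in rows:
        truth = canonical[r["boss_id"]]["name"]
        if r["boss_name"] != truth:
            logging.warning("row %s drifted: %r != %r",
                            r["id"], r["boss_name"], truth)
            r["boss_name"] = truth
            fixed += 1
    return fixed

print(repair_drift(employees, bosses))  # prints 1: one drifted row repaired
```

In a real GAE app this loop would run from a cron job or the task queue, a batch of entities at a time, rather than over everything at once.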
If Michael Arrington changes his job title from "editor in chief" to "founder, former editor, occasional contributor, and CEO of Arrington Investments", and his old posts aren't all updated, it's not the end of the world.
It really depends on the problem domain. You wouldn't run a bank's ledger off MongoDB. On the other hand, a bank's ledger should be radically simple, with little need for normalization.
> What is the probability that a boss will change his or her name
That's obviously an example of something that will practically never happen, which is why it doesn't work all that well as a justification for ditching SQL databases altogether.
I've never used NoSQL for anything, so there must be a lot that I'm missing, and that's why I asked. But it seems to me like you'd be digging up necessary information through quite a few steps if everything is "flat".
On the contrary, it's the SQL database that's "digging up the necessary information through quite a few steps"; it's just that the massive effort required by the SQL server is hidden from you, the programmer, behind a one-line bit of text called a SQL statement. So you do it all the time. Indeed, we've been taught that normalizing is the "proper" thing to do, because otherwise "Bugs happen and referential constraints go a long way towards keeping your data clean."
Digging kills you. I assert that SQL does the digging automatically, and that's exactly why it doesn't scale.
As I said in my original complaint: I, and many others, spent a lot of time figuring out how to write apps that do it the "app engine way"
That included learning NoSQL. At least that part was not a waste. There are no right answers to your questions, there are only right actions, starting with stepping outside the SQL box and writing an app using NoSQL. I started by thinking of a simple app that would be useful to me personally. I knew Java servlets, I knew SQL, I knew all sorts of things, but after several iterations my app is architected like no app/server I've ever written before. Almost every iteration involved starting by doing it the way I knew how, running into either roadblocks or major cognitive dissonance, and then rewriting it to fit these new-fangled constraints. It's been a huge learning experience. You might like to try it.
>> What is the probability that a boss will change his or her name
> That's obviously an example of something that will practically never happen
Women changing their name when they get married? A tiny assumption like that can make our software brittle. Now every model that caches the old name needs updating and you need to make sure there aren't any overlapping saves in any of those models that'll overwrite any items in your bulk update. If a single linked model has the wrong old-name cached, your data update process is buggy.
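The overlapping-save hazard can be sketched concretely (all names hypothetical): while a bulk job rewrites cached names, a request that loaded a row earlier can save it back and resurrect the stale name. An optimistic version check on save is one common way to prevent that lost update.

```python
# Rows keyed by id; each carries a version counter for optimistic locking.
rows = {1: {"author_name": "Ana Smith", "version": 7}}

def save(row_id, new_fields, expected_version):
    """Optimistic save: refuse to write over a row someone else has updated."""
    row = rows[row_id]
    if row["version"] != expected_version:
        return False  # stale copy; caller must re-read and retry
    row.update(new_fields)
    row["version"] += 1
    return True

# The bulk-rename job updates the row (version 7 -> 8)...
assert save(1, {"author_name": "Ana Jones"}, expected_version=7)
# ...so a request still holding the old copy can no longer clobber it.
assert not save(1, {"author_name": "Ana Smith"}, expected_version=7)
```

GAE's datastore transactions give you an equivalent guarantee; the point is that the bulk update and the app's normal saves must share one such mechanism or the cached names will drift.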
> Now every model that caches the old name needs updating and you need to make sure there aren't any overlapping saves in any of those models that'll overwrite any items in your bulk update. If a single linked model has the wrong old-name cached, your data update process is buggy.
Well, that sounds like the kind of stuff I'd like the other guy to talk about. How does he avoid the bad sides of having all your data in a key-value store?
I would argue that all the forced scaling in App Engine makes prototyping harder. You can't use SQL. You can't reuse whatever open-source components you find on GitHub. You can't just let your app run slow and optimize later.
From what else I've read, it sounds like engineers who didn't also wear green eye-shades (or good enough ones, or who didn't possess or use good enough crystal balls) set up this debacle. And it was people wearing green eye-shades (who, we can sincerely hope, are also engineers) who aligned it with reality, causing way too many people way too much pain.
Object lesson: if you're going to sell a service for cash money to others, paying close attention to your costs from the very beginning is not optional.
The problem is how does one get from 31 cpu-hours to 879 instance-hours.
You might be thinking that in the original measure they did something insane, like measuring only the user time of a process, or only the time it's executing a request, not booting or whatever (or, fuck, I don't know, because honestly there is no reasonable explanation). That is to say, that the 31 cpu-hours is a misread, and if the fellow in the article ran his code on EC2, he really would need 879 EC2 instance-hours that day.
But this is not my experience. An extreme example: my app that served 14 pages was rated as taking 0.02 cpu-hours, i.e. 72 cpu-seconds. This is entirely reasonable (watching the app, it ran for about 200 seconds of wall-clock time including warmups). Under the new system, it is claimed that these same 14 pages will require 2.8 instance-hours.
0.02 => 2.8
31 => 879
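The blow-up factors implied by those two before/after pairs are simple arithmetic on the numbers quoted above:

```python
# Multipliers from the old cpu-hour metric to the new instance-hour metric,
# using the two reported jumps.
pairs = [(0.02, 2.8), (31, 879)]
for old, new in pairs:
    print(f"{old} -> {new}: {new / old:.0f}x")
# 0.02 -> 2.8 is a 140x blow-up; 31 -> 879 is roughly 28x.
```

Even the smaller multiplier means paying for nearly thirty hours of machine time for every hour of CPU the old metric reported.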
So when the author of the article is told his app is going to take 879 instance-hours per day, there is something seriously fucked up and wrong. It doesn't mean that the guy is running a realtime raytracing server. It means that GAE is horribly, amazingly, inefficient.
The app in the article serves 1.5 GB/day and takes 879 instance-hours. What kind of server would you need to do that on EC2? Something that can push 1 Mb/s? The hourly cost on GAE is $1.46. Can I do that on a $0.085/hour EC2 instance? Yeah, I think so.
EDIT: My figures were wrong: I was comparing a $16 (wrong) figure to a $0.80 EC2 figure. The actual figure is $1.46, not $16. So I looked at the bandwidth/CPU numbers to see whether a $0.80 EC2 instance is what's required, and I don't believe that it is. I think a $0.085 instance would be enough. YMMV.
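Running the bandwidth and price figures quoted above through the arithmetic (the $0.085/hour rate is the EC2 small-instance price the comment cites; the $1.46/hour is the effective GAE cost reported for the app):

```python
# 1.5 GB served per day works out to a tiny average throughput.
gb_per_day = 1.5
avg_kb_per_sec = gb_per_day * 1024 * 1024 / 86400
print(f"average throughput: {avg_kb_per_sec:.1f} KB/s")  # ~18.2 KB/s

# Daily cost comparison at the two hourly rates quoted in the thread.
gae_per_hour = 1.46    # effective hourly cost reported for the app on GAE
ec2_per_hour = 0.085   # small EC2 instance, hourly
print(f"GAE: ${gae_per_hour * 24:.2f}/day vs EC2: ${ec2_per_hour * 24:.2f}/day")
```

An average of ~18 KB/s is a trivial load for a single small instance, which is the basis for the "$0.085 would be enough" claim.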
> It means that GAE is horribly, amazingly, inefficient.
This. We always knew GAE was inefficient. There's no doubt about that. Serving 30 or 40 requests per second would spawn quite a few instances and start producing request errors.
This is a load a 4-year-old machine could handle with ease.
Why did we put up with this? Because Google didn't make us pay for the crappiness -- the pricing made sense. You don't pay Ferrari prices for a slow car...and during a surge it scales up gracefully. Go from 30 rps to 1000 rps and it'll just work. An old machine co-located someplace won't do that.
Now, under the new price gouge, Google is making us pay for their inefficiencies. All appearances are that this is what it really costs (plus some reasonable markup)... well, that's pretty piss-poor, because we're essentially paying to haul cargo in a Ferrari, and it's dumb, dumb, dumb.
It would only start producing request errors if there was bad coding. A lot of Google's own stuff (like Chrome updates) is served through GAE and receives no special treatment from GAE (except lifted request limits, which wouldn't affect any sites you make). GAE was made to be fast, and it is. It was made to be reliable, and it is (100% uptime over more than 500 days).