Hacker News
Why We Moved from Amazon Web Services to Google Cloud Platform (lugassy.net)
361 points by jganetsk on Aug 5, 2016 | 246 comments



Saw this article the other day; it didn't get upvoted much. I'm not sure why it's getting to the front page now - it really isn't saying anything at all. I agree with the sentiment though.

I haven't been dealing with AWS very long but I keep hitting limitations. Stupid limitations.

Sometimes they're a big deal (IPv6 is not available natively. WTF, doesn't AWS run half the web?). Sometimes they're not a big deal, but still annoying (DNSSEC isn't supported by Route53). Sometimes, they're unbelievable (AWS Lambda is Python 2.7 only, no Python 3, in 2016).

Sometimes it's probably something that would be fixed with a single line of code (Ed25519 is not supported as a keypair). Sometimes, the product itself just sucks (CloudFormation), requires a Ph.D. (IAM) or has ridiculous limitations (Certificate Manager). And sometimes, it's just the aws cli that, among its million available subcommands, cannot do basic things like creating S3 redirect rules or updating ACM certs.

But all the time, it's the web UI that sucks the most - I have to use a separate browser for cloudwatch logs and S3 because the native UI is just that bad. And god forbid you enable 2FA, you'll have to get your phone out twice a day.

These are all the limitations I hit in a few months using AWS. I wonder how many more I'll hit within the next year. Is Google actually any better? Seems to me it's just different kinds of limitations.


Don't get me wrong, I love AWS. I use ~15 products almost every day. I actually like how powerful IAM is.

But yeah, CloudFormation just sucks. It does. It's painful because it's a really good idea but the configuration format is inadequate for the task.

CF should have been implemented using a declarative DSL similar to Puppet from the start. I've been tinkering with the idea of creating a DSL with a JSON translation layer but I always assumed AWS would eventually move away from JSON anyway, making the effort futile. Four years later and still no improvements. Definitely my biggest disappointment with their platform.
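For what it's worth, the "JSON translation layer" idea only takes a few lines to sketch. Assuming hypothetical resource names and AMI IDs (nothing here is a real AWS-blessed DSL), terse helper calls in Python can expand into CloudFormation's verbose JSON:

```python
import json

def ec2_instance(name, ami, instance_type="t2.micro"):
    """Expand a terse call into CloudFormation's verbose resource JSON."""
    return {name: {
        "Type": "AWS::EC2::Instance",
        "Properties": {"ImageId": ami, "InstanceType": instance_type},
    }}

def template(*resources):
    """Merge resource dicts into a full CloudFormation template body."""
    merged = {}
    for r in resources:
        merged.update(r)
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": merged}

doc = template(
    ec2_instance("Web", ami="ami-12345678"),
    ec2_instance("Worker", ami="ami-12345678", instance_type="m4.large"),
)
print(json.dumps(doc, indent=2))
```

This is essentially the approach the Ruby DSLs mentioned downthread (cfer, SparkleFormation, the Bazaarvoice one) take: author in a real language, emit the JSON CloudFormation actually consumes.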


Am I insane, or are you and the commentators below unknowingly describing Terraform?

https://www.terraform.io/


I'm glad I'm not the only one who thought cloudformation was difficult.

Terraform is great so far. I am running into some duct-tape issues with resources shared between two terraformations, though.


You're not insane!


Plenty of those DSLs already exist; the cfer project[1] that I contribute to and SparkleFormation[2] are two I can vouch for. Given such a wrapper, I don't find CloudFormation particularly annoying or difficult, and it's certainly better than the alternatives (BOSH/Terraform/etc.). And cfn-init with CloudFormation metadata is miles ahead of "well, install Consul and connect to the cluster and eventually reach the state you want to be in" for my uses.

[1] - https://github.com/seanedwards/cfer

[2] - https://github.com/sparkleformation/sparkle_formation


I maintain a ruby dsl as well at http://github.com/bazaarvoice/cloudformation-ruby-dsl

The format is a weak argument against CloudFormation. It has some negatives, but this one is easily overcome.


And I salute you! We heavily use the Ruby DSL.


As already said, you should use Terraform, which has its own DSL (called HCL). It's inspired by JSON but way simpler and more appropriate for this specific task.

Important tip: if you use one of the JetBrains IDEs (IntelliJ/PyCharm/...), there is a free plugin to handle Terraform files.

It gives the usual syntax highlighting plus auto-completion of variables and resources, and detection of errors. It's very, very nice!
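For a flavour of HCL's terseness (the resource and AMI ID below are made-up placeholders, written in 2016-era Terraform syntax):

```hcl
# Hypothetical instance; the AMI ID is a placeholder.
variable "ami" {
  default = "ami-12345678"
}

resource "aws_instance" "web" {
  ami           = "${var.ami}"
  instance_type = "t2.micro"
}
```

The equivalent CloudFormation JSON is roughly twice as long, with every key quoted and parameters declared in a separate section.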


I have to state that configuration is not easy. CF might suck, but imagining something that could easily replace it and work better is close to wishful thinking.


Terraform


> And god forbid you enable 2FA, you'll have to get your phone out twice a day.

How onerous, a security mechanism you actively turned on actually requiring you to do something. My car is just as bad - I drove it at a wall, and the front bit crumpled up!

And if you're getting your phone out twice a day, then perhaps your work hours are a bit crazy (12 hours+) and the problem is less to do with AWS than your workplace?


My Google account is far more secure than my AWS account, but it can actually remember device-based 2FA for a month. Yes, how onerous, Google accounts are LITERALLY like a crashing car.

And as someone who works at home, I work whenever the hell I want, get off my back and go troll elsewhere.


>it can actually remember device-based 2FA for a month

Call me crazy, but that seems like a totally different feature than what AWS two-factor auth is doing. AWS uses 2FA to validate users, while Google uses it to validate devices, and you're making it seem like Amazon is too dumb to use local storage for something.

(Also, I don't think the parent commenter was comparing Google to a car crash. The analogy was that it seems weird to get mad about features performing their stated actions.)


Far more secure how?


2FA that is long lived and only protects the device does not sound as secure for something this critical.


Wow. My apologies, I didn't realise your personal self-worth was so wrapped up in your google account. I wasn't even referencing google with the car thing, but the AWS mechanism doing what it's been activated to do.

If you're that sensitive about lighthearted hyperbole, then perhaps you shouldn't use it in the first place.


I work 12 hours apart in one day all the time. I work an average of 16 hours a week, from home. Stick to the actual discussion rather than trying to invent nonsense reasons to dismiss people's opinions.


[flagged]


Please comment civilly and substantively here or not at all.


> My apologies. I didn't realise that you found it so difficult to take out your phone. I find it comes quite naturally to me - I was unaware that there were professional developers out there that struggled with it.


> you'll have to get your phone out twice a day.

I pull out my phone at least 6 times a day for that just to switch between accounts.

Welcome to the wonderful world that agile/lean methodology has given us. They've delivered shit just to say they delivered something. They won't actually give you what you need until they get enough backing for it.


It's ridiculous to blame agile for "delivering shit" (I'm not a huge agile fan either) - after all, it always comes down to the people involved, not the methodology/process applied.

A thought experiment: do you blame democracy because Trump is now a nominee? If you have lousy people, the choice of process does not really matter.


Actually that would be a perfectly valid criticism of democracy. A dictatorship, for instance, is more robust to a generally lousy populace, though it suffers from other problems of course.


> They've delievered shit just to say they delievered something.

You're right.

This never happened before the agile manifesto was published.


It is possible to link accounts so you get a simple dropdown that lets you select the account you want to use without logging out and in again.


Yes, some limitations and bugs are just ridiculously interesting. As a customer, I know my TAM would tell me this is very much an organizational issue. One team pushed a feature, but the other team wasn't ready, so the feature would just sit in a pending state for months. I can't disclose which new feature's coming out, but essentially, yeah, that feature won't come out until the other team gets their product updated.

Another classic example is Directory Service's security group can only be found if you go to VPC / EC2 console page to search. The SG isn't even mentioned in the Directory Service console. I bought reserved instances for RDS and EC2, one of them accepts custom tagging, so you could name the purchase, but the other doesn't.

If you use CloudFormation you want to make sure the resources are managed by CFn; a manual change could break the next CFn update. In some cases, rollback is not possible and will require AWS Support's intervention. Another example: say you have a cluster of 3 Cassandra nodes, so you build 3 EC2 instances from a single CFn stack. Well, time for server rotation (by that I mean taking an instance out and replacing it with a new one), one node at a time. Too bad you really can't do that out of the box. You have to build another stack with 3 instances, then configure and swap one node at a time, plus change the underlying dynamic inventory we got out of Ansible.


Is there any particular reason you need IPv6? I get the whole IPv4 exhaustion thing, but if AWS can figure out a way to get me an IPv4 address when I need it, this seems preferable to dealing with IPv6.


IPv6 is provably faster, especially for mobile clients. iOS 9/10 also prefer it and the App Store has begun requiring apps to work on full v6 stacks, meaning any end points in AWS will require 6to4 somewhere.


From my experience, some (Virgin Media in IE) ISPs with IPv4 CGNAT have a slower/less reliable IPv4 connectivity than with native IPv6.


I don't understand the AWS fetish either. I guess nobody ever got fired for using AWS. It's the IBM of the cloud.

If you need managed services Azure and Google Cloud are largely better. If you just need compute and transfer Digital Ocean and Vultr absolutely destroy AWS.


Would really love to hear what makes Azure "largely better". My own experience is far from that. I have yet to meet anyone not relying heavily on moving over a MS stack or that has gone beyond the "hit a template to deploy a test" use case that really likes it. When I get folks down to the brass tacks, it tends to boil down to "I don't have to pay the overhead of the Windows license anymore" or "I like having MS services as a PaaS".

On the other hand there are a lot of inconsistencies and odd limitations on the services.


I echo your sentiments. I run Azure in production at my job, running both Windows and Linux stuff and constantly run into weird edge cases, inconsistencies and stuff that just doesn't make sense.

There's really only one thing I can think of that I love: resource groups. It's nice to be able to group together everything in a deployment, for quick access and administration.


DigitalOcean and Vultr don't just destroy AWS on network traffic costs; they destroy all three of the big providers by an absurd margin. I have no explanation for that huge gap. It's the great mystery of cloud computing.


The huge gap is easy: there is significant security in place at the big 3 that DO and other vendors simply do not have.

Google's security team is probably as big as DO's engineering team.


That's not why <cloud provider> networking is so much more expensive than a VPS provider... All of the big providers have fairly serious networking investments (and I think it's fair and honest to say that Google's is the largest) that provide a network quality and performance you won't match with any VPS provider. Do you want to stream more than 100 Gbps during the Olympics to people all over the world? Don't try that on OVH, DO, etc. but you can do it on us. We (Google) in fact charge more than AWS for our networking because of our massive global network.

tl;dr: you get what you pay for, and you don't always need our crazy network!


I understand that in principle, but before paying the big cloud providers between 10 and 20 times what I would be paying Vultr/DO (for network traffic), I would like to know a great deal more about the actual performance difference.

I know it's probably on me to find that out myself.


There's an entire industry of companies and analysts willing to sell you that information :)


Interesting... so with Google and Amazon and the like I am paying more for the existence of capacity that I could hypothetically utilize?

DO and Vultr are colocated at very large sites. For many locations they sit right next to carrier hotels. I am a bit skeptical of whether the real world difference is that huge, especially if I sprawl my endpoints across both these providers and many locations.


I've never heard of Vultr until today. Have you used it? Can you tell more about it?


Same thing as DigitalOcean without the hype.

$5 SSD VPS with 768MB of RAM instead of 512MB at DO, which is the reason I moved over in the first place.

The popularity clearly helps DO keep up with more 1-click apps and other features than Vultr, but Vultr recently made reserved IPs and object storage available, so they continue to be a good alternative to DO.


I am using Vultr, but the particular project I'm using it for doesn't put a whole lot of strain on their infrastructure yet.

So far it's been working well and support was very responsive on the one occasion when I needed their help.


As others have said, similar to DO. It has a few more options available, though, including better support for booting your own OSes if you don't want to use a prepackaged one, and dedicated instances (last I checked DO didn't do either, but haven't looked for a bit, might be out of date). I've also found their support to be more willing to work with me to get things running how I want, but that's obviously just anecdotal.


The thing I really like about Vultr is that to use their site, you just have to enable Vultr.com (in NoScript). For DO, you have to enable 15-20 domains because they use all kinds of outsourced services, and every time you go to a new page, there are new things that have to be enabled. I got tired of it.


The gap is that you are expected to serve most of your traffic through their blob stores and CDNs, not through the compute instances.

The traffic prices for each of these are very different.


In that case I can just use S3 for blobs and use Digital Ocean and Vultr and Linode for compute. Amazon charges $0 for bandwidth ingress so uploading to S3 costs me nothing.

We use VPS providers for compute and S3 for backup and large blob storage. The only other Amazon product we use is Route 53, which is a decent DNS-as-a-service and is very reliable.


Of course. The trick is to decide what mix of lower price and lower complexity is your sweet spot.


Very much depends what you are actually doing in the cloud.


Disagree on compute, actually - specifically, AWS Lambda is great. Its lack of Python 3 support certainly is the thing that pisses me off the most across the whole AWS stack but that's also because it's my favourite there. It's fantastic for on-demand distributed computing, and really cheap.


Azure Functions seems similar enough and a bit easier to adapt, though the runtime is Windows-based, so it's easy enough to include other executables.

What I'd really like to see is Compose.io's full product lineup on Digital Ocean already. As it is, I'm considering seeing how well Azure Tables, and Azure SQL work from DO SFO2 to Azure West US...


It's simple - moving to the cloud at all is a big risk for some executives. They don't want to compound the risk by going with anything but the leader.


The Lambda UI used to be atrocious: You couldn't resize the multiselect for subnets.

My main nemesis now is the ElasticSearch dashboard. Just opening it gives me a 50-50 chance of my browser tab crashing.


Why do you think Cloudformation sucks?


Not the parent, but for one, doing a CF update is all-or-nothing. Either you roll the whole thing out, or you roll the whole thing back. If you have a lot of servers, and things are going down, you want to just STOP. But you can't. Also, if you click too quickly, sometimes you can get into a weird state and not be able to rollback OR update anymore.


I think this behavior is sensible. Additionally, you can disable rollback. However, CloudFormation has needed and lacked a "dry run" option for a long time; it would be really nice to have that. The documentation does, at least, indicate what effect an update on any given resource will have.

I recommend against making sweeping changes to your infrastructure without thoroughness. Without saying this is what you're doing, I see a lot of developers who get involved with cloud treating changes whimsically. I'm not saying you should write perfect code. But thoroughness and a solid procedure (and backup plan) is a good place to start.


dry run for cfn == create change set :-)

(also, it helps to use nested stacks)


Same thing happens with other tools like Terraform though: once you've kicked off an apply, you are committed.


yeah, but having the option to run 'terraform plan' really helps you know what you're getting yourself into.


We've had Terraform take action that wasn't in the plan, in production.

I call it "pray and apply".

https://news.ycombinator.com/item?id=12214879


BOSH is able to cancel operations. You press Ctrl+C and it asks if that's what you intend.


CloudFormation change sets make CF updates a little less painful/mysterious:

https://aws.amazon.com/blogs/aws/new-change-sets-for-aws-clo...


Change sets are very limited. When you have a sub-stack, the change set cannot detect changes in the sub-stack :(


Because massive json files with pseudo-invocation syntax and a ton of gaping issues that aren't readily solved by pre-existing template solutions pauses for breath and just bizarre decisions about holding and fielding and really poor integration with new services (you can tie ECS clusters in a knot with CF trivially) and API call limit hits KOing CF provisioning jobs and...

I mean, what it's doing is hard and I use it over, say, Chef... but it's not a "good" product. Heck, you cannot even truly validate CF templates without running them. The CLI and builder just do a crude structural check pass. Got a case error in a nested structure? Too bad, the entire job is rolling back.


While we're on the subject, I want to mention terraform (https://www.terraform.io/).

I'm not going to straight up recommend using it, but I will recommend checking it out. It's certainly a lot better than CF, but it still has really glaring issues and very much feels like alpha software (see discussion here: https://news.ycombinator.com/item?id=12213935 and some of my complaints here: https://news.ycombinator.com/item?id=12214358).


I'm on the opposite side of the fence. I also commented on the recent Terraform discussion. Basically, the abstraction layer is too complex; why not template JSON and build the CloudFormation stack directly? In the end, a nested CF stack is still a hard problem.


Why do you think the abstraction layer is too complex? I found terraform was essentially designed with AWS as a first-class citizen (if not for AWS in the first place), which means everything maps really well.

I strongly dislike HCL, but that's essentially what it is - templated JSON.


There are a few issues: stacks sometimes partially complete and you are left with a bit of a mess; stack creation has insanely long timeouts and a lot of retries, so it often sits there for 15 minutes before eventually failing; and deletions can fail part way through because there is some dependency that isn't obvious (and which they don't make it easy to find). I still like it (infrastructure as code is cool), but it needs more work.


[flagged]


It's clearly not a popular view, but Google have definitely got a lot more professional at promoting their products through HN and similar sites.

I doubt they're that different from Amazon or other companies we regularly see at the top of the front page though.


I'd prefer not to attribute to malice what could as easily be coincidence.

The biggest factor I see in the rankings of posts on various sites is the time of day in which it is posted and then directly related to that whether or not it received the critical mass early such that it can slingshot out of /new and start building an engaging discussion.


> I'd prefer not to attribute to malice what could as easily be coincidence.

Speaking as a former employee of another tech monolith specializing in marketing ploys, I can tell you unequivocally that this is a naive position to take.


As someone who worked at a company that used almost all of the GCP products, I agree with just about everything in this article. GCP is pretty amazing (and simple) to deal with. They offer so many features that work very well within their platform. They have a similar 99.95% SLA on most of their services, and they often automatically apply credits for missed SLAs (YMMV, this may only apply when you have account reps paying attention).

The major downsides that I've noticed are: 1) documentation is lacking (but improving!); 2) issues that aren't affecting a lot of customers can sometimes take a long time to resolve; 3) many services (including App Engine Flexible Environments) are still in beta, meaning no SLA, and they recommend against using them in prod. Unless you have a big paid support contract you'll have no clue how soon (if ever) things will reach GA.


I agree with all of your points except for the beta woes. If I'm building a production product, I would much rather they be upfront about products that are still "beta" and under heavy development, as opposed to shipping a buggy and unreliable product that I only realize is in that state after I stick x GB of data in it or start making x req/s.

For example at a previous job we were "early" adopters of Amazon Redshift and it gave us no end of troubles. That should definitely be labeled "beta" until they sort those issues out.


And they have very few datacenter locations.



Oregon, Iowa, South Carolina, Belgium, and Taiwan definitely make for fewer than the 11 AWS regions.


This article is very hand-wavy (I agree with some of the comments here), but as a heavy AWS user here are things that bother me

1. Networking -> I completely agree with the OP here. The networking on AWS needs to be better; I shouldn't need the strongest machine just to get a better transfer rate. It makes complete sense to use a micro machine for some services, but if those services are accessed or access other HTTP/S services, they will be unnecessarily slow.

2. Pricing -> I have the privilege of working at a company that can afford to pay 3 years in advance. Even if you do that, though, Amazon will keep you on the machine types you purchased and not on the newest parallel machine types. So if you paid 3 years in advance you are often "stuck" on previous-generation instance types.

3. Someone mentioned CloudFormation here. I completely ditched it in favor of Terraform; CloudFormation looks like a tool from the 90s after using Terraform (which isn't free of flaws itself either).

4. In terms of VPC and networking I completely disagree with the OP, the networking and security settings on Amazon are great (if you understand them). You can define instances that have absolutely no access to/from the outside world. If you build a secure service, some of your services living in a "sandbox" makes total sense

5. App Engine -> The amount of complaints I've heard about this service over the last couple of years is just insane. I've heard from multiple users of the platform that it sucks. While I have absolutely zero experience with it myself, I tend to listen to people who suffer from it daily.


App Engine is a joke. It was an interesting and novel idea at the time but it is not something you want to pick up today (or last year, or the year before that, or the one before that either).

GAE is a massive lock-in with outdated software in a world where there's so many better, cheaper, non-lock-in alternatives, even within Google itself.


Do you feel the same way about App Engine Flexible Environment (née Managed VMs)? That is, is your complaint that GAE Standard offers you all these services (like Memcache, Task Queues, Datastore), or that you hate the sandbox it's in?

The GAE team has been (admittedly slowly) pushing the services that were GAE only into being fully-consumable "Cloud Platform" services (e.g., Cloud Datastore is the same Datastore that's been "part of" App Engine forever, but now accessible from anywhere).

It's your choice to use Datastore, and I respect that you would choose to avoid it, but the basic PaaS idea of "here's some code, run it for me as a web endpoint" is still compelling even today.

Disclosure: I work on Google Cloud.


My impression of GAE is that on top of the platform's core issues (lock-in by design, outdated by design), it is also very much unloved by Google and has received very few updates. I stayed away from it, so I could be wrong.

But to answer your question: I don't know anything about that Amazon service, but I would feel the same way about any similarly-designed competing product. IMHO, no informed person would ever pick proprietary, lock-in-heavy PaaS solutions.


> IMHO, no informed person would ever pick proprietary, lock-in-heavy PAAS solutions.

This sounds like your personal bias and is clearly not true. Business doesn't work like this and you would be surprised if you looked at just how much proprietary software runs the world.

Lock-in isn't a thing to fear, it's a natural spectrum of using any service. The more specialized that service, the more work will be involved if you need to move away.

The real question is if the risk is worth it. Do you really need to switch? Are you worried that Google or AWS will somehow disappear before your business does? If not, what is the big deal?


I did preface with "IMHO" (in my humble opinion). I'll give you it's not very humble, but it's an opinion.

You're conflating two things though: lock-in as a whole, and unnecessary lock-in like GAE's. Yes, lock-in is a choice to make and not always an incorrect one. But I can safely say no informed business should/would pick Adobe Flash over HTML today for, say, internal web apps. Adobe Flash is fairly old, legacy technology, deprecated in all but name for that scenario, and it carries a fairly significant amount of lock-in due to its proprietary nature, requiring you to rework your entire application were you to change to an alternative. Sound familiar?


There is no such thing as "unnecessary" lock-in, it's just how the service is. Take it or leave it. Just because you think it's unnecessary doesn't change how that service is offered. And so what if there is work to do? Again, that is what "lock-in" is by definition = the work involved to move away.

You weigh the risks and see if the potential work (that might not ever need to be done) is worth the benefits offered today by that service. That's what an informed business does.


But picking at that a bit, if I were to run a simple off-the-shelf Django setup against MySQL, I'd use GAE (if not Standard than certainly Flexible). My code wouldn't be any different than deploying it to a raw VM, so I'm not locked in and could deploy it on DO, AWS or anywhere else in a heartbeat. In the meantime, I don't have to deal with setting up load balancing, autoscaling, DNS per module, etc. so I get some real value from that.

Is your concern the lock-in from the code standpoint, or operational "lock-in" that once you've got all this set up for you, you'd have to go replicate it to get out? If the latter, aren't you saying you're going to do that on Day 1 in the "avoid PaaS" case?


Why would you use GAE in this scenario, over, say, GCE or even AWS? And at a smaller scale, DO would be far cheaper.


> (...) non-lock-in alternatives (...)

There is an opensource reimplementation of App Engine's interface. That said, I have no clue about its completeness or buglessness.

https://github.com/AppScale/appscale


> App Engine is a joke. It was an interesting and novel idea at the time but it is not something you want to pick up today [...] in a world where there's so many better, cheaper, non-lock-in alternatives, even within Google itself.

As I see it, at the time App Engine was introduced, there were a lot more comparable PaaS offerings (maybe not with the same languages support, but the style of PaaS that GAE is was more common.) Competition from GAE and Heroku (which originally was focused on a similar PaaS offering) seems to have driven most of the other alternatives to pivot to something else or fail entirely.

Right now, there are very few close substitutes; there are lots of "alternatives" that require more infrastructure work (e.g., IaaS and similar offerings like GCE, GKE, EC2, etc.) or are off in the other direction ("serverless" function hosting like Lambda or GCF). And in some cases these might be better options than a GAE-style PaaS, but they aren't really direct substitutes, and they leave a space better served by GAE.


Wait, how? I've used it before for one-off web projects, but didn't think it was that bad.


Having inherited a large legacy codebase running on App Engine and powering a successful business, I see the biggest danger with App Engine as being the lock-in. We're on App Engine and we're not going anywhere, because it'd be too big of a project to move somewhere else. There's one open source project I know of that lets you run an App Engine project on another server (Appscale), but it's hard to know how useable it would be without the overhead of getting everything set up, and surely there are plenty of dark corners you'd encounter along the way.

Since App Engine comes with a proprietary data store, ORM and task queue built in, skipping the Appscale compatibility thing and just porting a large application to run on another platform would be a herculean effort. The data store models and everywhere in the application code that they're queried would need to be re-written for a different ORM, all of the data would have to be migrated to a different store, and all the background tasks and methods that trigger them would need to be rebuilt. This would be thousands and thousands of lines of hand-edited code changes, before even thinking about how much time would need to be spent verifying all of the changes.

At least with Heroku the magic is mostly around deploying and scaling the various pieces, rather than providing a lot of application level libraries that lock you into the platform. Porting to another platform likely would require a bunch of configuration management and some deployment scripts, but fairly limited changes to the application code itself.


As mentioned before, there is AppScale, which will let you avoid the lock-in if you so desire. AppScale has been around for quite a while, and it has been sponsored by Google, but it is an independent company and an open source project. AppScale has been running customers' applications reliably for a few years, with Datastore loads of up to a quarter million transactions (and this is from an old white paper).

AppScale will allow you to move your application (unchanged) and still reap the benefits of the App Engine model, autoscaling and all. I still believe App Engine's benefits trump any other PaaS out there.


The easiest path off Heroku to self-hosted is probably Cloud Foundry: one of the inspirations was Heroku and several of the Cloud Foundry buildpacks are soft forks of Heroku's buildpacks.

Generally we find that any buildpack written for Heroku will work with Cloud Foundry without modification. With a little extra engineering you can create a buildpack that will run in a fully disconnected environment.

Disclosure: I work on the Cloud Foundry Buildpacks team on behalf of Pivotal.


I can't agree more!


Care to share these alternatives ?


Um, AWS? :)

Within Google, either of the GCEs (https://cloud.google.com/compute/ https://cloud.google.com/container-engine/) depending on your architecture.


Those aren't really alternatives to GAE IMO unless you would also say that a laptop is an alternative. A real GAE alternative would be a hosted solution that doesn't require you to install an OS or even know what OS it's using.


Such alternatives would have the same massive downsides as GAE (outdated software by design, lock-in by design).

I'm not saying "Don't use GAE because it's expensive/google/whatever". I'm saying "Don't use GAE or anything like it".


I think both AWS and Azure have pretty strong PaaS services where you just deploy your [nodejs|python|php] app. Heroku also comes to mind...


AWS has Elastic Beanstalk. Azure is still finding their own PaaS pathway.

Cloud Foundry gives you the "just push" experience for both of these (as well as vSphere, OpenStack, GCP and more to come). It's open source, with the IP owned by an independent foundation.

Disclosure: I work for Pivotal, we donate the majority of engineering on Cloud Foundry.


GKE/Kubernetes does fill that requirement; it's not as simple as GAE, but it's one step closer to bare metal.


As a nod to Kubernetes, we say GKE for Container Engine. Maybe we should add that acronym on that landing page...


Agree WRT network performance: the basic instances are unreasonably slow (only 70 Mbps for a micro), and requiring users to pay for a big 8-core server for wire-speed GbE, and a huge 40-core beast for 10GbE (and rather poor 10GbE at that), is totally unreasonable and borders on price gouging (if such a term can be used for this market sector). The cross-AZ network performance is terrible much of the time, the jitter and loss are unpredictable and often far higher than you'd expect, and generally trying to run high-performance network services on AWS is deeply unpleasant in multiple directions at once.

WRT to VPC and security groups, I wonder if the OP might be talking about some of the surprising and frustrating limitations with them. E.g. VPCs are limited to fairly small subnet blocks, within which you have virtually no control. You can specify simple static routes - but only for subnets outside your VPC block. If you want to run a routing protocol in your VPC (e.g. you want your docker containers to be reachable via BGP), you get stuck pretty quickly - if the BGP subnet is within the VPC block, you can't route to them within the VPC, as the router won't permit more specific routes within the BGP block. If it's not within the VPC block, you won't be able to get to it via VPNs, peering connections, etc. So you're stuck either way.

Security groups are also limited - the number of rules you can have is pretty small, and they are a weird mix of stateless and stateful - they are stateful if the src/dest is an individual IP or group, but stateless for 0.0.0.0/0. And as you can't match TCP states like related or established, properly securing services when you have rules including 0.0.0.0/0 can be surprisingly difficult to achieve.

These are but a few examples of the limitations of the ways in which networking and security are implemented on AWS - there are plenty plenty more.


> Security groups are also limited - the number of rules you can have is pretty small

You can have multiple security groups now, though. Previously you were limited to just the one.


4. In terms of VPC and networking, I am definitely not satisfied with AWS.

- On GCE the security groups are created automatically and managed by the "role" of instances.

On AWS there is no link between instances and security groups. I currently have to emulate the workings of GCE on AWS with extremely complex ansible scripts just so that an instance called "repository" can actually be assigned to a group called "repository".

- ELB (the load balancer on AWS) cannot have a fixed IP (we've been waiting for that for YEARS). An ELB can only be accessed via a DNS name, and the underlying IP can change every 60 seconds. (Have fun with applications that cache DNS.)

By comparison, GCE has had load balancers with fixed IPs forever.

- AWS is a region nightmare. Many resources only exist in one region and cannot be accessed or even acknowledged to exist from another region.

e.g. an AMI only exists in a single region. It's not possible to create a host with it in another region. It's not even possible to take that existing AMI and push it to another region. (As far as a region is concerned, other regions don't exist).

All services on AWS are acutely region-centric. I am currently expanding my infra to multiple regions and there are many obstacles to overcome. The networking and interconnections are definitely among them.

By comparison Google is not region centric like that. It's a lot easier to manage at planet scale.

---

Just three major pain points on the top of my head.
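For the ELB point above, the usual workaround is to never cache the DNS answer at all: re-resolve the ELB hostname on every connection. A minimal Python sketch (using a stable hostname here as a stand-in for an ELB name):

```python
import socket

def resolve_fresh(hostname, port=443):
    """Resolve on every call; ELB IPs can rotate, so never cache the answer."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # sockaddr[0] is the IP address string.
    return sorted({info[4][0] for info in infos})

# An actual ELB name would return a different set of addresses over time
# as Amazon rotates the underlying load balancer instances.
print(resolve_fresh("localhost"))
```

For JVM apps the equivalent fix is capping `networkaddress.cache.ttl`, since the JVM caches resolutions forever by default.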


Can you use VPC to create an environment that blocks outside access to Internet users completely, but connects to a primary hosting environment hosted elsewhere so that systems operating in both locations can communicate freely without concern for the AWS environment being exploited to access your primary server farm?


This is one of the main reasons we created Wormhole Network[1]. You can block all incoming traffic to your VPC and still be able to freely communicate with your resources in other locations.

The good thing is that you can easily add and remove devices / users to your network. We require an agent to be deployed, which is SoftEther's VPN client (free, open source, known VPN client).

[1] https://wormhole.network


Yes, you can have VPN endpoints to your VPC, where the other end is in whatever external network you want.

And if you don't want any external communication at all (barring your VPN), just remove the 'IGW' from your VPC (or make a new VPC without one). Or modify the VPC's routing tables. Or don't assign any public IPs to anything in the VPC. Or probably a few other methods :)
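To make the "no external communication" setup above concrete, here's a toy Python model of that checklist. This is not the AWS API; the route-table dicts and the `igw-` prefix check are simplifications (and NAT gateways are ignored):

```python
def has_internet_egress(route_table, public_ip=True):
    # Toy model of the parent's checklist: internet access needs BOTH a
    # default route pointing at an internet gateway (igw-*) AND a public IP.
    default = [r for r in route_table if r["dest"] == "0.0.0.0/0"]
    return public_ip and any(r["target"].startswith("igw-") for r in default)

# A "sealed" VPC: the default route goes to a VPN gateway, not an IGW,
# so traffic can only reach the peered corporate network.
sealed = [
    {"dest": "10.0.0.0/16", "target": "local"},
    {"dest": "0.0.0.0/0",   "target": "vgw-0abc123"},
]
print(has_internet_egress(sealed))         # False: no IGW route
print(has_internet_egress(sealed, False))  # False: no public IP either
```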


The argument in the original article against VPC sounded more like "it's confusing, so it's not worthwhile." VPC is confusing if you don't understand basic networking, but most people can pick it up, and people coming from a physical datacenter, moving to the cloud, will get it almost immediately.


> looks like a tool from the 90s

I see this criticism a lot for various things... and the things in question never look like the tools and/or websites I saw in the 90s. Instead, they look like things we'd dream of having in the 90s.


Word of caution - we've observed 2 global (all regions down concurrently) networking outages on GCE this year. 18 minutes on April 11 and 1.5-1.7 hours last night (except for us-west1 - down only 10 minutes):

https://status.cloud.google.com/incident/compute/16015 https://status.cloud.google.com/incident/compute/16007 https://cloudharmony.com/status-for-google


Wait: seriously, a 1.5 hour multi-region GCE outage just last night? That would be huge, how is it that we haven't heard about this? The linked /16015 status report doesn't have much information.


The incident report isn't done yet, so please be patient as the team finalizes the root cause and next steps. However, it didn't affect all routes, VMs, etc. and you'll see that when they publish the IR next week.


FYI - we verified outages from hundreds of last mile routes using Ripe Atlas probes with failure rates in the range of 86-96%.


"how is it that we haven't heard about this?"

Not enough customers to complain / notice.

Gartner report in the last week or two pointed out that Amazon is way in the lead with more cloud capacity than all the other providers combined.

The top two clouds are AWS and Azure. When either of those have major incidents you hear about it all over twitter etc. because that's a large user base that gets impacted. Google is in third place (according to them), but dramatically far behind. They do praise Google's big data tools, but point out even then people use AWS for most stuff and Google just for their big data bits.


I think partially due to the timing - 12-2AM PT vs 6-7PM on April 11. Google hasn't provided specifics - but our VMs in every region were 100% inaccessible during that period.


Wouldn't the phrase be "down simultaneously"?


I think concurrent is better in this context because the outage periods overlapped but were not synchronized.


Yes, yes, yes, I am also a GCP cheerleader. It's so easy and intuitive to use. A load balancer is called a load balancer, an instance is called a compute instance, SQL is called SQL etc.., no AWS name deciphering. GCP's sidebar UI navigation is a dream compared to AWS's mind numbing grid of features. I had a fault tolerant highly scalable system up in hours, and I'm not a dev ops guy, I mostly do frontend.


For me, it is primarily about Postgres. If Google Cloud had an equivalent offering to Postgres RDS then I would seriously consider it. I think RDS is one area where AWS is significantly better than many of their competitors. You can use any of the major relational databases and they handle backup, failover, read slaves, encryption at rest, etc.


We hear you (and all of you downthread) and we naturally happen to build what people ask for. Thanks for comments like these though, the PMs here literally link to HN all the time (if you want to help more you can ping them with "and my AWS bill was $XXk this month").

Disclosure: I work on Google Cloud, so I'll probably come back to you and hold you to this statement...


I'm mostly glad to hear that you have product managers. A lot of what comes from Google just seems to be engineering wandering randomly in and out of the latest shiny thing.


Ditto. If Google offered an equivalent of RDS PostgreSQL, I'd switch tomorrow.


Even if it were in Alpha / Beta? What if it didn't have point-in-time recovery, etc.? Just looking for when we can pencil you in ;)


Dear GCP Product Managers are you listening?! Please add this :)


Yes. We are listening.

Disclosure: I work on GCP


Same. We just switched to Amazon mostly due to Postgres RDS (and $100k in credits, but that's another story). When Google gets their Postgres game on, we'll switch for sure.


Me too!


Another vote for PostgreSQL. I've recently completed an AWS migration project, am about to start another, and really wish that I could have pitched GCP. The two reasons that I didn't were the "Beta" tags on things (in particular, App Engine Flexible Environment), and the situation with hosted SQL (1st gen vs. 2nd gen, both MySQL-only).


Same.


I wish compose.io would flesh out their product line on DO, GCE and Azure... If you're mainly wanting PostgreSQL, Database Labs will support custom deployments, so you should be able to get it on GCE.

Along similar lines, I really find Azure SQL tempting to use as well, may test the latency from DO's SFO location to Azure's West US.

https://www.databaselabs.io/


First I've heard of databaselabs.io, but it looks interesting. I wonder why they don't seem to offer an HA option (at least I didn't see it mentioned).


Probably a matter of their fleshing out their tooling... not sure; in my case the continuous backups would be enough for my personal projects.


Same here, biggest reason to be in AWS is Postgres RDS which works like a charm!


Yet, RDS can't scale up much. You are stuck with a fixed number of HA instances, and a fixed number of read replicas. If you want a cluster of twenty nodes, roll your own.


Did you ask them to increase the limit? They usually increase the limit (regardless of service) if you ask. The current limit is 40 instances and 5 read replicas per master.

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_L...


This is one area I think Azure does shine... Azure's storage tables, blob and queues are dead simple and inexpensive to work with, and Azure SQL is priced better than any cloud db competitor.

Keeping hobby projects under $10-20/month is sometimes hard...


This is the reason that keeps me from trying Google Cloud as well. Not to mention a lot of 3rd party services aren't on Google Cloud yet either.


GCE has some major limitations. They just introduced static IP addresses two months ago... and I had to re-create all my servers to make use of the static IP.

Then, you either assign a public IP to each machine, or build yourself a solution... like a NAT gateway. Google does not offer a service like this.

The machines that are part of the same subnet cannot talk to each other directly... they have to go via the Google gateway. This reduces the possibility of implementing high-availability solutions that are based on broadcast or virtual IPs (MS SQL, Windows failover, VRRP, etc.).

Labels that you apply to VMs or firewall rules are free text, without the possibility to select from a list of existing labels... accidents can happen.

The MySQL solution is only reachable via the INTERNET!!! They seem to have a beta for using a local socket if you install a SQL proxy on each server where you need SQL db access... but not in production.

No support for MS SQL.


You'll note that the AWS NAT gateway has only been available for less than six months.

Any company that started before then had to implement its own NAT gateway (we did). And with the shitty networking (150 Mbps tops on large instances), that was the bottleneck for the whole subnet/VPC.

GCE has a lot better networking and 1 Gbps on their instances, which makes them way less sucky than AWS for anything routing-related.

Anyway. I migrated part of our cloud infra (one hundred instances, a few subnets) to the new NAT gateway recently. It's working well so far, but a few notes come to mind.

You gotta use the latest terraform (or cloudformation; tip: TERRAFORM IS BETTER). You may have to play with the release candidates if you started early.

Ansible doesn't support the NAT gateway at all. (And probably puppet/chef/salt don't either.) I doubt they will before next year.

Just my 2 cents about the NAT gateway. Let's not be too quick to judge Google when AWS just released the feature a few months ago and it's only now becoming available in tools.
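For reference, wiring up the managed NAT gateway takes only a few lines of Terraform; the resource names below are placeholders and assume an existing VPC with public and private subnets already defined:

```hcl
# Hypothetical names; assumes aws_subnet.public and
# aws_route_table.private exist elsewhere in the config.
resource "aws_eip" "nat" {
  vpc = true
}

resource "aws_nat_gateway" "gw" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id
}

# Point the private subnet's default route at the managed gateway.
resource "aws_route" "private_default" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.gw.id
}
```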


Another odd limitation: No support for internal load balancers; all LBs must be public. They have it as an alpha release available on request, so it's coming, but it's a serious deficiency right now.

The lack of a hosted VPN solution is also weird.


AWS ELB limitations: No static IP. Only accessible through a domain name like "elb-role-12345.eu-west.aws.amazon.com" and the underlying IPs can change every 60 seconds.

The feature has been missing for years, it doesn't have any workaround, it's not being worked on anytime soon.

AWS vs GCE: Pick your limitations ;)


You don't have to tell us, internal LB was a long time coming. As for hosted VPN, do you mean something different than our Cloud VPN offering? ( https://cloud.google.com/compute/docs/vpn/overview)


By VPN I mean something that can be used individually by developers from their own machine (is there a proper technical term for this type?), not static point to point from a site such as an office (which is what CloudVPN seems made for).

We ended up setting up OpenVPN, which shouldn't be necessary. SSH is great and all, but when you add Kubernetes and other sensitive things like dashboards, you really want to virtualize everything and hide every internal service behind a proper, solid VPN gateway.


I think you're saying you want something like our BeyondCorp setup: moving from the office to a coffee shop doesn't change that you're still the authenticated user that we trust, you're just on a sketchy network now.

That's feasible but only things that do certificate exchange (https etc is easy enough). For what it's worth, part of the reason for Cloud Shell (the little shell within the web console) is so that you can interact with your infrastructure from inside the GCP network. That doesn't solve all your needs (like dashboards or deploying code from your laptop) but is often quite handy.


>is there a proper technical term for this type?

Generally individual users connecting to a VPN are called road warriors, even if they're not on the road and always connect from the same place.


Not type of developer, I mean type of VPN. :-)


Yes, that's what I mean. It is called a roadwarrior VPN. As opposed to a point to point VPN.


Ah, okay. I've never heard that term before, except about people.


To the point on MS SQL: tbh, if you need MS SQL, then SQL Server on Azure proper, or Azure SQL, will both be far less expensive than on any other cloud platform, even offsetting any pricing difference for your application VMs.

I'll be testing some latency between DO and Azure just so I can use some of Azure's services while keeping my applications on DO. Also considering Database Labs (databaselabs.io) for PostgreSQL on DO. Though I do like some of Azure's other offerings... Azure Tables are pretty nice if your needs fit the paradigm; I even wrote some node.js wrappers to make access a lot easier a couple of years ago, and if I go with Azure I will likely update them.


There was so much handwaving in that article that if you put your palm in front of the screen, you'd hear clapping.


"Moreover, people you invite to projects must be Gmail or Google App users"

Potentially a deal-breaker for us. We had a LOT of issues accessing various Google services after switching from Google Apps to Office 365, which is cheaper and includes the office suite which the rest of the world use.


Google PM here.

That's not exactly right. You need a Google account, but it does not need to have gmail attached to it. Yes, lots of people don't know such a thing exists...but it's easy to set up.

On the "standard" Google account creation page is a link that says something like "I prefer to use my current email address."

As long as you have an email address that Google can validate, it's fine to use that instead of a gmail address.


Just create accounts for logging into GCP? They don't really have to use gmail or be logged into Google all the time. They could use a separate browser for GCP. I'm doing that when I have to use Google Drive.


Just the linkage into a Google account is still a liability: last time I tried to register with an employer email address, the process failed and further registration attempts with that email address were blocked. And there was no number to call.


I'm slowly moving some of my company's stuff over from AWS to GCE/GKE. The platform may not be as mature, but it feels a whole hell of a lot more well thought out and put together. Also, hosted Kubernetes is a big deal for us, since we really don't want to manage that ourselves but would prefer to use an open source offering for containers compared to ECS.


Glad to hear it!


I've tried both AWS and GCP (as well as several smaller VPS providers), and I definitely preferred working with GCP. AWS has plenty of features, but it all gets bewildering at times, whereas GCP seems to have a much more nicely presented UI.

But I have $15,000 in AWS credits from Stripe's Atlas program, and that was enough to make me reluctantly migrate from GCP to AWS. It means I can do things the "right way" instead of the "cheap way".

It's unfortunate Google couldn't offer something similar.


I thought the opposite. Google's UI is bewildering; nothing makes logical sense in how it's put together. Disjointed pieces. Poor documentation. Incomplete API samples that didn't compile, even for its own language (golang).


Agreed. And this isn't the first Google interface that I find completely unintuitive. My first foray into the Google docs was trying to get OAuth working. The developer console is confusing and not laid out well. Documentation can sometimes point you in one direction, only for you to realize you're about to implement something that will be end-of-life in months.

AWS Console is not perfect, but IMO, it's much more clear where I need to go to find things, even without reading a single piece of documentation. And I've rarely had issues with AWS documentation not being clear and correct.


I only use the Google Apps for Business from Teh Goog, and while it's nice, the documentation is hard to navigate and not so hot. I expect a little bit of bitrot from such large product lines in the documentation so I understand a bit of staleness, but finding the right documentation and then trying to skim through it... it's frustrating and poorly laid-out in Google, in my opinion.

Maybe GCE docs are better, though.


I think they all tend to be a little like that... I'm thinking compose.io is in a good position to offer dbaas on the various cloud solutions, docker cloud for app management...

Could be the best of both worlds there... gitlab's work for CI/CD is interesting as well.


Agreed. I was moving my personal VPS from Linode earlier this year (after way too many security incidents from them). My first port of call was GCE, but in the end the UI just annoyed me. It took a while to wrap my head around and then was still confusing in various ways. I'm not going to write a tool to interact with an API for a once in a blue moon thing like "give me a server". I looked at AWS and Azure as well, and their consoles aren't much better (I think I preferred Azure's one, if anything)

In the end I went with Digital Ocean and had a server up and running in barely a few minutes, much quicker and easier.


Depending on your situation, we offer up to $100k in credits for startups (https://cloud.google.com/startups)


Sort something out with Stripe Atlas!


Reads more like an advert for a buffet than a reasoned take on the architectural and business tradeoffs. Detailed price breakdowns are difficult to read? That's your argument? Really? "By developers, for developers!" so original. [0]

I expect more effort to go into this from a technical perspective. Give me some workload/price differentials. Examples of architectures that are simpler on GCP than on AWS.

My app relies on a hybrid of Firebase and AWS for the heavy assets. Still trying to justify the move to Google Firebase for the full stack. I'm the only dev so it's time and energy I can't spend on frontend iteration helping real users.

0. http://pbskids.org/zoom/


In a side thread, the quizlet folks mention their real-world experience and reasoning for coming to GCP. In your case, as a Firebase customer, the integration is already there and you could start adding pieces over time. As the integration between Cloud and Firebase improves, I'm guessing you'll find more reason to "just do it on Google Cloud". Ping the Firebase people about their current Alpha programs ;).

Disclosure: I work on Google Cloud.


Good. We're finally seeing people migrating to the better platform =)

This article is pointing out that most of the GCE services are simpler and a lot easier to use and manage than the AWS counterpartS. (That final S being important: having many small and unpolished pieces just contributes to the complexity.)

It goes hand in hand with what I had already written about the platform. Expect GCE to be 20-50% cheaper and 1-3x faster:

https://thehftguy.wordpress.com/2016/06/15/gce-vs-aws-in-201...

https://thehftguy.wordpress.com/2016/06/22/a-simple-cost-com...

When you see that GCE has less market revenue, you should keep in mind that GCE could be half the price of AWS on average (yep, no kidding). That means the actual gap in customers is way smaller than it appears at first sight ;)

With that being said, both platforms are relatively new and spinning up services like mad, and they both have important features missing (albeit not the same ones).


> Amazon’s awesome, but Google Cloud is built by developers, for developers, and you see it right away.

Isn't that what AWS was also built for initially? For internal use by developers? I see this statement from time to time in articles like this.


It is marketing material, not an article. Very thin on details, with every point somehow being in GCE's favor? Is there nothing AWS does better, or that they liked better?


AWS is built by monkeys with humanities degrees.


So triggering. But mainly my fear in using GCP is this: Google is a product psycho, and this is not their core business. I get the feeling that anything about GCP could change or disappear at any time for any reason, and I as the customer am left to react to these changes.


Google has been putting a lot of time and effort into cloud in the recent years and doesn't look to be slowing down anytime soon.

As to ease your worry here a little, any launched cloud products have at a minimum a 1-year deprecation policy[0]. Anecdotal: AppEngine deprecated their Master/Slave datastore[1] in April 2012, and it was actually shutdown on July 6, 2015. So that's 3+ years to move to newer tech.

[0] https://cloud.google.com/terms/ (section 7.2)

[1] http://googleappengine.blogspot.com/2012/04/masterslave-data...


Here's the thing: 12 months' deprecation notice is abysmal for a key business demographic, enterprise customers. That's lightning fast for many of them. Planning and moving that much infrastructure, data, etc. is non-trivial and takes a lot of time and hard work.

You point out examples of longer deprecation, but that doesn't matter one iota to people evaluating the cloud. Track record isn't legally binding, unlike the contractual 12 months.


Their record is 4yrs for ClientLogin:

https://cloud.google.com/appengine/docs/deprecations/


I agree that the OP is terrible, but it looks like GCP has quarterly revenue close to a billion [1], while AWS is at 2.6 billion [2]. That doesn't seem like such a huge gap that one of them is substantially more invested in it.

[1]: http://www.networkworld.com/article/3029164/cloud-computing/...

[2]: http://www.recode.net/2016/4/28/11586526/aws-cloud-revenue-g...


Not even close; that figure includes Google Apps for Work and other SaaS services.


Yeah I totally get you. But my experience with GCP since its inception (I'm an early adopter) is different than the rest of Google. The only "nasty" I've encountered like this was the deprecation of the Master/Slave Replication Datastore to the newer High Replication Datastore, which, to be fair, they gave me almost a year to remedy before one of my app instances became inaccessible. I blame myself, but it was still painful.


> they gave me almost a year

3. They gave you more than 3 years :)

https://cloud.google.com/appengine/docs/deprecations/


I stand corrected!


> I get the feeling that anything about GCP could change or disappear at any time for any reason and I as the customer is left to react to these changes.

Google is very committed to GCP and you see it in their work. Much like Amazon, they have a huge sum of computing capacity they aren't always using. It's an obvious business play.

But what's more, it's become the basis of differentiating their mobile platform. Google is trying to develop and sell-for-cheap the tools to make really good mobile apps (e.g., Firebase) to try and shore up the app ecosystem of their platform.

Given that a lot of the things they offer are 3rd gen variants of their core tech for developing things internally, it seems fairly reasonable to assume they're in this for the long haul.


It's a reasonable assumption, but still an assumption and a risk to take on. There are really compelling things about GCP not even mentioned in the article but I'm not sure if Google's got enough skin in the game to warrant me predicating my infrastructure on it without deep pricing incentives.

Digital Ocean and AWS are dedicated to continuing this line of business and keeping those customers addicted. Google is conducting another large and expensive moon shot, which is fine since it's part of their DNA.


They've been in the business for 8 years with steady progress. What's more, their container-based solution is also an open source platform that runs competently on Azure and AWS.

I'm not sure you can say it is an expensive moonshot at this point.

I'm not using them, myself, but I wouldn't consider that aspect of the risk substantial given the fact that I could erect identical infrastructure on AWS in the worst case scenario. And I have had to plan exit scenarios from AWS because ultimately they're a single vendor with arbitrary control over their platform, as well.


Not enough skin in the game? They're bringing an additional 10 datacenters online in 2017, and they just opened an additional 2 in 2016. Building that many data centers costs billions. If that's not having "skin in the game", I don't know what is.


It hasn't been their core business historically, but all they're doing is commercializing stuff that's been in internal use for many years. And they've publicly been making a VERY hard push into the cloud business... they aren't going anywhere.


They hired Ray Kurzweil too, I'm sure they're trying to commercialize the singularity.

GOOG/Alphabet as a company is going to be around, but will this line of business be with this set of offerings?


> Google is a product psycho, this is not their core business

So you're advocating we should trust the book vendor?


This isn't a fair criticism. Amazon is in the Cloud Services business and has a retail business on the side. Look at where their profits are: 66% of Amazon's profits last quarter are from its cloud biz.


It's not a matter of "trust" - I don't trust the plumber to fix my toilet more than the rando I can find on Craigslist per se, but I do have the expectation that he intends to continue plumber-ing and will be around to stand by his work in three months time.


AWS is the cash cow of amazon for now. They get their safest margins there and revenue to match.


Yes, exactly.

That's why GCE should be as important to Google as AWS is to Amazon while arguably both have a different "core business".


After ten years of AWS, and with the vast operating income AWS is now providing Amazon - yep.


The public cloud space is going to be ginormagantuan. It's just too big to ignore. GCP is Google's Xbox: I expect they'll throw money at it for a decade, if they have to, because it's too fantastic a market to ignore.

AWS is there through first-to-market leadership.

Microsoft is there because they've got a strong sense of what every SME and Enterprise customer in the world wants.

Google are there because they have deep expertise in building these kinds of systems and it would be utterly bonkers not to try and diversify into this field.



The same way TodoMVC [1] has standardized JavaScript framework "Hello World" comparisons, someone should make an up-to-date CloudApp matrix comparing a basic cloud benchmark app across Amazon, Google, DigitalOcean, Vultr, etc., so users can see basic price/bandwidth/performance.

There's so much Enterprise Buzzword Bingo that Amazon and Google use to try to confuse you as to why their service is magically better, but IMO the biggest selling points for startups are price/bandwidth/performance. And the prices and machine configurations change all the time.

[1] http://todomvc.com/
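Such a matrix could start as small as a dict and one ranking function. The instance names are real-ish but every price below is a hypothetical placeholder, not a real quote (the whole point is that the real numbers change constantly):

```python
# Hypothetical placeholder figures only -- real prices change all the time,
# which is exactly why an auto-updated matrix would be useful.
OFFERINGS = {
    "aws-t2.micro": {"usd_month": 9.5,  "vcpu": 1, "ram_gb": 1.0},
    "gcp-f1-micro": {"usd_month": 4.1,  "vcpu": 1, "ram_gb": 0.6},
    "do-1gb":       {"usd_month": 10.0, "vcpu": 1, "ram_gb": 1.0},
}

def usd_per_gb_ram(offerings):
    """Rank offerings by one crude metric: dollars per GB of RAM per month."""
    return sorted(
        ((name, round(o["usd_month"] / o["ram_gb"], 2))
         for name, o in offerings.items()),
        key=lambda pair: pair[1],
    )

for name, price in usd_per_gb_ram(OFFERINGS):
    print(f"{name:15s} ${price}/GB-month")
```

Real benchmark numbers (bandwidth, disk IOPS, CPU scores) would slot into the same per-offering dicts.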


"Amazon’s awesome, but Google Cloud is built by developers, for developers, and you see it right away."

AWS is built by developers for developers as well.

Most of the complaints here about AWS aren't actually about AWS not being "for developers", but about AWS requiring a certain learning curve.

It's a perfect trade-off between power and flexibility vs agility.


I think that it's fairer to say that it's built by enterprise technical people for enterprise technical people. Things like the fine-grained control over networking and resource permissions are hallmarks of enterprise tech.

That's not to say that these things don't solve real problems, but they are the problems of big organizations. Smaller teams building pure-cloud products don't have the same problems, and may not even have people with Big Corp experience.

AWS is the Java of cloud platforms.


What software targeting developers is not "built by developers, for developers"? Every language, IDE, compiler, etc.


"Moreover, people you invite to projects must be Gmail or Google App users" - I'm trying to figure out why walled garden lock-in is being treated as a feature.

It sounds like a lot of his reason for switching is simplicity. From this article, it sounds like AWS has a lot more options and control, and GAE just does what it wants or thinks is best. I suppose to some, the latter may be appealing, but at least the author is honest with a "your mileage may vary". Willing to bet this simplicity comes at the cost of a lot of options other companies may wish to use.


You moved because you did not understand how to run AWS wisely.


Most people don't have time for that. Just give me kubernetes or GAE.

I'm convinced Google's infrastructure play will ultimately win the hearts and minds of developers because they understand this.


There's an implication that AWS requires more effort to understand how to run it wisely than google's offering. Depends on lots of factors, obviously.


You should always understand the underlying fundamentals of something you're using. If not, be prepared for the eventual pain.


I'm sorry, but this is not a sustainable strategy for modern society. Consumers, entrepreneurs, corporations, they all interface with technology and social constructs they do not fully understand as part of daily life.


And enough of them get burned by that lack of knowledge. Some may be comfortable with that lottery, others are not.

For most, it depends on the risk of an issue (mostly measured ad-hoc by how many times you hear of others having an issue) vs the potential pain one would feel if an issue arises.

For a business's core infrastructure, I've heard enough so that the risk is high enough, when taken together with the potential loss of everything, that I make it a point to spend time and effort to understand as much as I can about what I rely on.

For walking down the sidewalk, no one is suggesting that you understand the fundamentals of concrete curing. The risk of something happening to you must be practically zero, and the potential harm if something does also seems quite low.

Of course, there's always going to be a success story in the midst of the crowd where someone flew by the seat of their pants and everything worked out.

I happen to think that's a minority, and I also don't feel comfortable gambling with my business like that. But to each their own.


Sure, but isn't that a selling point of GCP?

If GCP can bring 80% of the benefits of AWS at a reasonable price for shops without the requirement of deep AWS & devops expertise, isn't that a really excellent place to occupy?

People here often seem to have an implicit assumption that AWS's complexity is either inevitable or inherently of value. I'd point to Azure as another alternative rejection of that viewpoint that serves a wider audience of technical skill levels than GCP. And honestly, I really like working with Azure.

I've built a few businesses on AWS now, and I know and trust it. But it has a ton of infuriating features and many many things they've promised have been delayed years. I'm not opposed to a new CSP at all. Why would anyone be?


The problem with a solution that covers 80% of your use case is that when you need something out of that remaining 20%, you either need to build it yourself or migrate.

AWS IAM policies are a good example - yes they are complicated and hard to navigate, but when you need to do complicated grants of access to your resources, then you'll appreciate the complication.
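For a flavor of what "complicated grants" look like in practice, here's a sketch of an IAM policy of the kind described: read-only access to a single S3 prefix, allowed only from one CIDR block. The bucket name and IP range are placeholders.

```python
import json

# A hypothetical fine-grained IAM policy: s3:GetObject on one prefix,
# restricted by source IP. "example-bucket" and the CIDR are made up.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/reports/*",
            "Condition": {
                "IpAddress": {"aws:SourceIp": "203.0.113.0/24"}
            },
        }
    ],
}
print(json.dumps(policy, indent=2))
```

Debugging why a request matched (or didn't match) a stack of statements like this is where the time goes, but there's no simpler way to express this kind of grant.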


That assumes that that you will need to do complicated things, which I suspect depends on your organization.

If you are part of a larger business with an established system administration function, then the flexibility of AWS is probably appealing, because you can transpose your existing set-ups and practices on to it.

For smaller organizations, particularly new "cloud-native" ones that may have never owned a server and might not have big administration teams, AWS looks like a bad fit.

Learning AWS is now more complicated and a bigger time investment than any technology that I've encountered for years. We operations people are used to doing a lot of learning up-front for gnarly and sometimes old tech, but it's not how developers engage with new platforms today.

If GCP can deliver the 80% without the huge time investment that mastering AWS now requires then over time, it's going to win the green-field projects.


I have done in-depth work with AWS, and the OP is correct. The problems are with AWS, not the victim.


Laziness can be a virtue :) In this case, it saves you from AWS lock-in. I'm happy to use a managed database as long as it speaks SQL, but I'm not going to use a proprietary one for as long as I can avoid it.


We've been using AWS since nearly the day it was released. Overall I really like it, but I haven't dived deep into the GC platform so I cannot really give a definitive comparison.

A couple of things that baffle me about AWS though -

* You can purchase reserved instances from anywhere in the world in a few seconds, but WHY, when I want to on-sell an unused reserved instance in the marketplace, do I HAVE to have a US bank account for Amazon to pay my revenues into?? Why can't they just credit my existing account, which I pay thousands per month into?

* We still battle with occasional high latency spikes for EC2 instances talking to RDS instances within the same VPC! Continuous "MySQL server has gone away" log messages. Ironically, we have absolutely no trouble with Non-VPC EC2 instances that use classic bridging to talk to the same RDS server within the VPC!

We thought to simplify our architecture by putting everything in one VPC for better security and ease of maintenance, but had to go back to a cobbled half and half solution to maintain performance. Hence why we now have a few reserved 'VPC only' instances which we cannot recoup costs by onselling in the marketplace... :(


Raise a ticket; I see no issues VPC => RDS endpoint at all. Might be something gone sideways in a layer you can't see.


You can modify your RI's from EC2-VPC to EC2-Classic


I use AWS extensively and it has many benefits. That said, I think they need to rethink some things if they want to stay ahead of Google, Digital Ocean, etc.

Some thoughts:

## More focus on core products

- Still no IPv6, it's 2016.

- EC2/VPC (compute) didn't have a simple NAT-instance until a few months ago.

- S3 (storage) still has nothing like Nearline (GCS) and Glacier is so confusing and complicated that few bother using it.

- ELB (Load Balancer) still has scaling problems (with bursts).

- Still only a single public key per instance, why?

- Can only use ACM certificates with CloudFront if they are in us-east-1 (despite ACM being available in other regions and CloudFront being global). Nothing major, but weird.

- S3 has like a handful of different ways of setting permissions (per file, per bucket, IAM, and some weird old XML format). Why not simplify this?

## Interfaces are terrible

- Basically the entire console UI is low quality.

- Same with APIs, they could use a lot more polish.

- This is the API call to create a CloudFront (CDN) distribution: http://pastie.org/10931494 (no offence, but it looks like a group of schizophrenic monkeys designed that API).

## Think less about press-releases

- Apparently AWS forces everyone to write the press-release for every product before designing it[1].

- While it's good to keep the end-goal in mind I think many important details are lost.

- The press-release thinking focuses too much on features and too little on quality and (even more importantly) refinements to existing things.

## Release fewer features

- Every year on ReInvent [AWS conference] they tout how many hundreds of new features they've released[2]. I wish they'd release far fewer features and make them great instead (+ refine existing stuff).

[1] http://www.allthingsdistributed.com/2006/11/working_backward...

[2] https://s3-eu-west-1.amazonaws.com/vpblogimg/2015/04/Aws-Sum...


> ## Interfaces are terrible

>- Basically the entire console UI is low quality.

I wouldn't bet on them fixing this. Amazon in general are not big on design and in particular they aren't about making things look good. I think in general they see it as a waste of time, something that can always be done later after they have beaten their competitors. Obviously there's a threshold for this sort of neglect but they seem pretty expert at riding it.

Frankly what you said about it is polite. The AWS web interface is horrendously ugly and just barely functions well enough to be used. It's a testament to how little they care about good design but then again their consumer-facing website is no peach either.


>- EC2/VPC (compute) didn't have a simple NAT-instance until a few months ago.

Managed NAT is new, but the NAT instances that previously existed could be spun up without any configuration beyond disabling the src/dst check and configuring a route to point to them, and they've existed for many years. http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_NA...

>- S3 (storage) still has nothing like Nearline (GCE) and Glacier is so confusing and complicated that few bother using it.

S3 Infrequently Accessed? https://aws.amazon.com/s3/storage-classes/


What gets me is that S3 is probably the easiest thing to learn about in AWS. I would understand if he complained about the million things you can do in EC2, but with S3 there is not much choice: S3, S3-IA, S3-RR. Throw in Glacier and there still is not much to get confused about.


I'm using S3 extensively and know the product pretty well.

Still, I think Nearline easily beats Glacier on simplicity and clarity. With Nearline it's super-obvious how long retrievals take and what the costs are.

--------------

## Nearline

Q: How much does it cost?

1 cent per GB & month in storage. Plus 1 cent / gb of transfer (retrieval).

Q: How fast can I retrieve data?

Access times sub 1 second.

SOURCE: https://cloud.google.com/storage-nearline/

--------------

## AWS Glacier

Q: How will I be charged when retrieving large amounts of data from Amazon Glacier?

You can retrieve up to 5% of your average monthly storage, pro-rated daily, for free each month. For example, if on a given day you have 75 TB of data stored in Amazon Glacier, you can retrieve up to 128 GB of data for free that day (75 terabytes x 5% / 30 days = 128 GB, assuming it is a 30 day month). In this example, 128 GB is your daily free retrieval allowance. Each month, you are only charged a Retrieval Fee if you exceed your daily retrieval allowance. Let's now look at how this Retrieval Fee - which is based on your monthly peak billable retrieval rate - is calculated.

Let’s assume you are storing 75 TB of data and you would like to retrieve 140 GB. The amount you pay is determined by how fast you retrieve the data. For example, you can request all the data at once and pay $21.60, or retrieve it evenly over eight hours, and pay $10.80. If you further spread your retrievals evenly over 28 hours, your retrievals would be free because you would be retrieving less than 128 GB per day. You can lower your billable retrieval rate and therefore reduce or eliminate your retrieval fees by spreading out your retrievals over longer periods of time.

Below we review how to calculate Retrieval Fees if you stored 75 TB and retrieved 140 GB in 4 hours, 8 hours, and 28 hours respectively.

First we calculate your peak retrieval rate. Your peak hourly retrieval rate each month is equal to the greatest amount of data you retrieve in any hour over the course of the month. If you initiate several retrieval jobs in the same hour, these are added together to determine your hourly retrieval rate. We always assume that a retrieval job completes in 4 hours for the purpose of calculating your peak retrieval rate. In this case your peak rate is 140 GB/4 hours, which equals 35 GB per hour.

Then we calculate your peak billable retrieval rate by subtracting the amount of data you get for free from your peak rate. To calculate your free data we look at your daily allowance and divide it by the number of hours in the day that you retrieved data. So in this case your free data is 128 GB /4 hours or 32 GB free per hour. This makes your billable retrieval rate 35 GB/hour – 32 GB per hour which equals 3 GB per hour.

To calculate how much you pay for the month we multiply your peak billable retrieval rate (3 GB per hour) by the retrieval fee ($0.01/GB) by the number of hours in a month (720). So in this instance you pay 3 GB/Hour * $0.01 * 720 hours, which equals $21.60 to retrieve 140 GB in 3-5 hours.

First we calculate your peak retrieval rate. Again, for the purpose of calculating your retrieval fee, we always assume retrievals complete in 4 hours. If you request 70GB of data at a time with an interval of at least 4 hours, your peak retrieval rate would then be 70GB / 4 hours = 17.50 GB per hour. (This assumes that your retrievals start and end in the same day).

Then we calculate your peak billable retrieval rate by subtracting the amount of data you get for free from your peak rate. To calculate your free data we look at your daily allowance and divide it by the number of hours in the day that you retrieved data. So in this case your free data is 128 GB /8 hours or 16 GB free per hour. This makes your billable retrieval rate 17.5 GB/hour – 16 GB per hour which equals 1.5 GB/hour. To calculate how much you pay for the month we multiply your peak hourly billable retrieval rate (1.5 GB/hour) by the retrieval fee ($0.01/GB) by the number of hours in a month (720). So in this instance you pay 1.5 GB/hour x $0.01 x 720 hours, which equals $10.80 to retrieve 140 GB.

If you spread your retrievals over 28 hours, you would no longer exceed your daily free retrieval allowance and would therefore not be charged a Retrieval Fee.

Q: How is my storage charge calculated?

The volume of storage billed in a month is based on the average storage used throughout the month, measured in gigabyte-months (GB-Months). The size of each of your archives is calculated as the amount of data you upload plus an additional 32 kilobytes of data for indexing and metadata (e.g. your archive description). This extra data is necessary to identify and retrieve your archive. Here is an example of how to calculate your storage costs using US East (Northern Virginia) Region pricing:

Your storage is measured in “TimedStorage-ByteHrs,” which are added up at the end of the month to generate your monthly charges.

Q: How long does it take for jobs to complete?

Most jobs will take between 3 to 5 hours to complete.

SOURCE: https://aws.amazon.com/glacier/faqs/


Ha! This is a great "write up". I'll pass it along to our heads of storage to put on slides ;).

I'd like to add one piece I always forget: with S3-IA, you pay for a minimum of 128 KB, which for apps with tons of small objects really adds up!
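Back-of-envelope on that minimum, assuming 2016-era us-east-1 S3-IA pricing of $0.0125/GB-month (check current numbers before relying on this):

```python
# Sketch of the S3-IA small-object penalty: objects below 128 KB are
# billed as 128 KB. The price constant is an assumption, not gospel.
PRICE_PER_GB_MONTH = 0.0125
MIN_BILLABLE_KB = 128

def monthly_cost(num_objects, object_size_kb):
    billable_kb = max(object_size_kb, MIN_BILLABLE_KB)
    return num_objects * billable_kb / (1024 * 1024) * PRICE_PER_GB_MONTH

# Ten million 4 KB objects are billed as if each were 128 KB: 32x the bytes.
actual = monthly_cost(10_000_000, 4)
naive = 10_000_000 * 4 / (1024 * 1024) * PRICE_PER_GB_MONTH
print(f"billed ${actual:.2f}/month, vs ${naive:.2f} for the raw bytes")
```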

Disclosure: I work on Google Cloud.


related industry review;

* http://siliconangle.com/blog/2016/08/05/aws-microsoft-azure-...

day to day i work with AWS, Azure, and Google quite a bit -- from what i've seen, in terms of investment, i would say gartner is accurate.

i'm not sure why digital ocean wasn't taken as seriously.


Hi guys, I'm the original poster. Sorry for joining so late. I read each and every feedback and also updated the medium article. In no particular order:

1) Didn't mean to hand-wave. For the last few months GCP has been the new shiny toy for me. I got overexcited and plan to do a deeper post pitting 2 or more products against each other (e.g. Kinesis vs. Pub/Sub)

2) Not anti-AWS at all. I still find it wonderful: https://lugassy.net/search?q=aws. What really flipped me was this: https://twitter.com/mluggy/status/727764607176159232 and other small annoyances

3) Downsides to GCP. Certainly! Added to the article

4) For developers, by developers. I realize this was tacky. Let me try again with 4 examples: a) trace/breakpoints in production, b) connecting through virtual socket files instead of TCP, c) writing and collecting logs centrally with console.log, and specifically d) Dataflow, which is architecturally brilliant (AWS adds "Streams" to every new product, whereas GCP simply treats every product as a source and/or sink)

5) Yes I favor GCP for simplicity and dev-friendliness. I don't want to train people for AWS devops/gotchas. I don't need sophisticated IAM and VPC to feel smart. GCP ui/quickbar is nicer. For example networking is all on one page. Most products are properly named (CDN instead of CloudFront, DNS instead of Route53, Pub/Sub instead of Kinesis)

6) GAE. I was exposed to some horror stories about its and Datastore's early days. History aside, I use it today through flexible VMs, which is really Docker under the hood + the ability to SSH + easy deployment/versioning + a cool tracer/debugger. My code is Node.js and I haven't had to write anything GAE-specific

Again, loved the comments. Keep it going!


We have several workloads on AWS and we are mostly happy with it in spite of having several gripes. Google does not have a region in Australia, preventing us from even considering it. If anyone from the GCE team is on here, are there any plans to come down under?


"Moreover, people you invite to projects must be Gmail or Google App users which are secure by default and usually already set up."

Uhh...no. Why would anyone assume some random Gmail account is secure by default?


I wish they hadn't called these a "Google Account", but you can use any email provider now to sign in/up with Google.

Disclosure: I work on Google Cloud, so if you sign up I indirectly ... take your money?


With Google Cloud SQL and Amazon Aurora being MySQL compatible, do any of you get the temptation of starting your projects with MySQL instead of PostgreSQL?


Google Cloud SQL runs outside of your network (on a public IP). Thus it doesn't respect your firewall rules and can't use internal network addresses. I run my own replicated compute engine mysql nodes for this reason. Amazon RDS and Aurora were superior DBAAS products when I was using AWS. Kubernetes on Google Container Engine is ultimately what convinced me to switch though.


Really want Google Cloud to be an option for us - but given that they lack a Sydney/Australian data centre means they miss out on a lot.


What would the equivalent of say a $20 node on Digital ocean cost on the Google Cloud Platform?


Between about $25 and $40 depending on how you configure the machine. GCP doesn't offer exactly the same configurations as DO.
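For a rough sense of where that estimate comes from, here's the arithmetic using 2016-era published numbers (verify with Google's pricing calculator; these change). An n1-standard-1 listed at about $0.050/hour, and running it a full month earned roughly a 30% effective sustained-use discount:

```python
# Back-of-envelope GCE monthly cost; both constants are 2016-era
# assumptions, not current pricing.
hourly = 0.050                 # n1-standard-1 list price, $/hour
hours_per_month = 730          # average hours in a month
sustained_use_discount = 0.30  # effective discount for full-month usage
monthly = hourly * hours_per_month * (1 - sustained_use_discount)
print(f"~${monthly:.2f}/month")
```

That lands near the low end of the range; more RAM, disk, or egress pushes it toward the high end.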


Despite me working on GCE, I have to give a shout out to all the hosts like DO that include a hefty amount of network egress in the price. The network is heavily overcommitted (how many of you actually send X GiB/month/droplet?), less reliable, etc., but for those who use it to serve images and other "don't care" traffic it can't (yet) be beat.

Disclosure: I work on Compute Engine and Cloud and want your business.


This is a good tool for cost estimation - https://cloud.google.com/products/calculator/

Google Compute Engine instances will always be more expensive than DO/Linode because GCP offers so much more. GCE instances will only be cheaper than DO droplets if you shut down the instances when you don't need them, as you don't get charged for them when they are off, unlike DO.


I'm in the same boat... prepping a handful of .NET apps to run under Docker (dokku) and migrating from Azure hosting to DO... though I may well continue to use Azure's storage services and Azure SQL... the pricing is a bit better than the alternatives.


The biggest problem with Google Cloud is data store lock-in, which is worse than programming language lock-in.

I would move to them in a heartbeat if they had anything like Postgres RDS... but Google seems to love its own version of MySQL.

Anybody know if this is on the roadmap ?


Why is it such a big deal, why not just run Postgres yourself?


Why can't I run my own servers then? That makes the whole point of AWS moot.

Among everything else, the database is the single most critical piece, and there's huge value in a high-availability system.


Does anyone have feedback running app at scale in Google Cloud for more than one year?


We've been running it for about a year to power Quizlet - overall things have been good and we're happy. AWS and GCP are complicated enough that they're tough to compare holistically, but on most of the things we care about we find GCP to be equivalent or better (sometimes significantly) than AWS. It really does have better networking and disk technology, and the pricing is much better. Here's the analysis we did: https://quizlet.com/blog/whats-the-best-cloud-probably-gcp.

Rough patches are:

- Live migration is sometimes not seamless.

- Pub/sub is missing some core features.


Can I ask what the Pub/sub features you miss are?


Mainly Pub/Sub does not let a single subscription subscribe to multiple topics
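A common client-side workaround (not a Pub/Sub feature) is to fan multiple per-topic subscriptions into one in-process queue. A language-agnostic sketch, where `drain` stands in for a real Pub/Sub pull loop and the topic names and messages are made up:

```python
import queue
import threading

# One pull loop per topic, all feeding a single thread-safe queue,
# emulating "one subscription over several topics".
merged = queue.Queue()

def drain(topic, messages):
    # A real implementation would pull from the Pub/Sub client here.
    for m in messages:
        merged.put((topic, m))

threads = [
    threading.Thread(target=drain, args=("topic-a", ["m1", "m2"])),
    threading.Thread(target=drain, args=("topic-b", ["m3"])),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

results = []
while not merged.empty():
    results.append(merged.get())
print(results)
```

You lose per-subscription ack semantics across topics this way, which is exactly why having it server-side would be nicer.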


Unlike AWS, GAE is very easy to use.


Anyone run Ruby on GAE's new "Flexible Environment"?


Thought about it, but I believe it was the pricing model that made it rather unattractive compared to just running Compute Engines.


Did you also investigate moving to Heroku rather than GAE?


IAM is complex because it is powerful. Unfortunately.


> Want to run a message bus? AWS will make your head spin with SNS, SQS, Kinesis, Kinesis Streams and Kinesis Firehose. GCP has only Pub/Sub which just works and is insanely scalable.

This point is oddly weak in comparison to the others.

> Amazon has one of the most confusing IAM. While it is nice to set up a role to only allows usage for a particular resource from a specific device and times of day, you end up spending most of your time debugging policies.

Haven't used GCP, but I didn't mind IAM. Missed it when I was trying to figure out Azure stuff.

> We moved because we wanted to work on infrastructure that runs YouTube, Gmail and Google Analytics. We moved because Google is fair, much more tech-savvy and launch products that just works.

ugh. Perfectly fine article now just reeks of google "fanboy-ism".


> This point is oddly weak in comparison to the others.

Agreed - it was senseless. SNS/SQS aren't particularly complex and they scale like crazy. Kinesis is something else entirely (N-hour record retention w/arbitrary consumers and offsets) and doesn't even belong in the comparison.

> ugh. Perfectly fine article now just reeks of google "fanboy-ism".

Do Google's core products actually run on GCP these days? I was under the impression they do not.


> Do Google's core products actually run on GCP these days? I was under the impression they do not.

Not really. They're influenced by, but not the same as, the code that runs Google (for instance, Kubernetes is based on Borg).


To add to your point:

> Google lets you set up simple Firewall rules. Amazon gives you VPC, security groups, network access control lists and a big, fat headache.

AWS also gives you firewall rules, in addition to the other options. If you don't get why you would need them, you don't really have to use them.


I found Google App Engine difficult to work with as recently as two years ago, just for a small proof-of-concept app. It took a lot of shoehorning to use third party Python libraries, the file upload API is just weird, and the admin panel was buggy.

Also, another random oddity: somewhere in one of its built in libraries there was a function for validating email addresses. It was returning true on a bunch of very obviously bad address formats, so I looked up the source and found that all the function did was verify its argument wasn't the empty string.
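Paraphrasing the anecdote in code (the second validator is just a minimal sanity check for contrast; a real one should lean on a vetted library):

```python
import re

# What the library reportedly did, more or less:
def is_valid_email_gae(addr):
    return addr != ""

# Even a crude shape check catches the obvious garbage:
def is_valid_email_minimal(addr):
    return re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", addr) is not None

for addr in ["not-an-email", "user@example.com", ""]:
    print(addr, is_valid_email_gae(addr), is_valid_email_minimal(addr))
```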

I imagine GAE is better now, but my first experience with it left a bad taste in my mouth.


> Also, another random oddity: somewhere in one of its built in libraries there was a function for validating email addresses. It was returning true on a bunch of very obviously bad address formats, so I looked up the source and found that all the function did was verify its argument wasn't the empty string.

My favourite comment of the whole thread.



