Apologies if you find this to be in poor taste, but GCS directly supports the S3 XML API (including v4), and has easy-to-use multi-regional support at a fraction of what it would cost on AWS. I point my NAS box at home directly at GCS instead of S3 (sadly having to modify the little PHP client code to point it at storage.googleapis.com), and it works like a charm. Resumable uploads work differently between us, but honestly, since we let you do up to 5TB per object, I haven't needed to bother yet.
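If anyone wants to try the same endpoint swap without touching PHP, here's a minimal sketch using boto3 against the XML API, assuming you've generated interoperability (HMAC) keys; the bucket and key values are made up:

    import boto3

    # Point a standard S3 client at GCS's XML API endpoint. The credentials
    # are GCS interoperability (HMAC) keys, not AWS keys.
    client = boto3.client(
        "s3",
        endpoint_url="https://storage.googleapis.com",
        aws_access_key_id="GOOG...",        # hypothetical HMAC access key
        aws_secret_access_key="secret...",  # hypothetical HMAC secret
    )

    # Ordinary S3 calls now go to Google Cloud Storage.
    client.upload_file("backup.tar.gz", "my-nas-bucket", "backups/backup.tar.gz")
    print(client.list_objects_v2(Bucket="my-nas-bucket")["KeyCount"])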
Again, Disclosure: I work on Google Cloud (and we've had our own outages!).
Our production Cloud SQL instance started throwing errors: we could not write anything to the database. We have Gold support, so we quickly created a ticket. While there was a quick reply, it took a total of 21+ hours of downtime to get the issue fixed. During the downtime there is nothing you can do to speed this up; you're waiting helplessly. Because Cloud SQL is a hosted service, you cannot connect to a shell or access any filesystem data directly. There is nothing you can do other than wait for the Google engineers to resolve the problem.
When the Cloud SQL instance was up and running again, support confirmed that there is nothing you can do to prevent a filesystem crash; it "just happens". The workaround they offered is to have a failover set up, so it can take over in case of downtime. The worst part is that Google refused to offer credit, as according to their SLA this is not considered downtime. The SLA states: "with respect to Google Cloud SQL Second Generation: all connection requests to a Multi-zone Instance fail" - so as long as the SQL instance accepts incoming connections, there is no downtime. Your data can get lost, your database can be unusable, your whole system might be down: according to Google, this is not downtime.
TL;DR: make sure to check the SLA before moving critical stuff to Google Cloud.
Sure, you get downtime all the same, but not the part where you wait for support to resolve an instance crash.
We've had to use it and can confirm that it works as advertised.
It's not in bad taste, despite other comments saying otherwise. We need to recognize that competition is good, and Amazon isn't the answer to everything.
I think there is little GCP does better than AWS. Pricing is better on paper, but performance per buck seems to be on par. Stability is a lot worse on GCP, and I don't just mean service outages like this one (of which they've had their fair share) but also individual issues like instances slowing down or the network acting up randomly. There's also the lack of service offerings: no hosted PostgreSQL, functions never leaving alpha, no hosted Redis clusters, etc. Support is also too expensive compared to AWS.
Management interfaces are better on GCP, and the sustained-use discount is a big step up over AWS reservations. Otherwise, I think AWS works better.
Just last week I got an email saying that they'd discovered an issue on Google Cloud Datastore where certain (strongly consistent!) queries could have been returning incorrect results for a week-long period, and that I should check my logs to see if anything important had been affected in my application.
That's not the sort of behaviour that inspires confidence in a service.
Most notably, I know many people who run these types of sites, and outside of GAE being mediocre, I've never heard them complain about anything like that.
Other services are a different story - from my perspective Google are better at supporting legacy interfaces than most.
> We are writing to inform you that we are winding down sales and renewals of Google Site Search (GSS). Starting April 1st, 2017, new purchases and renewals of GSS will not be available.
Site Search seems like an infra offering to me.
Not an expert by any means, but I would put more weight on Google's ONE year promise than (to give an example) HPE's twenty-year promise. I know it is a cheap shot, because I am pretty sure HPE will be bought and sold at least once in the next twenty years.
We were users of the Google Mini Search appliance, went to a 3rd-party in-house search solution that we did not like, and then a year ago went to GSS. We are looking again for something suitable. The best part of Google Site Search was the search fidelity.
I.e., some douchebag who has no interest or stake in what you do has just dumped a potentially substantial amount of technical debt into your product backlog and, quite possibly, prioritised it all the way to the top.
As somebody else noted above: I don't need people creating more work for me. I can do that quite well enough on my own, thanks very much, and for side projects this kind of chopping and changing is a pain in the ass.
By definition, time for side projects is limited, so you absolutely have to focus on the most valuable activities to the exclusion of all else. For this reason, I only consider AWS and Azure for my projects: Google are just too fickle. Lucky you, if you have the time to deal with their nonsense.
(Btw, I'm not dissing Google on a technical level - they obviously do great, interesting work, and they're certainly one of the pioneers of PaaS. I just don't need the hassle of having to fix stuff because they keep killing APIs, projects, services.)
It might not, but doing it so much for other services destroys trust across the entire brand.
This whole idea of being angry at a vendor for deprecating something with 1yr notice is just ridiculous!
People need to realize they are choosing lock-in, and are choosing the risk of deprecation, every time they decide to use a cloud service with no drop-in competition/open-source alternative/etc.
Own your choices people, don't blame others...
The expectation of stability beyond a year is certainly not unreasonable when you're asking people to build their businesses/infrastructure on your platform.
And building redundancy across providers can be impractical, owing to the learning curve, cost duplication, higher outbound bandwidth costs, effort duplication, solution complexity, etc.
Then, about a year or two ago, humans actually started responding to and fixing problems. A welcome change!
I used to work on the Azure Portal team. As many negative things as I can say about Microsoft, they take making things just work for developers seriously, despite high prices and miscellaneous service issues.
The since-nixed compute container project I initially worked on really exemplified this.
I tend to use Colo or AWS when possible but I have a client that insisted on Google GCE and Endpoints.
I've spent so much time digging through source code, working around broken dev tooling, and dealing with incorrect or out-of-date documentation thanks to that requirement.
In my personal opinion, Google has a way to go in mature tooling. Silent failures, or worse, failures that don't result in build failures, are not acceptable. Requiring paid support contracts to resolve an issue in Google infra is not acceptable. Incredibly poor support for local dev environments is not acceptable.
After dealing with this stuff, I find it unlikely that I will ever rely on their systems in the future. AWS/Colo or, with reservations, Azure all the way.
And good luck getting accurate documentation.
I suspect though that most people affected deemed the risks and costs of failure low enough to be acceptable, and for many people it still is - even with this outage. But that's a conscious decision, rather than plain ignorance.
Twice the persistence means always having at least one backup, so the likelihood of downtime goes down, not up.
Sounds like it basically coincides with Diane Greene coming on board to run the show -- which is great news for all of us, with increased competition not just on the technical front but also on support (which is often the deal maker/breaker).
I was at a talk last year where she spoke, and as much as I love Google, it was one of the most boring talks I've ever heard in my life. So monotone and uninteresting... and I'm probably one of the biggest Google fans out there.
Look at Safra Catz's public speaking (Oracle). Terrible public speaker, terrific operator, though we may easily disagree with their business practices.
Managing stateful services is still difficult, but we are starting to see paths forward, and the community's velocity is remarkable.
K8s seems to be the wolf in sheep's clothing that will break AWS' virtual monopoly on IaaS.
We (gravitational.com) help companies go "multi-region" or on-prem using Kubernetes as a portable runtime.
Some interesting projects from this comment (https://news.ycombinator.com/item?id=13738916):
* Postgres automation for Kubernetes deployments: https://github.com/sorintlab/stolon
* Automation for operating an etcd cluster: https://github.com/coreos/etcd-operator
* Kubernetes-native deployment of Ceph: https://rook.io/
In addition to Rook, Minio is also working to build an S3 alternative on top of Kubernetes, and the CNCF Landscape is a good way of tracking projects in the space.
Disclosure: I'm the executive director of CNCF, which hosts Kubernetes, and co-author of the landscape.
Anyway, one needs an on-ramp to containers on Google Cloud. And one can't open source the one that one has, which despite being nearly mature enough to own a driver's license, wouldn't really fulfill the precise need that Kubernetes fills without some frontend work. So one writes Kubernetes. An almost entirely different fundamental architecture, by the way, so it's interesting for those who've seen both to compare.
In other words, you're not entirely off the mark even with the generalization.
I remember reading somewhere in the K8s documentation that it is designed such that nodes in a single cluster should be as close as possible, like in the same AZ.
It took me about 15 minutes to spin up the instances on Google Cloud that archive these objects and upload them to Google Storage. While we didn't have access to any of our existing objects on S3 during the outage, I was able to keep storing new objects as they came in (our workload is very, very write-heavy for these objects).
It turns out this cost-leveraging architecture works quite well as a disaster-recovery architecture.
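For the curious, the failover path is roughly the following; a minimal sketch using boto3 and google-cloud-storage with hypothetical bucket names, not our production code:

    import boto3
    from botocore.exceptions import ClientError, EndpointConnectionError
    from google.cloud import storage

    s3 = boto3.client("s3")
    gcs = storage.Client()

    def archive_object(key: str, data: bytes) -> str:
        """Write to the primary S3 bucket; fall back to GCS during an outage."""
        try:
            s3.put_object(Bucket="primary-archive", Key=key, Body=data)
            return "s3"
        except (ClientError, EndpointConnectionError):
            # S3 is unavailable; keep ingesting new objects on the GCS side.
            gcs.bucket("fallback-archive").blob(key).upload_from_string(data)
            return "gcs"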
Disclosure: I don't work for google but have an upcoming interview there.
Disclosure: I took a tour there one time and have used google.
EDIT: I realized that I was being mean, but why was that disclaimer relevant?
Also it could look suspicious if grandparent gets the job and at some point in the future someone looks back at this comment.
If in doubt, disclose. Especially in the tech industry, that's what Gamergate was actually about.
- transparency is always good
- adding a small disclosure to the bottom of a post is very low impact
- someone who is interviewing for a job at a company is likely to have a set of biases that influence what they say even if they think that they're being honest and objective.
I also want to personally thank Solomon (@boulos) for hooking me up with a Google Cloud NEXT conference pass. He is awesome!
I use CloudFlare. They handle generating an SSL certificate, can have a CNAME at the apex, do full-site static caching, 301 http => https redirects, etc.
Been trying to get one for IO (can't attend NEXT unfortunately)
There are a large number of people out there looking intently at ACD's "unlimited for $60/yr" and wondering what that really means.
I recently found https://redd.it/5s7q04 which links to https://i.imgur.com/kiI4kmp.png (small screenshot) showing a user hit 1PB (!!) on ACD (1 month ago). If I understand correctly, the (throwaway) data in question was slowly being uploaded as a capacity test. This has surprised a lot of people, and I've been seriously considering ACD as a result.
On the way to finding the above thread I also just discovered https://redd.it/5vdvnp, which details how Amazon doesn't publish transfer thresholds, their "please stop doing what you're doing" support emails are frighteningly vague, and how a user became unable to download their uploaded data because they didn't know what speed/time ratios to use. This sort of thing has happened heaps of times.
I also know a small group of Internet archivists that feed data to Archive.org. If I understand correctly, they snap up disk deals wherever they can find them, besides using LTO4 tapes, the disks attached to VPS instances, and a few ACD and GDrive accounts for interstitial storage and crawl processing, which everyone is afraid to push too hard so they don't break. One person mentioned that someone they knew hit a brick wall after exactly 100TB uploaded - ACD simply would not let this person upload any more. (I wonder if their upload speed made them hit this limit.) The archive group also let me know that ACD was better at storing lots of data, while GDrive was better at smaller amounts of data being shared a lot.
So, I'm curious. Bandwidth and storage are certainly finite resources, I'll readily acknowledge that. GDrive is obviously going to have data-vs-time transfer thresholds and upper storage limits. However, GSuite's $10/month "unlimited storage" is a very interesting alternative to ACD (even at twice the cost) if some awareness of the transfer thresholds was available. I'm very curious what insight you can provide here!
The ability to create share links for any file is also pretty cool.
- It supports Python 2.7 only. We need Python 3.4+ support.
- We can't increase CPU allocation without increasing RAM allocation, making them far more expensive than we need.
- Using psycopg2 on it is a PITA due to their handling of system dependencies.
- The system is entirely proprietary, making it impossible to run it locally for testing.
- CloudWatch sucks for finding errors in the functions and is atrociously expensive.
- API Gateway is an extremely crufty system, and used not to let you pass around binary data (this has changed).
- We can't disable/change the retry-on-error policy.
We have a pretty hard tie-in to S3 and Redshift, but when GCF can do better on a majority of these points, we'll begin moving to it. But yes, Python 3 at a minimum would be a requirement.
I assume that you are referring to emulating the triggering of Lambdas behind API Gateway...? I've found a project that sets up a Node environment to do this. Very handy for JS/Lambda development. A Google search suggests similar options may exist for Python.
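For Python specifically, the zero-dependency version is just to hand-build an API Gateway proxy event and call the handler directly; a minimal sketch (event fields trimmed to the ones a typical handler reads):

    import json

    def handler(event, context):
        # Example Lambda handler behind API Gateway's proxy integration.
        name = (event.get("queryStringParameters") or {}).get("name", "world")
        return {"statusCode": 200, "body": json.dumps({"hello": name})}

    if __name__ == "__main__":
        # Simulate the proxy event locally -- no AWS round trip needed.
        fake_event = {
            "httpMethod": "GET",
            "path": "/greet",
            "queryStringParameters": {"name": "dev"},
            "headers": {},
            "body": None,
        }
        print(handler(fake_event, context=None))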
I had a lot of "chicken and egg"-type questions about using it, and seeing that critical step of bootstrapping the whole thing via the API Gateway was really informative.
In support of my flippant remark, I see three indicators that hold parallels to Betamax, with detail to follow. I qualify that this is largely informed by my own anecdotal experience, specifically by objections and responses that I've received/observed while my peers and I have proposed or implemented cloud adoption at various companies.
1. Market share.
2. Proprietary tech stack.
3. Technical superiority syndrome.
1. Currently AWS has a major lead, then Azure, then Google. The implication is that market share translates to mindshare, which in turn yields blog articles, OSS libraries/tools, etc. This becomes a virtuous cycle.
For .NET shops, that market share will tend to favour Azure, on the premise that MS knows best.
2. Some of Google's technology stack has a learning curve that is unique to Google. Take GAE as an example and compare to AWS's nearest equivalent Beanstalk (or Heroku). Beanstalk requires few if any changes to an existing application whereas GAE requires that you do it the App Engine way. It might provide a number of benefits, but it's invasive. Containers are shifting the requirement, however not everyone is in a position or has the desire to start with containers on day 1.
Further, Google Cloud's project-oriented approach, while not a bad organisation mechanism, detracts from learning. If you assume the premise that exploration is part of learning, it forces the user to hold two items in their head: their objective and Google Cloud's imposed objective.
AWS on the other hand generally provides defaults that allow you to launch resources almost immediately after sign-up. Google's approach is better for long-term support, maintenance and organisation but the user needs to have the maturity to understand that benefit.
3. It may be technically superior, but that statement in and of itself is divisive and can scare some away. It is not enough to simply be technically superior, and from my observation the statements tend to originate from (ex-)Googlers.
A number of people will latch onto feature set (for Betamax, the number of films available was a factor). The absence of features will often discount a choice out of the gate (even if those features are irrelevant). As an example:
- Regional coverage:
  AWS: 15 regions / ~38 zones
  Azure: 36 regions/zones
  Google: 6 regions / 18 zones
- partially/fully managed services: AWS is continually growing these, at a level that seems to outpace competitors.
- Outwardly, Google appears to tackle the "hard problems" with technically superior solutions (e.g. TensorFlow, BigQuery) but often appears to neglect the "boring" problems a number of companies want solved as well (e.g. cloud VDIs, Snowball, etc.).
- Some areas seem to be ossified due to tight coupling (e.g. servlet 3.0 and python support in GAE).
There is no silver bullet solution. Every provider will have an outage at some point and this could be a big reason that GCE won't be knocked out of the game. I also think Google is working really hard to build community and mindshare. I don't have a crystal ball so only time will tell what happens but technical superiority has rarely been the sole reason that drives adoption.
The S3 keys it produces are tied to your developer account. This means that if someone gets the keys from your NAS, they will have access to all the Cloud Storage buckets you have access to (e.g. your employer's).
I use Google Cloud but not Amazon. Once I wanted an S3 bucket to try with NextCloud (then OwnCloud). I was really frightened to generate an S3 key with my Google developer account.
As another option, you can continue using the XML API and switch out only the auth piece to Google's OAuth system while changing nothing else.
There's a lot more detail available at: https://cloud.google.com/storage/docs/migrating
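Roughly, the auth swap looks like this; a minimal sketch in Python, assuming google-auth with application-default credentials and a hypothetical bucket/object (the XML API request itself is unchanged, only the Authorization header differs):

    import google.auth
    import google.auth.transport.requests
    import requests

    # Fetch an OAuth2 access token from application-default credentials.
    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/devstorage.read_write"]
    )
    credentials.refresh(google.auth.transport.requests.Request())

    # Same XML API endpoint and request shape as before; only auth changes.
    resp = requests.get(
        "https://storage.googleapis.com/my-bucket/some-object",  # hypothetical
        headers={"Authorization": "Bearer " + credentials.token},
    )
    print(resp.status_code)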
Disclaimer: I work on Google Cloud Storage.
I like GCS (and the gsutil tool), but occasionally an S3-style bucket is needed. For example, you need an S3 bucket or a WebDAV server in order to send alerts with images from Grafana to Slack. A minor issue, but nice to have if possible without having to deal with Amazon's control panel.
To be honest, I do find the GCS permissions a bit complex. You have IAM, you have ACLs, and you have S3 keys. Everything is set in a different place, and ACLs aren't fully represented in the developer console. S3 keys give full access to everything, IAM service accounts give access per project, and ACLs are fine-grained (per bucket/object). On the other hand, IIRC, IAM has a write-only setting while ACLs do not, so I can have an account that can write (but not read) across all the buckets of my project, but no ACL that does the same (not that useful).
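To make the three layers concrete, a sketch with the google-cloud-storage Python client, using hypothetical bucket and account names; IAM roles apply per project or bucket, ACLs per bucket/object, and the S3-style HMAC keys are managed somewhere else again:

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("my-bucket")  # hypothetical bucket name

    # IAM: grant a service account write-only (objectCreator) access.
    policy = bucket.get_iam_policy()
    policy["roles/storage.objectCreator"].add(
        "serviceAccount:writer@my-project.iam.gserviceaccount.com"
    )
    bucket.set_iam_policy(policy)

    # ACLs: fine-grained, per bucket/object, set in a different place.
    bucket.acl.user("someone@example.com").grant_read()
    bucket.acl.save()

    # S3-style HMAC keys: created separately, under the console's
    # "Interoperability" settings; they aren't managed via this client.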
Kicked the tires; not impressed at all. Notes went missing from the interface, and I could only get them back after manually digging through folders via FTP.
Your egress prices are quite a bit higher compared to CloudFront for sub-10TB ($0.12/GB vs $0.085/GB).
Weighing the track record of S3 outages against the time you're up and serving egress, it seems like S3 wins on cost. If all you're worried about is cross-region data storage, you're probably a big player and have an AWS enterprise agreement in place which offsets the cost of storage.
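Back-of-the-envelope with those figures, for a hypothetical 10 TB/month of egress:

    gb = 10 * 1000              # 10 TB/month of egress, in GB
    gcs = gb * 0.12             # $1,200/month at $0.12/GB
    cloudfront = gb * 0.085     # $850/month at $0.085/GB
    print(gcs - cloudfront)     # -> 350.0, i.e. ~$350/month difference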
As to our network pricing, we have a drastically different backbone (we feel it's superior, so we charge more). But since you mention CloudFront, the right comparison is probably Google Cloud CDN (https://cloud.google.com/cdn/), which has lower pricing than "raw egress".
Not only is webpagetest.org a Google product, but it's also much better suited to the minute-by-minute billing cycle of Google Cloud Compute. For any team not needing to run hundreds of tests an hour, the cost difference between running a WPT private instance on EC2 versus on Google Cloud Compute could easily be in the thousands of dollars.
Just saying, it gets you a foot in the door.
If you are API-compatible with S3, could you make it easy/possible to work with Google storage inside Spark? Remember, I may or may not run my Spark on Dataproc.
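Something like the following is what I'm imagining; a rough sketch with hypothetical bucket names and HMAC keys. On Dataproc I gather the GCS connector comes preinstalled, so gs:// paths work out of the box; elsewhere, pointing s3a at your XML endpoint seems plausible but is untested on my end:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("gcs-from-spark").getOrCreate()

    # On Dataproc the GCS connector ships preinstalled, so this just works:
    df = spark.read.json("gs://my-bucket/events/*.json")  # hypothetical path

    # Off Dataproc, one option would be pointing s3a at the XML API endpoint
    # with GCS interoperability (HMAC) keys (requires hadoop-aws on the classpath):
    hconf = spark.sparkContext._jsc.hadoopConfiguration()
    hconf.set("fs.s3a.endpoint", "storage.googleapis.com")
    hconf.set("fs.s3a.access.key", "GOOG...")    # hypothetical HMAC access key
    hconf.set("fs.s3a.secret.key", "secret...")  # hypothetical HMAC secret
    df2 = spark.read.json("s3a://my-bucket/events/*.json")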