"Cloud Run is a managed compute platform that enables you to run stateless containers that are invocable via HTTP requests. Cloud Run is serverless: it abstracts away all infrastructure management, so you can focus on what matters most — building great applications. It is built from Knative, letting you choose to run your containers either fully managed with Cloud Run, or in your Google Kubernetes Engine cluster with Cloud Run on GKE.".
So, like lambda but with containers. Pretty much what I have wanted since I first started learning about Docker.
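For concreteness, here's a minimal sketch (my reading of the contract described above, not code from the announcement) of the kind of stateless HTTP server such a container would hold. Cloud Run tells the container which port to serve on via the PORT environment variable:

    // server.ts - a minimal stateless HTTP server for a container.
    import * as http from 'http';

    const port = Number(process.env.PORT) || 8080;

    http.createServer((req, res) => {
      // Treat the container as stateless: nothing in memory is
      // guaranteed to survive between requests or instances.
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('Hello from a stateless container\n');
    }).listen(port);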
First off, thanks for the overview text. I really hate how many tech people just assume ‘you’ know what they mean because ‘they’ know what they mean. It’s like they’re trying to accentuate the autistic stereotypes!
>Pretty much what I have wanted since I first started learning about Docker.
Why? I have a use for lambdas in an application I'm working on. But I don't really get why you want stateless containers that spool up and down on demand. Is it for things like "I need this entire existing program to run and can't refactor it to a lambda itself"? I only need to do some API processing and data work, so I'm not entirely sure what stateless containers are for. Might be a dumb question!
As someone who works in Data Engineering, I'm a big fan of cloud-based Docker containers with services like AWS Fargate/Google Cloud Run. It allows us to pay for resource use rather than provisioning. For the longest time I had to justify the additional cost and constant maintenance of an n-th server just to accommodate capacity at peak hours. Anecdotally, most of our batch jobs don't need high availability, so the startup time of a serverless Docker container is an acceptable trade-off for the parallelism.
> It allows us to pay for resource use rather than provisioning.
Not arguing against the choice, but at some level you are paying for the provisioning costs of the cloud provider. The higher initial costs of provisioning are what you may be escaping, paying instead on an ongoing basis (as if you're leasing equipment and services).
Once your needs scale significantly, cloud providers would end up being quite expensive if your load is not highly variable.
It's compatible with Knative, which is easy to set up in GKE, so you can migrate loads over to that if you want more control over your compute costs. And of course, you can run Knative outside of GKE, too.
Disclaimer: I work on GCP but have only touched Cloud Run/Knative a little bit.
You know you can also boot instances that bill at hourly (and finer) granularity, right? Billing is probably the least compelling argument for a service like this.
With Fargate you have containers running all the time, and you pay for that. From what I've seen of Cloud Run, you just pay when you're actually serving requests.
With Cloud Run you're only paying for resources used, and with Cloud Run on GKE you're paying for the underlying cluster [1]. This also gives you the flexibility to optimize costs by shifting workloads between the two if desired.
Hi, I'm a product manager on Cloud Run.
Thanks for your enthusiasm and feedback. We are very excited to share this new product with you. The reason we remain silent at the moment is that Cloud Run will officially be announced at 9am PST tomorrow during Google Cloud Next 2019. We will publish a blog post and other material that should answer many of your questions. We look forward to hearing what you think about it and what you'll build with Cloud Run.
I've used GCP in the past, including decisions on which cloud providers to use where we spent north of 1M USD/month.
Honestly Google's efforts are best focused on support issues like this and customer service rather than features to compete with AWS at the moment. A lot about GCP's setup is simpler and their network/hardware is well known to be better per dollar spent.
However, I've always found the ability to contact an AWS rep and work through a tough situation on billing/quotas to be much more convenient.
To be clear, this person seems to have reached Support (I'll know more if they reach out to me) but is probably mid-way through the "Are you sure you didn't do this?" or something particular to debit cards.
Enterprises don't have the same experience, because you don't use a credit/debit card to pay for $1M/month in anything :). As an Enterprise, you also have dedicated folks in sales and professional services working with you, and so on.
I don't disagree that Support and customer empathy are huge factors of what goes into picking a provider. We need to improve, likely more than other providers. We all hate knowing that if you happen to know someone at Google, you get better help.
But, tl;dr: GCP Support != Google consumer "support" and individuals on credit/debit cards != Enterprises.
Hey boulos, upvoted your response. Thanks for taking action on this. Agreed that GCP != Google consumer support.
I didn't mean for my response to be perceived as negative for GCP, was only hoping to inspire those at GCP to continue to focus on the customer experience rather than features. AWS certainly needs the competition and I've also had great experiences using Google's cloud platform.
Message received! I just wanted to highlight that it's easy to extrapolate from individual accounts, but that can trick you. That doesn't make the current situation okay, and we get that.
You're correct--but most enterprises start out with credit card accounts when some line manager greenlights an experiment. "Time to go drop $50 million on a cloud provider this year" doesn't happen overnight, and it doesn't happen in a vacuum. Experiences like this with early-stage accounts cause significant damage to the sales process--and you'll never see it coming until after the decision has been made.
Will Google guarantee a minimum number of years to support this platform? I would hate to dedicate time and resources to this (sounds pretty useful) and then have Google shut this down in the near future.
Hi. I manage App Engine (the original "serverless" product at Google).
This is a very understandable concern, given the importance of having a platform on which you can rely.
Contractually, Google Cloud provides a 1-year notice before discontinuing (or making backwards-incompatible changes to) products. This is for generally available (GA) products. Cloud Run is in beta, so technically it could be decided not to bring it to GA. This is why some conservative orgs tend to wait for products to be GA before adopting them.
From a technical perspective, Cloud Run was designed to be highly portable and idiomatic. If the service were discontinued (or you just didn't like it), you should be able to take your container image, and run it anywhere else. Odds are you would be using some other Google Cloud Services, so you would likely want to run in an environment with low network latency to Google Cloud (Compute Engine and Kubernetes Engine being obvious candidates).
From a historical perspective, I'd say that Google Cloud goes above and beyond in supporting older products. App Engine is about to hit its 11th anniversary. We are still running PHP 5.5 apps and backporting security patches to the runtime, despite the language losing community support 3 years ago. We are still turning down an old product called "Managed Virtual Machines", which has now been in a deprecated (but running) state for longer than it was GA!
From an emotional perspective, I think that Google is eyed with a lot of suspicion for turning off products. Google Reader - enough said. But as someone on the thread pointed out, Google Cloud is a very different business from the rest of Google. Google (!cloud) is a consumer company at a scale where a product only matters when it hits a billion users. Google Cloud is an enterprise company. Scale still matters, but not in the same way it does in consumer.
I can't wait for Hacker News folks to try Cloud Run. It's an awesome product.
We usually extend the deprecation timeline if a product is important or is hard to migrate.
The Master/Slave Datastore deprecation took 3 years from the announcement of deprecation.
Python 2.5 took 4 years from the deprecation announcement to being fully deprecated.
Java 6 was officially deprecated in July 2017, but if you still have an app deployed on Java 6, chances are it can still serve traffic just fine. The same applies to Java 7 (this is partially due to JVM backward compatibility, but there is non-trivial engineering work involved).
I hope this gives you some confidence in Google's cloud offering.
> Cloud Run is in beta, so technically it could be decided not to bring it to GA. This is why some conservative orgs tend to wait for products to be GA before adopting them.
Ah that's one reason not to use GCP betas. Another big one is the complete lack of any public uptime target. In my opinion, this makes the betas nearly as bad as alphas with respect to using them in production.
That's precisely the point: don't use Betas in production, unless you're okay with that. Do you have a suggestion on wording for the help text to reiterate that more clearly?
The background here is that a Beta product is still in flux. In particular, it might not be GA yet because it hasn't yet met its internal SLO for enough time, proving that it can consistently meet the SLO for its SLA.
While we could let products ship randomly, since SLAs "just" mean we pay you if we don't meet them, we choose not to. Customers expect that if a product says "this is our SLO/SLA" that we intend to hit that.
We hear you though; we don't like super long Beta durations any more than you do. Sometimes we've reached Beta and then realized we hadn't met the quality bar we wanted.
Really depends on your capacity for risk. If you're a small startup team, getting the benefit of automatic scaling with extremely little management overhead (especially compared to something like Kubernetes) could be worth the lack of explicit uptime SLAs.
It is an important question, but Google services which are largely consumer-based and no cost are a very different world from Google Cloud. People get a lot of Internet Cool Guy points these days from falsely conflating the two.
I'm not the one to make commitments, but our history has been solid, and our deprecation policy is embedded in our terms of service for every Cloud user.
The only product I can remember us deprecating in GCP was Prediction API in favor of the much-preferred Cloud Machine Learning Engine, and with that came communication to every single affected admin and a _year_ before cut-off.
Uh, just last year Google increased the price of developers using the maps API by an order of magnitude with no grandfathering policy, which for many users was effectively shutting the API off completely.
It's not just the consumer side; Google doesn't exactly have a spotless reputation for developers, and assuming the post must be somebody faking being concerned for "Internet Cool Guy points" is just tone deaf.
If nothing else, what do you expect Google to say? "No, we'll definitely be dropping this"?
The services that Google dropped were either big bets that ended up as massive failures in the marketplace (e.g. wave, google+) or services that were neither big nor strategic to google (e.g. google reader, google code).
GCP is neither.
If you're not convinced by the fact that Google has been pouring billions into Google Cloud for over 10 years now, organizes multi-day conferences promoting it, launches new features monthly, then no rational argument will convince you that they won't drop GCP because they also dropped google reader.
Google Cloud has had zero dead products in a decade of existence. Just because Google phased out a couple of browser extensions and an RSS feed reader doesn't mean they have a "horrendous track record" when it comes to real enterprise-tier services such as Cloud.
It's not the dead products I fear in my case but the lockdown of a Gmail account; there are already 2 Gmail accounts I can't access despite knowing the password. I would not use GCP just because of that: everything might work fine, and then one day you're locked out of your account and can't do anything.
Just had this with an account I set up for my parents. I have the recovery email AND know when the account was created - not good enough. My parents use a landline that can't receive a text, and apparently that's enough to deny them (Google won't do a voice call). Would pay $50 to unlock. Only option is to set up a new account. Ugh. The same email is used for their Apple ID, and they like their Apple stuff.
I'm in the exact same situation: I know everything about those accounts, password & recovery email, but that's not good enough. Why would I trust Google with GCP when I have personal experience that those accounts can and will be locked down without a valid reason? That does not seem like a good business decision.
Does anyone have a head-to-head comparison of Google vs Amazon vs Apple cancellations? I guess the Facebook equivalent would be what of our personal information they've newly managed to monetize.
Does Cloud Run have accurate NTP time across edge servers and through Cloud Run? I know Google runs their own version which smears time around leap seconds, but I'm wondering if there's any effort to make sure that these edge servers are properly synchronized. That way things such as sub-second video stream synchronization can be done across the globe.
Thanks very much! Hope to find out more tomorrow, but just wondering if there is any equivalent on Cloud Run to the "cold start" issue you have with Cloud Functions. If not, could you point to any documentation that explains the lifecycle of how containers are loaded, scaled up, and scaled down in response to requests?
Quick question out of curiosity: we have a Google Cloud org of three members at work. One member had access to Cloud Run some days ago; he stumbled upon it incidentally in the sidebar. All other members couldn't access Cloud Run. So, how come? Is early access account-based, not org-based?
(I'm a product manager on the serverless team at Google Cloud)
The early access was based on a user account being added to an access list. But, I'm not sure why your team member had access without explicitly asking for it.
Hi thanks for the reply. The only thing he could think of was that he skipped through a couple of videos on Google's official YouTube developer channel that day. But likely coincidental. (On a sidenote, when my coworker had access, autoscaling was broken, like completely, hence in part my curiosity).
Please tell your salespeople to point out to potential customers that since Fargate billing is per vCPU-hour, costs on Cloud Run are "up to 36,000 times cheaper!"
There are two reasons why our company will not use Cloud Run, even though we would love to; the concept is just what we are looking for, for our containers:
1) We are on AWS, and expanding to Google is too expensive. Support costs for three engineers to be able to contact Google are $750/month for production workloads. At AWS we pay $200/month, as our total spend on infra is about $2000/month. We are a startup with a tight budget, hence the increased spend is too tough to swallow even though we would love to use Cloud Run.
2) We have an Android app. Having seen how other companies have been hell-banned and shut down for having invited devs and others with flagged accounts, we just don't dare use Google Cloud while having an Android app, in case we end up on the automatic hell-ban list at Google. Too risky.
Is there a reason you can't/wouldn't want to use Knative? Or what if you could run something like "Cloud Run on GKE on AWS" (where your GKE cluster is managed, i.e. kept updated, by Google but running on AWS)?
Sure, we could, but I suppose one of the big draws here is that it is managed and we don't have to deal with running it ourselves. GKE on AWS doesn't solve my concerns (1) and (2).
Am I reading pricing right, that it is $0.40/million requests beyond free quota versus $3.50/million (first 333 million) on AWS API Gateway? One thing that always killed AWS Lambda/APIG for our use cases was the API Gateway request pricing. This seems much better.
I love that you can just run Docker containers. My one wish is the option to run on multiple regions at once with Anycast or similar.
I haven't tested yet to see what extra latency or cold start times this may have. That is the other thing that bothers me about Lambda.
> that it is $0.40/million requests beyond free quota versus $3.50/million (first 333 million) on AWS API Gateway
Comparing Run to API Gateway is a bit apples to oranges, as one is a managed compute product and the other is an API management product.
Run (and GCF, and GAE) automatically provision an HTTP endpoint for you so you aren't required to use an API management product like Cloud Endpoints. However, they don't provide API key validation, rate limiting/quota, schema validation, etc. which is what you're paying for (even if you're not using them) with API Gateway.
A more apples to apples comparison would be something like:
- Cloud Run ($0.40/M) + Cloud Endpoints ($3.00/M) = $3.40/M
- Lambda ($0.20/M) + API Gateway ($3.50/M) = $3.70/M
> My one wish is the option to run on multiple regions at once with Anycast or similar.
Thank you for your response, that definitely helps me understand the differences.
In our particular higher-volume use cases, we handle the rate limiting/key validation ourselves, so I like that Run automatically provisions an HTTP endpoint.
Right now, I run ad hoc containers/bots cheaply on GCP by using Cloud Scheduler to turn on a preemptible VM, which invokes a Docker container, runs a function, and immediately shuts itself down. (technical details here: https://minimaxir.com/2018/11/cheap-cron/)
Cloud Run seems to obsolete that workflow (and it fits well within the free tier), and I couldn't be happier. Will watch the keynote tomorrow for more!
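For anyone curious, the glue for that workflow is small. Here's a hedged sketch (project/zone/instance names are placeholders; the linked post has the actual details) of a scheduled function that starts the preemptible VM via the Compute API, with the VM's startup script doing the container work and shutting the instance down afterwards:

    // Sketch of the "cheap cron" pattern: start a (placeholder)
    // preemptible VM on a schedule; its startup script runs the
    // container and powers the instance off when finished.
    import { google } from 'googleapis';

    export async function startBotVm(): Promise<void> {
      const auth = await google.auth.getClient({
        scopes: ['https://www.googleapis.com/auth/cloud-platform'],
      });
      const compute = google.compute({ version: 'v1', auth });
      await compute.instances.start({
        project: 'my-project',   // placeholder
        zone: 'us-central1-a',   // placeholder
        instance: 'bot-vm',      // placeholder
      });
    }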
I think this has the potential to be huge. Finally a FaaS platform without vendor lock-in, supported by a major cloud vendor, and uses regular containers. Basically removes all of the major drawbacks to move to serverless (for applications that suit the FaaS model).
Since the technology is based on Knative, it should in theory be possible to run these on a local k8s instance. It would support some debugging use cases.
I do agree as far as cold starts and runaway expenses go, although those issues seem inherent to FaaS.
The biggest issue isn’t being able to run locally, it’s getting access to systems where an unexpected error is happening in production that didn’t happen in dev.
I agree, and looking through some of the doc it looks very easy to set up, but note AWS has Fargate, which is "a FaaS platform without vendor lock-in, supported by a major cloud vendor, and uses regular containers."
Potential to be huge for sure, but with the way Google sunsets projects on the regs, I'm not sure I fully trust it as a robust solution long term. Granted with container-based-deployments, it should be trivial to switch between services which would mitigate that. It's certainly interesting, but I may wait a year or so before I trust it for prod.
Took a test run using gcr.io/cloudrun/hello-image with 128MB memory and 80 concurrency: cold response about 450ms, warm 2-5ms (as reported by logging). Scaling seems to work; got about 200 requests/s with a quick ab test. Domain name mapping + HTTPS is a nice addition.
Price point: using the 128MB + 100ms "defaults" from other serverless platforms pushes the price over $3/million requests. With concurrent requests in flight, the price per request goes under $1/million. And don't forget network prices: hello-image returns a bit over 4kB, so that means $0.46/million requests.
Biggest concern is overall latency: from the EU I got 1020-1300ms total latencies, while tracerouting the address gives a 60ms "ping" latency. And sometimes the total latency is 220-250ms (in less than 10% of requests). This really needs some inspecting. Otherwise a pretty nice service; I have been waiting for something like this :)
So is this basically App Engine Flexible Environment with a quirk that it pauses (freezes for later thawing) your container (like GCF or AWS lambda) when not handling a request?
Would be good to see more info like concurrency limits (80 mentioned in the quota is super low...) and cold start times. Things that affect how useful or not useful this is for handling large bursts.
Likely some cascading logic to suspend to NVDIMMs, then local storage, then spin down entirely if no requests are made for a certain period (with some variable based on instantiation time, I'm sure). Mix in machine-learning preemptive loading for observed request patterns and it's pretty manageable.
In order to do that, and to save costly resources (memory and CPU), you need the workload to be interruptible, vs. the flexible environment, which is persistent. In a sense this is closer to Lambda, but supporting native containers, if I read it correctly.
Where do you see that? I'm curious to read more about what happens when the service is "idle". What does freezing mean? What's the latency to resume an idle service?
Yes, but initiated (and automatically suspended) based on a web request. Azure's runs (and is billed) from container pull start until terminated (by you).
Aah, thanks for the clarification :) the top voted comments on this article now indicate that as well, so hopefully no one else will have the same confusion.
How much CPU is allocated to a Cloud Run (not-on-GKE) instance while a request is active? I see that memory is configurable (up to 2G), but no word about CPU...
I have a compute-heavy and bursty workload that I'd _love_ to put on Cloud Run, but it's important to know a ballpark for the CPU I'll get to spend on my requests.
Second question: any plans to more officially support "background" workloads that consume off e.g. Pub/Sub and might be able to use cheaper preemptible compute? I guess I'm probably already able to point a Pub/Sub push queue at a Cloud Run endpoint, but having the option of cheaper (autoscale-to-zero) compute for my not-latency-sensitive work would be awesome.
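On the Pub/Sub part of that question: push subscriptions can already target any HTTPS endpoint, so a minimal sketch of a Cloud Run handler for the standard push envelope (base64-encoded payload in message.data) might look like this:

    // Sketch of a Cloud Run endpoint receiving Pub/Sub push messages.
    import express from 'express';

    const app = express();
    app.use(express.json());

    app.post('/pubsub', (req, res) => {
      const message = req.body?.message;
      const payload = message?.data
        ? Buffer.from(message.data, 'base64').toString()
        : '';
      console.log('received:', payload);
      res.status(204).send(); // any 2xx acknowledges the message
    });

    app.listen(Number(process.env.PORT) || 8080);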
I think the tech that allows Google to measure the use of its infrastructure in the way they're doing here might be a part of the strategy for Stadia (stadia.google.com). Pricing info was conspicuously missing from Google's Stadia introduction, and I've been wondering if Google's use-based cloud billing model could work if it were just passed on to the user - bill the user (or player) to the minute or even to the second.
Deals could be structured so publishers get a cut of fees that are tied to how long players want to play the game. Billing individual seconds of gameplay would be a radical shift in the incentives that drive how games are designed and distributed.
Not to mention how you play games and pay for them. Perhaps there would be a small base monthly fee but also, perhaps not.
High-performing flagship titles could bill players at 3-4 cents a minute, while less popular or less computationally demanding games could be a significant fraction of that. And streamers that are successful enough will see Google paying them to play.
This certainly feeds Google, but it also ties publisher revenue to the playtime their titles get. I'm sure there are pros and cons to that, but it's interesting. So is the potential opportunity for players to have a chance at a piece of the action too if streaming is monetizable.
All this, by effectively _removing_ a layer of pricing abstraction. Of course, Stadia has to work first. But even if it's not from Google I think something like this is coming.
I'm getting ready to deploy a Cloud Functions service. I wonder what I'd gain by making it a Cloud Run service instead? I imagine cold starts for Cloud Functions are highly optimized (since they're runtime-specific).
I don't fully understand how cold starts would manifest in Cloud Run, but in any case they should be MUCH less of an issue than in Cloud Functions.
Cloud Run has a default (and max) concurrency threshold of 80 requests per container. Thus, for example, if you are running Node, you could conceivably be handling 80 concurrent requests before needing to start another container.
Cloud Functions have no concurrent requests (each instance can only handle 1 request at a time) so every concurrent request results in a cold start if additional instances are needed to scale up.
Note, though, that this means you can't put per-request state in the global scope in Cloud Run, but this is safe to do in Cloud Functions because you know there that each instance is only handling one request at a time.
I'm not sure this is actually the case. Cloud Functions instances might be handling one request at a time, but they can also share a global scope, as you can see in their Tips & Tricks document[1]
No, what I meant was that it is safe to keep request-scoped data in global scope without needing to worry that a concurrently running request will interfere with the state there. Normally in a Node server you cannot do this.
Cloud Functions will reuse instances for subsequent invocations (that's what "being warm" means), but separate instances do not share global scope.
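To illustrate the hazard being described, here's a minimal sketch (hypothetical route and variable names) of why request-scoped data in module scope is unsafe once a single container handles up to 80 concurrent requests:

    // UNSAFE under concurrency: module scope is shared by every
    // in-flight request handled by this container instance.
    import express from 'express';

    const app = express();
    let currentUser: string | undefined; // shared "global" state

    app.get('/greet', async (req, res) => {
      currentUser = String(req.query.user ?? 'anonymous');
      // While awaiting, another request may overwrite currentUser.
      await new Promise((r) => setTimeout(r, 100));
      res.send(`hello ${currentUser}`); // may answer with the wrong user
    });

    app.listen(Number(process.env.PORT) || 8080);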
You gain complete flexibility over the language, environment, and running time, whereas Cloud Functions have a fixed set of runtimes and hard execution timeouts. When it's "I just want to run my container", this seems a lot more appealing than K8s, and arguably requires less cognitive overhead than fitting your app into the Function-as-a-Service paradigm. If you already have Cloud Functions working, then run with that!
When Fargate launched it had a very high premium on pricing over EC2 (like 3x!), which has really come down, but I still wonder how Cloud Run compares. I'm on my phone and too lazy to do the math, but my experience with other GCP offerings has been that compute and RAM generally cost similar across services and abstractions.
Okay fine, I'll run the numbers... There is indeed a substantial price premium for this platform:
1 vCPU /hr on Cloud Run clocks in at $0.0864/hr compared to $0.0331/hr on VMs with custom CPU or $0.0316/hr for predefined.
1 GB /hr on Cloud Run is $0.009 compared to $0.00094 for Custom RAM or $0.00089 for predefined.
I assume you can also forget about sustained-use discounts.
I would imagine the costs of resources to deliver hot-start performance is significant, but this makes the case for the service a lot weaker, especially for long-running jobs that could just be tossed on a preemptible VM for dramatically cheaper rates.
When you compare Cloud Run vCPU pricing with preemptible VMs, it becomes nearly 13x the price.
Most use cases could trade startup delay for reduced costs by running GKE with preemptible nodes and Cloud Run on top of it. Otherwise, "normal" Cloud Run and Cloud Run on GKE could be combined into a system where, on the first request, an instance on serverless Cloud Run is started and simultaneously a GKE Cloud Run instance of the same container is ramped up to take over.
Thanks, Google Cloud team, this looks really great. Excited to hear more on the Next livestream. Questions: Is Cloud Run a zonal or regional resource, and is there any way to put it behind the global load balancer? Is HTTP/2 (gRPC) support planned? Thanks.
Does max RAM usage need to be specified upfront, or is what's billed the live-allocated amount? This is important for scenarios like running headless Chrome, where RAM usage varies a lot between invocations.
Has anyone measured cold start times? I know that the Istio/Envoy combination can add 1-2 seconds on init time, plus the pulling of the container image to whichever node/region is decided by the scheduler.
And GCM was literally just renamed FCM. It's been years since I worked with the libraries, but IIRC GCM v3 registered tokens worked with FCM. I don't recall all the specifics of C2DM other than that it was limited enough that we built a whole push network at Parse instead.
I use Google Cloud Scheduler, https://cloud.google.com/scheduler/ , to invoke Cloud Functions (actually Firebase Functions which are really Cloud Functions under the hood).
You set up a Firebase function to run on receipt of a message to a pub/sub topic, and then in Cloud Scheduler you schedule a post to that topic.
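A minimal sketch of the function side of that setup (the topic name is a placeholder), using the firebase-functions Pub/Sub trigger:

    // Fires whenever Cloud Scheduler publishes to the (placeholder)
    // "nightly-job" topic.
    import * as functions from 'firebase-functions';

    export const nightlyJob = functions.pubsub
      .topic('nightly-job')
      .onPublish(async (message) => {
        // message.data is the base64-encoded payload from the scheduler.
        const payload = Buffer.from(message.data, 'base64').toString();
        console.log('scheduled run:', payload);
      });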
Scheduling is easy on many levels. One could set up a CI/CD job to do that (which is usually free with many cloud vendors).
The business value though is in data-based events. Data is inserted/updated/deleted in databases (SQL and NoSQL), event streaming, queues, identity management systems, and so on.
Until there is a way for Cloud Run to easily plug its 'functions' into those sorts of events, the solution is not ready for real-world use cases -- unless of course one is interested in building a whole distributed system architecture around webhooks.
I like that idea: say, after N inserts to a database/BigQuery or N requests to an existing service, after a file is posted to storage, or once a VM metric hits a certain threshold, kick off a cloud function. Great for mini-batching ETLs, log cleanup/parsing, upload processing, cache invalidation... So many use cases.
That flexibility would be great. Hope this is on the radar! Cloud Run triggers. Being able to pass the event metadata in would be great too.
Doesn't seem like that needs to be a long term hard contract of the system. It sounds like it only supports HTTP/1.1 right now, but I can see HTTP2 support becoming a thing and that would then enable GRPC.
Hooks are a popular approach for implementing notification callbacks for long running requests over HTTP.
Bidirectional streaming is irrelevant here; the question is what's so limiting about using HTTP for this use case (i.e., by far the most popular protocol for invoking remote service requests)?
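For reference, a rough sketch of the webhook pattern mentioned above over plain HTTP (field names are illustrative; assumes Node 18+ for the global fetch):

    // Accept a long-running job, reply 202 immediately, then POST the
    // result to the caller-supplied callback URL when finished.
    import express from 'express';

    const app = express();
    app.use(express.json());

    app.post('/jobs', (req, res) => {
      const { callbackUrl, input } = req.body;
      res.status(202).send({ status: 'accepted' });
      // Do the work out of band, then notify the caller.
      runJob(input).then((result) =>
        fetch(callbackUrl, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify(result),
        })
      );
    });

    async function runJob(input: unknown): Promise<object> {
      return { ok: true, input }; // stand-in for the real work
    }

    app.listen(Number(process.env.PORT) || 8080);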
"Cloud Run is a managed compute platform that enables you to run stateless containers that are invocable via HTTP requests. Cloud Run is serverless: it abstracts away all infrastructure management, so you can focus on what matters most — building great applications. It is built from Knative, letting you choose to run your containers either fully managed with Cloud Run, or in your Google Kubernetes Engine cluster with Cloud Run on GKE.".
So, like lambda but with containers. Pretty much what I have wanted since I first started learning about Docker.