2) Not having to worry about being hell-banned and not being able to restore your account with Google, such as "guilty-by-association" for devs in PlayStore/gApps etc trickling over to closing down the whole account. We have read so many of those stories here on HN lately that I'm terrified to use Google for anything business related as we run the risk of losing everything.
The Cloud Run product looks great and we would love to use it, but the above two issues make it a no-go for us.
They should be able to give you a pretty large chunk of credit (I assume it'll cover support as well). It should also give you folks to contact and help prevent the latter issue.
Disclaimer: I work at GCP and used to work in support.
Regarding suspension, if this is the case, why do we see so many desperate cries for help from people who are unable to get in touch with anyone at all at Google to help them with their termination? They have to resort to public blog posts and hope that a good Samaritan at Google comes along, sees them and acts.
They all have the same story: "couldn't get in touch with anyone", "tried to appeal, denied without explanation", and they all had several days, if not weeks, of downtime.
From what I've seen, the stories about people having trouble tend to be from Android app developers getting blocked from the Play Store, not GCP users.
To put it another way: do you think the level of support that Google would be able to provide is less valuable than a few percent of an engineer's salary?
1. Sales teams can help here. You have a sales rep that you should meet/interact with so that if something like that were to happen, you have a person you can reach out to. There has been a lot of news about hiring within sales.
2. Set up a proper Cloud Identity domain and organization. Manage users this way as opposed to consumer accounts.
3. Set up an invoiced billing account
A couple of direct recommendations to combat the fear of being banned.
> Today, we are announcing support for new second generation runtimes: Node.js 10, Go 1.11, and PHP 7.2 in general availability and Ruby 2.5 and Java 11 in alpha. These runtimes provide an idiomatic developer experience, faster deployments, remove previous API restrictions and come with support for native modules. The above-mentioned Serverless VPC access also lets you connect to your existing GCP resources from your App Engine apps in a more secure manner without exposing them to the internet.
We're super excited to announce Cloud Run and Cloud Run on GKE, both implementing the Knative Serving API. Please let us know if you've got any questions!
The main benefits you'd see immediately are toolchain (e.g. Docker containers, existing build systems, etc.).
Your intuition around concurrency is correct: Cloud Functions has "per instance concurrency" of 1. Cloud Run lets you go significantly higher than that (default 80). This means that our infrastructure will generally create more instances to handle a request spike when using Cloud Functions vs. Cloud Run.
Creating an instance incurs a cold start. Part of that cold start is due to our infrastructure (generally this is small) but the other part is in your control. For example: if you create a client that takes X seconds to initialize, your cold start will be at least X seconds. The initialization time will manifest as part of your cold start.
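A rough Python sketch of the part that is in your control. `SlowClient` is a hypothetical stand-in for any client with an expensive constructor, and the 0.05 second delay is made up for illustration. Creating it once and reusing it means only the first request on a fresh instance pays the cost:

```python
import time


class SlowClient:
    """Hypothetical client whose constructor stands in for slow setup work."""

    instances_created = 0

    def __init__(self, init_seconds=0.05):
        time.sleep(init_seconds)  # pretend to open connections, load config, etc.
        SlowClient.instances_created += 1

    def query(self, key):
        return f"value-for-{key}"


_client = None  # kept for the lifetime of this instance and shared by its requests


def get_client():
    """Create the client on first use, then reuse it for later requests."""
    global _client
    if _client is None:
        _client = SlowClient()
    return _client


def handle_request(key):
    return get_client().query(key)
```

This does not remove the X seconds, it only makes sure they are paid once per instance rather than once per request. With more than one concurrent request per instance you would probably also want a lock around the creation step.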
This has a few practical implications:
* writing code for Cloud Functions is generally more straightforward as single concurrency solves many problems regarding shared variables. You may also see some benefits in terms of monitoring/metrics/logging since you only need to think about one request at a time.
* you will likely see a higher incidence of cold starts on Cloud Functions during rapid scale-up, such as in response to a sudden traffic spike
* the impact of a given cold start will depend heavily on what you're doing in your container
* though I haven't validated this experimentally, I would expect that the magnitude of any given cold start (i.e., total latency contribution) would be roughly the same on Cloud Run as Cloud Functions IF you're running the same code
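To make the shared-variables point concrete, here is a minimal Python sketch of what changes once an instance handles more than one request at a time. The `RequestCounter` class and `simulate` helper are invented for illustration, and threads stand in for concurrent requests; with a concurrency of 1 the lock would be unnecessary:

```python
import threading


class RequestCounter:
    """Counter that stays correct when several requests update it at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        with self._lock:  # guard the read-modify-write on shared state
            self._count += 1
            return self._count

    def value(self):
        with self._lock:
            return self._count


def simulate(concurrent_requests, increments_per_request):
    """Run several fake requests in parallel against one shared counter."""
    counter = RequestCounter()

    def handle_request():
        for _ in range(increments_per_request):
            counter.increment()

    threads = [threading.Thread(target=handle_request) for _ in range(concurrent_requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value()
```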
I'll probably do some experimentation on my end as well to test. Any suggestion how long I should wait between tests to ensure a cold start on both Cloud Functions and Cloud Run?
As an aside, the "K_REVISION" environment variable is set to the current revision. You can log or return this value to test whether traffic has migrated to a new version (instead of waiting a minute).
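A small Python sketch of reading that variable. The `"unknown"` fallback and the `X-Revision` header name are my own choices for illustration, not anything Cloud Run defines:

```python
import os


def current_revision(environ=None):
    """Return the revision name from K_REVISION, or a placeholder when unset."""
    env = os.environ if environ is None else environ
    return env.get("K_REVISION", "unknown")


def revision_header(environ=None):
    """Build a response header (hypothetical name) that exposes the revision."""
    return {"X-Revision": current_revision(environ)}
```

Returning this on every response lets you watch the value change as traffic moves to the new revision.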
I'd encourage you to test your particular app, but you should expect similar cold start times in Cloud Run.
You can set "Maximum Requests per Container" on container deployment so you are in control whether a container has single concurrency (i.e. "Maximum Requests per Container = 1"). If your app is not CPU-bound and you allow multiple concurrent requests (the default) you should see fewer cold starts.
1. I think I have a pretty good understanding of what's going on with the lifecycle of Cloud Functions that leads to the cold start times. What happens with Cloud Run? Does it need to download the whole Docker image to a machine to run it? Seems like that would take longer.
2. App Engine has 'warmup requests', which I think are great. Is there any equivalent on Cloud Run, or plan to add?
3. Is the time that an instance is kept warm during idle similar between Cloud Functions and Cloud Run?
Cloud Run supports multi concurrency by default. You might see some benefit due to that difference (fewer cold starts, less CPU time allocated)
I noticed that the console shows a traffic percentage for each revision, but no apparent way to change it. Any plans to support traffic splitting, or at least a one-click way to re-activate (or re-deploy) an older revision?
Currently Cloud Run only supports the `runLatest` mode of the Knative Serving spec (https://github.com/knative/serving/blob/master/docs/spec/spe...) but we're working on other modes (e.g. `release`) which would allow for traffic splits.
Doesn't App Engine Flex essentially help you run an app in a container and handle scaling too?
For example, what if you're running a stateless JVM app like something on Play Framework. Could you run it on either App Engine Flex or Cloud Run, and if so, what considerations would there be for choosing one over the other?
1. Flex runs directly on VMs (sharing some GCE networking, access to GPUs, etc.), Run doesn't.
Those are the two that come up top of mind. We're working on a more comprehensive "choose your compute product" walkthrough in the near future.
It sounds like there's a bit of overlap. If I'm not up and running yet and just want the most managed solution (least work to run, with the most powerful high level features), is one of these a clear choice?
In my case, I don't care as much about low level access, but I just want my app to run quickly and efficiently. App Engine Flex's ability to run multiple services within an application, get load balancing set up, have versions of the app for testing and rollbacks, etc. seems great. However, overall I just want the easiest way to run my stateless set of services.
If you could comment on when/whether compute instances will get IPv6, that would be great also :)
I am massively excited about Cloud Run's free tier---for someone with a budget of zero, being able to get a project off the ground and functional without paying through the nose if you forget to turn down some service is incredibly useful. Getting an unexpected $40 App Engine bill at the end of the month isn't fun.
I'm definitely going to check Cloud Run out as a go-to for future projects--it looks like a really good fit for my use case.
Even silly things like https://howdelayedissfo.com (built over the weekend with Cloud Run and Firebase Hosting) are great ways to improve your skills without breaking the bank (projected costs are well within the free tier on both products).
Cloud Run can be single or multi-concurrency (defaults to 80), meaning one in 80 requests globally causes a cold start (again, generally speaking, assuming the container isn't CPU bound below that, etc.).
So, if both are running in single concurrency mode, you'll likely see similar cold start time; however, since you can go multi-concurrent in Run, you'll likely see better performance, especially when scaling up.
A nice side effect of this is that your costs will also drop, since you can pack more requests into your instance.
Edit: as noted in the other thread, because you control the container image, you can optimize that (e.g. use Alpine instead of Ubuntu) to reduce weight and decrease cold start time. But if you're using a language like Java, it's possible that starting the JVM and framework is going to be more expensive than the OS load time, so it might be a wash.
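On the concurrency point above, the back-of-the-envelope arithmetic looks roughly like this in Python. It is an idealized sketch: it assumes requests pack perfectly into instances and ignores CPU-bound workloads, and the function name is mine:

```python
import math


def instances_needed(concurrent_requests, concurrency_per_instance):
    """Idealized count of instances needed to serve a burst of requests."""
    if concurrency_per_instance < 1:
        raise ValueError("concurrency_per_instance must be at least 1")
    return math.ceil(concurrent_requests / concurrency_per_instance)
```

For a burst of 800 simultaneous requests this gives 800 instances at a concurrency of 1 and 10 instances at the default of 80, which is where both the fewer cold starts and the lower cost come from.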
I assume my container runs alongside other containers on the same host, how do you prevent privilege escalation exploits?
Containers are run in gVisor (https://github.com/google/gvisor) as the container sandbox.
(Disclaimer: I am Cloud Run dev)
The short answer is that read-only mounts are going to be super easy, but read-write is going to get really interesting (especially since there can be N instances all trying to write).
You can also set up Cloud Scheduler to push to Pub/Sub, which can trigger your function. This is helpful if you don't want your function to be available via a public HTTPS endpoint.
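For the Pub/Sub route, the function side might look roughly like this in Python. It assumes the background-function signature `(event, context)` with the message body base64-encoded under `event["data"]`; `run_nightly_job` is a hypothetical placeholder for the real work:

```python
import base64


def run_nightly_job(payload):
    """Hypothetical job body; replace with the real work."""
    return f"ran job with payload: {payload}"


def on_schedule(event, context):
    """Entry point for a Pub/Sub-triggered function fed by Cloud Scheduler."""
    data = event.get("data")
    # Pub/Sub delivers the message body base64-encoded; it may also be absent.
    payload = base64.b64decode(data).decode("utf-8") if data else ""
    return run_nightly_job(payload)
```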
functions.cron.schedule('0 0 * * *').onSchedule(...)
or something like that. Even if behind the scenes it just did exactly what I'm doing manually now, I see so many questions about cron-triggered functions that it would save a ton of people time searching for what the best route is to take.
You can configure the amount allocated: https://cloud.google.com/run/docs/configuring/memory-limits
(Disclosure: Google Cloud PM)
I'm not sure what you mean here. Cloud Run uses Docker, which runs regular processes in a cgroup, so it's sufficient to check the cgroup memory usage, right? Yes, Java can always use large heaps, but we're running Python and C++, where a process's memory usage directly relates to what a program allocates (even PyPy with GC has this property).
> The mitigation in Cloud Run is both concurrency, and that you're only billed while a request is active.
When there are memory peaks, larger deployments without container-level concurrency look better. For my example, 16GB of RAM allows running 8 containers to get a chance for a 2GB task to complete, but on average 90% of the memory will be wasted. On a single 16GB server I can run 48 tasks with 40% wasted and a high chance of the 2GB tasks finishing. Yes, in this scenario I must handle tasks killed due to OOM, but the difference in throughput is so large that it's worth it.
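The arithmetic behind those figures, as a Python sketch. The mean task size of 0.2GB is not stated in the comment; it is an inferred assumption, chosen because it is the value that makes both the 90% and the 40% figures come out consistently:

```python
TOTAL_GB = 16
PEAK_TASK_GB = 2
MEAN_TASK_GB = 0.2  # assumption: inferred from the 90% / 40% waste figures


def containers_at_peak(total_gb, peak_task_gb):
    """Containers that fit when each one reserves the peak task size."""
    return int(total_gb // peak_task_gb)


def wasted_fraction(total_gb, running_tasks, mean_task_gb):
    """Share of memory sitting idle on average for a given number of tasks."""
    used_gb = running_tasks * mean_task_gb
    return 1 - used_gb / total_gb
```

Under that assumption, reserving 2GB per container fits 8 containers using about 1.6GB on average (roughly 90% idle), while 48 tasks sharing the same 16GB use about 9.6GB (roughly 40% idle), which is six times the throughput.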
Cloud Run uses gVisor as its container runtime.
Disclaimer: I work on GCP but not much on/with Cloud Run.
The GCP-managed infrastructure exposes the Knative API, but doesn't actually run in GKE/k8s.
(I work on GCP but not much on/with Cloud Run. Above is correct to my knowledge, but I'm not an expert.)
The default is 128M and you can specify up to 2G
Cloud Run has a limit of 80 concurrent requests to a copy of your app. Cloud Run has a limit of 2 GiB for memory.
Flex gives you more flexibility over VM shape (CPU, mem), so is suited better to apps that have a higher load. Flex does not scale to zero.
Cloud Run deployments should be faster (Flex provisions a load balancer for each deployment, which can be slow).
Both support deploying directly from container image.
Disclaimer: I work on GCP but have only worked a tiny bit on/with Cloud Run.
Cloud Run will support VPC Connectors soon (it is supported, we just haven't wired up the API/UI). After that, it's your choice; they run on similar infrastructure, so you just need to decide if you want to live in containerland or source code land.
isn't that AWS Fargate?
The exciting part of Cloud Run for me is not that I don't have to manage a Kubernetes cluster, but that I don't have to pay for it when my service is sitting idle.
> Pricing is per second with a 1-minute minimum. Duration is calculated from the time you start to download your container image (docker pull) until the Task terminates, rounded up to the nearest second.
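The quoted rule boils down to a one-line formula. A Python sketch, with function and parameter names of my own choosing:

```python
import math


def billed_seconds(duration_seconds, minimum_seconds=60):
    """Round the duration up to whole seconds, then apply the 1-minute minimum."""
    return max(minimum_seconds, math.ceil(duration_seconds))
```

So a task that runs for 12.3 seconds is billed as 60 seconds, and one that runs for 60.2 seconds is billed as 61.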
Details the contract your container must meet to run on the service and limitations you might run into
Quite a nice document to be fair.
Would be interested in hearing what use cases people would use this for, over say Cloud Functions or AWS Lambda. I'd imagine the flexibility of being able to run anything that supports HTTP is quite attractive.
One thing I did spot, though, is that the container instance is limited to one vCPU. I wonder if people will hit performance ceilings? For Lambda, they abstract the CPU allocated based on the memory tiers; I'm not sure if this is the same, though.
This appears to be the accompanying blog post. So, not a dupe, and it explains the higher level goal here. The post from yesterday was before the product was actually publicly released, if that gives more context as to why this is new news.
Surely it won't take long - the product itself has a gRPC API: https://cloud.google.com/run/docs/reference/rpc/
You can run a gRPC server in GCE that, whenever it gets a gRPC request, temporarily stores the gRPC message and associates it with a session ID. It then sends an HTTP request to Cloud Run with that session ID. Your Cloud Run instance then uses that session ID to make a gRPC connection to your gRPC server in GCE, which retrieves the stored gRPC request and forwards it to the Cloud Run instance.
This is admittedly hacky, but depending on your use case, may be good enough.
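The bookkeeping in that workaround can be sketched in plain Python. This is only an in-memory simulation of the session-ID handoff: there is no real gRPC or HTTP here, and `RelayStore`, `park`, `claim` and `cloud_run_handler` are all invented names:

```python
import uuid


class RelayStore:
    """In-memory stand-in for the relay server that parks incoming messages."""

    def __init__(self):
        self._pending = {}

    def park(self, message):
        """Store an incoming message and return the session ID that identifies it."""
        session_id = uuid.uuid4().hex
        self._pending[session_id] = message
        return session_id

    def claim(self, session_id):
        """Hand the parked message over exactly once; KeyError if unknown or reused."""
        return self._pending.pop(session_id)


def cloud_run_handler(session_id, relay):
    """Stands in for the HTTP-invoked service calling back for the parked request."""
    message = relay.claim(session_id)
    return message.upper()  # placeholder for the real request handling
```

A real version would also need timeouts for messages that never get claimed and some way to route the response back to the original caller.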
On the comment about how the control plane APIs support gRPC: the serving infrastructure for the data plane is pretty different from the control plane, so it's unfortunately not a direct map to supporting gRPC on the data plane.
Another possible solution is to run an API gateway that does gRPC to JSON conversion and have that invoke your service.
EDIT: looks like we could do unary gRPC if gRPC wasn't only supported on HTTP/2. So the current implementation basically requires we solve sticky sessions/streaming.
Cloud Run on GKE brings a managed experience for Knative/serving and Istio that aligns with Cloud Run. We install and manage the Knative version in your cluster and keep it running for you. Above what you get with base Kubernetes ability to deploy a container, Cloud Run on GKE gives you request-based auto-scaling of container instances, network programming with Istio, and Stackdriver integration for logging, monitoring, and metrics. You also get the Cloud Run UI and CLI to deploy, manage, and update your services.
One of our core goals with Cloud Run is to enable serverless portability; not just for workloads, but tooling and developer experiences. By offering Cloud Run in both a hosted and Kubernetes platform, both enabled by Knative, you can leverage compatible serverless tooling and knowledge across the range of platforms from hosted to GKE to k8s anywhere.
This post explains it a lot better. Quoted from the doc (why they didn't include it in the page is beyond me):
"Cloud Run is a managed compute platform that enables you to run stateless containers that are invocable via HTTP requests. Cloud Run is serverless: it abstracts away all infrastructure management, so you can focus on what matters most — building great applications. It is built from Knative, letting you choose to run your containers either fully managed with Cloud Run, or in your Google Kubernetes Engine cluster with Cloud Run on GKE."
So, like lambda but with containers. Pretty much what I have wanted since I first started learning about Docker.
# deploy to the fully managed product
gcloud beta run deploy --image gcr.io/foo/bar --region us-central1
# deploy to a GKE cluster
gcloud beta run deploy --image gcr.io/foo/bar --cluster myCluster --cluster-location us-central1-a
Cloud Run on GKE specifically adds the Knative CRDs (e.g. Knative Serving) to your GKE cluster, giving you additional autoscaling and different rollout primitives.
Is it just us-central?
Currently, I use GCE with create-with-container and Cloud scheduler to manage batch loads.
I understand this is kind of a hard CS problem and is basically rooted in needing to move a lot of data around very quickly, all while making the software low latency as well.
But solving these is kind of the point of a FaaS.
Otherwise we could just run containers on VMs and autoscale ourselves. With terrible cost and cold start times.
Maybe look at how Jelastic does it for a unique take on this. https://jelastic.com/public-cloud-pricing/
This is probably the model that would make users the happiest.
Having poked a bit at how Jelastic does it, it seems they drop a pool of users on a 32-core machine and load balance the users containers with live migration to less utilized machines.
Imagine GCP doing something similar. Drop a big pool of users onto 96 vCPU instances.
If an instance starts to get overutilized, live migrate some user containers to a new instance.
Same type of thinking that is probably behind AWS Lambda: how do you satisfy bursty workloads for a lot of users cost effectively? Pool the users. Assume they won't all burst at once.
That has got to be one of the biggest benefits to large public cloud computing.
If so, no, though we're working on supporting Cloud Armor, CDN, Load Balancing, etc.
There are some interesting billing implications of persistent connections (e.g. we don't know what's going on in that session [if there's traffic or not]), so we'll likely have to bill you the entire time the connection is open.
I'm quite aware that there's a technology hype cycle in web development, quickly replacing last year's fad. But whoever came up with the idea of putting the words `traditional` ("following or belonging to the customs or ways of behaving that have continued in a group of people or society for a long time without changing") and `serverless` together is taking this to a new level of ridiculousness.
Kidding aside, how would you suggest we re-phrase this to make it clear that we think that arbitrary Docker containers + conformance to the Knative spec (which you can run anywhere) is a clear differentiator?
As for running regular Docker/OCI containers, I would put more emphasis on how awesome this is for developers:
It's just regular HTTP in Docker containers. You can debug and develop this locally without mocking anything, and it's the same code that runs in production!
And it's all open source. No need to install a Lambda emulator that is almost, but not quite, like the real thing.
I never even considered using AWS Lambda due to the lock in and how annoying it is to work with, but this is something I'll evaluate (by trying it on my local k8s cluster).
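The "regular HTTP" point is easy to show with nothing but the Python standard library. A minimal sketch, assuming the convention that the container listens on the port given in the `PORT` environment variable (falling back to 8080 here); the handler and response text are made up:

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Plain HTTP handler; nothing in it is specific to any platform."""

    def do_GET(self):
        body = b"hello from a plain HTTP container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet in this sketch


def make_server(port=None):
    """Build a server on the given port, or on PORT from the environment."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), Handler)
```

Calling `make_server().serve_forever()` starts it, and the same file runs unchanged on a laptop or inside a container image.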
From my outsider's perspective, serverless is something that just recently came into mainstream usage. People who this product is made for probably see this differently and thus wouldn't have a problem with the formulation.
Language pedantry aside, good luck with your product launch :)