Cloud Run – Newest member of our serverless compute stack (cloud.google.com)
201 points by wilsynet on Apr 9, 2019 | 137 comments

To Google people: do you have any suggestions for how startups can start using your services for production workloads while: 1) Not having to pay the $250 monthly fee per dev who needs to be able to contact you. It just massively increases the starting cost for your services.

2) Not having to worry about being hell-banned and unable to restore your account with Google, such as "guilt-by-association" bans, where issues with devs in the Play Store/gApps etc. trickle over into closing down the whole account. We have read so many of those stories here on HN lately that I'm terrified to use Google for anything business-related, as we run the risk of losing everything.

The Cloud Run product looks great and we would love to use it, but the above two issues make it a no-go for us.

Have you reached out to: https://cloud.google.com/developers/startups/

They should be able to give you a pretty large chunk of credit (I assume it'll cover support as well). It should also give you folks to contact and help prevent the latter issue.

Thanks; however, it seems limited to 12 months only, which is usually the initial prototyping phase of a product prior to launching. Also, it feels strange to have to be in some type of program and beg for mercy with a contact person in order to alleviate the extremely strange behavior of permanent cancellation by algorithms. What if the contact person we have gotten to know has moved on to a different business? It feels unreliable.

If you purchase Google production support, with credits or otherwise, it's available 24/7 and not tied to a single "contact person". In addition, billing support (which includes things like being suspended for fraud or abuse) is free, available to all, and also 24/7. All these channels are responded to by real live humans who can see what "the algorithm" did and why, and overrule it if needed.

Disclaimer: I work at GCP and used to work in support.

I believe you are incorrect: https://cloud.google.com/support/ states $250/month/user. Hence, it seems to be tied to a single contact person, why else would it state pricing per user?

Regarding suspension, if this is the case, why do we see so many desperate cries for help from people who are unable to get in touch with anyone at all at Google to help them with their termination? They have to resort to public blog posts and hope that a good Samaritan at Google comes along, sees them and acts.

They all have the same story: "couldn't get in touch with anyone," "tried to appeal, denied without explanation," and they all had several days if not weeks of downtime.

Ah, I see what you mean: that's referring to users on the customer's side, not Google's. You also can adjust the number & role of users on the fly, so you can create another production support user if somebody else needs to file a ticket urgently.

From what I've seen, the stories about people having trouble tend to be from Android app developers getting blocked from the Play Store, not GCP users.

I can understand the concern about getting your account banned, but: if after 12 months, your startup can't afford $250/month for support, you probably have bigger problems.

To put it another way: do you think the level of support that Google would be able to provide is less valuable than a few percent of an engineer's salary?

Google support seems to charge per person, so suppose you are a few people, i.e., at least 3 sharing the support; then it's $750/month. Also, you are assuming US salaries; in many other countries, $250 is upwards of 10% of a salary. Google support is valuable, but going with AWS, for example, is a lot cheaper. Now, if you are a VC-backed or otherwise funded company, then this is a non-issue, but there are hundreds of thousands of businesses whose total spend on infra would be less than the support fee from Google.

That categorization dismisses a host of evenings-and-weekends part-time bootstrapped startups.

GCP Community Slack is a great place to get access to both Googlers and non-Googlers who use and understand the platform.

I recommend starting by asking your questions on StackOverflow or other public forums. Many Googlers (including me) are very active on StackOverflow and monitor specific tags. For Cloud Run, please use `google-cloud-run`.

Hell-banning is by far my biggest fear with Google. The idea that you could be banned for life without any explanation just seems completely insane. It feels dystopian.

A couple of direct recommendations to combat this fear.

1. Sales teams can help here. You should have a sales rep that you meet/interact with, so that if something like this were to happen, you have a person you can reach out to. There has been a lot of news about hiring within sales.

2. Set up a proper Cloud Identity domain and organization. Manage users this way as opposed to consumer accounts.

3. Set up an invoiced billing account


I'm so excited about the App Engine updates. I'll be signing up for the Ruby 2.5 runtime to replace Heroku in our stack.

> Today, we are announcing support for new second generation runtimes: Node.js 10, Go 1.11, and PHP 7.2 in general availability and Ruby 2.5 and Java 11 in alpha. These runtimes provide an idiomatic developer experience, faster deployments, remove previous API restrictions and come with support for native modules. The above-mentioned Serverless VPC access also lets you connect to your existing GCP resources from your App Engine apps in a more secure manner without exposing them to the internet.

Posting here as well in case folks aren't reading the other thread: https://cloud.withgoogle.com/next/sf/sessions?session=SVR209

I am wondering if Shopify was testing it. Cloud Run seems to be exactly what Shopify was building by themselves, and they were hosting on GCP.

Hey folks, one of the Cloud Run PMs (along with @steren, @ryangregg, @lindydonna, @stewart27, and others).

We're super excited to announce Cloud Run and Cloud Run on GKE, both implementing the Knative Serving API. Please let us know if you've got any questions!

I'm about to launch an API that uses Cloud Functions for the entire app (it's a pretty simple API anyway). I know that Cloud Run gives me more features (being based on containers and all), but given that my API already runs on Cloud Functions, would I get any performance benefits by moving to Cloud Run?

Does the Cloud Functions solution meet your current performance requirements? If so, don't worry about moving it.

The main benefits you'd see immediately are toolchain (e.g. Docker containers, existing build systems, etc.).

I haven't launched it, so I'm not sure about performance. My main concerns are cold starts and concurrency. It's my understanding that Cloud Run has higher concurrency per container instance, so my guess would be that Cloud Run would give me fewer cold starts than Cloud Functions. However, since Cloud Run is a generic runtime, I'd imagine that cold starts there would be on the scale of seconds compared to milliseconds for Cloud Functions.

Cloud Functions PM here.

Your intuition around concurrency is correct: Cloud Functions has "per instance concurrency" of 1. Cloud Run lets you go significantly higher than that (default 80). This means that our infrastructure will generally create more instances to handle a request spike when using Cloud Functions vs. Cloud Run.

Creating an instance incurs a cold start. Part of that cold start is due to our infrastructure (generally this is small), but the other part is in your control. For example: if you create a client that takes X seconds to initialize, your cold start will be at least X seconds; the initialization time will manifest as part of your cold start.

This has a few practical implications:

* writing code for Cloud Functions is generally more straightforward as single concurrency solves many problems regarding shared variables. You may also see some benefits in terms of monitoring/metrics/logging since you only need to think about one request at a time.

* you will likely see a higher incidence of cold starts on Cloud Functions during rapid scale-up, such as in response to a sudden traffic spike

* the impact of a given cold start will depend heavily on what you're doing in your container

* though I haven't validated this experimentally, I would expect that the magnitude of any given cold start (i.e., total latency contribution) would be roughly the same on Cloud Run as Cloud Functions IF you're running the same code
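To make the initialization point above concrete, here's a minimal Python sketch (the client class and timings are made up for illustration): work done at module load is paid once per instance as part of the cold start, then amortized across every request that instance serves.

```python
import time

# Stand-in for an expensive client (e.g. a database connection pool)
# whose setup time contributes directly to the cold start.
class ExpensiveClient:
    def __init__(self, setup_seconds=0.0):
        time.sleep(setup_seconds)  # simulates real connection setup
        self.ready = True

# Module-level init: paid once per instance, as part of the cold start,
# then reused by every request the instance handles afterwards.
CLIENT = ExpensiveClient()

def handle_request():
    # Requests after the first reuse the already-initialized client.
    return "ok" if CLIENT.ready else "init failed"

print(handle_request())
```

The same structure applies in any runtime: whatever `setup_seconds` really is for your client, it adds to the latency of the first request each new instance serves.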

Ah, thanks for the details there! So, given that my Cloud Functions project is a Go app (and would be the exact same code between Functions and Run), if I were to run that in a very minimal container (something like Alpine), I could get roughly the same cold start time as Cloud Functions, but fewer of them since I can respond to multiple requests using the same instance.

I'll probably do some experimentation on my end as well to test. Any suggestion how long I should wait between tests to ensure a cold start on both Cloud Functions and Cloud Run?

I think you can force cold starts between your tests by re-deploying your function/container. You could (optionally) leave a small buffer (<1 minute) after the deployment to ensure that traffic has fully migrated.

I spoke too soon. The deploy itself will bring up an instance, rather than your first request doing so. To force a cold start, you could set concurrency to '1' and send two concurrent requests. You should see a log entry such as the following when a new instance starts up: "This request caused a new container instance to be started and may thus take longer and use more CPU than a typical request." Alternatively, you could set up an endpoint that shuts down the server (which will shut down the instance; not advised for production code).

As an aside, the "K_REVISION" environment variable is set to the current revision. You can log or return this value to test whether traffic has migrated to a new version (instead of waiting a minute).
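For instance, a handler could surface that variable in its response, sketched here in Python (the fallback string is my own choice for local runs):

```python
import os

def revision_banner():
    # Cloud Run sets K_REVISION to the name of the serving revision;
    # fall back to "unknown" when running outside Cloud Run (e.g. locally).
    return f"revision: {os.environ.get('K_REVISION', 'unknown')}"

print(revision_banner())
```

Returning this from a health or debug endpoint lets you confirm, per request, which revision actually served it.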

Yes on both counts.

Disclosure: Cloud Run Engineer

I'd encourage you to test your particular app, but you should expect similar cold start times in Cloud Run.

You can set "Maximum Requests per Container" on container deployment, so you are in control of whether a container has single concurrency (i.e. "Maximum Requests per Container = 1"). If your app is not CPU-bound and you allow multiple concurrent requests (the default), you should see fewer cold starts.

Thanks very much! Could you answer the following questions about cold start times in Cloud Run or point me to a good resource:

1. I think I have a pretty good understanding of what's going on with the lifecycle of Cloud Functions that leads to the cold start times. What happens with Cloud Run? Does it need to download the whole Docker image to a machine to run it? Seems like that would take longer.

2. App Engine has 'warmup requests', which I think are great. Is there any equivalent on Cloud Run, or plans to add one?

3. Is the time that an instance is kept warm during idle similar between Cloud Functions and Cloud Run?


1. Both cases grab the image and run it. Better per-layer caching (including very aggressive caching of common layers) is coming soon, so stay tuned.

2. No current equivalent, though there are thoughts on exposing more scaling control knobs (e.g. max-instances, min-instances). Max is easy; min is harder because of the cost implications. GAE was billed on "instance hours" but Run is billed on CPU time, so if you go "min-instances=1" you're paying for a VM. Something like Run on GKE (where you're already paying for the compute) probably makes more sense for exposing these controls.

3. Yes, though since Run can be multi-concurrent, for certain (most?) load profiles you're going to have way fewer cold starts because the instance is already handling requests.

Just wanted to say thanks to you and the rest of the GCP crew for being all over this thread. Most appreciated!

Cloud Functions only supports single concurrency, one request per instance at a time.

Cloud Run supports multi-concurrency by default. You might see some benefit due to that difference (fewer cold starts, less CPU time allocated).

Just tried creating a service with the sample image and love the quick deployment time compared to App Engine Flex.

I noticed that the console shows a traffic percentage for each revision, but no apparent way to change it. Any plans to support traffic splitting, or at least a one-click way to re-activate (or re-deploy) an older revision?

Yes, there are plans for traffic splitting.

Currently Cloud Run only supports the `runLatest` mode of the Knative Serving spec (https://github.com/knative/serving/blob/master/docs/spec/spe...) but we're working on other modes (e.g. `release`) which would allow for traffic splits.

When would you use Cloud Run vs App Engine Flex?

Doesn't App Engine Flex essentially help you run an app in a container and handle scaling too?

For example, what if you're running a stateless JVM app like something on Play Framework. Could you run it on either App Engine Flex or Cloud Run, and if so, what considerations would there be for choosing one over the other?

0. Flex doesn't scale to zero, Run does.

1. Flex runs directly on VMs (sharing some GCE networking, access to GPUs, etc.), Run doesn't.

Those are the two that come to mind first. We're working on a more comprehensive "choose your compute product" walkthrough for the near future.

Comparison doc in the future would be great!

It sounds like there's a bit of overlap. If I'm not up and running yet and just want the most managed solution (least work to run, with the most powerful high level features), is one of these a clear choice?

In my case, I don't care as much about low-level access; I just want my app to run quickly and efficiently. App Engine Flex's ability to run multiple services within an application, set up load balancing, keep versions of the app for testing and rollbacks, etc. seems great. However, overall I just want the easiest way to run my stateless set of services.

Can you provide some information about when you might be adding support for IPv6? It's a deal-breaker for our particular use cases.

Can you clarify what is not working for you? IPv6 should work. (I work on Cloud Run).

Sorry, my question was about outbound IPv6. At the backend, we need full IPv4 and IPv6 network connectivity to the outside world. (I haven't tried Cloud Run, but I read through the docs and there are no indications [that I could find] that IPv6 is supported, as with other GCP services.)

IPv6 TCP and UDP outbound works. :)

That's great to hear, thanks! Is there a page in the docs where this is documented? I'd also like to know if there are any restrictions (e.g., are all outbound ports open, etc)?

If you could comment on when/whether compute instances will get IPv6, that would be great also :)

Hello, I had a question about how Cloud Run handles security between containers being run on the cloud.

I assume my container runs alongside other containers on the same host; how do you prevent privilege escalation exploits?

Cloud Run engineer here.

Containers are run in gVisor https://github.com/google/gvisor as the container sandbox.

Cloud Run prevents privilege escalation by using gVisor as its sandbox technology. Each container has its own user-space kernel, isolated from the host.

(Disclaimer: I am a Cloud Run dev)

I'm a student, mainly using GCP for hobby/hackathon projects, and to teach myself.

I am massively excited about Cloud Run's free tier. For someone with a budget of zero, being able to get a project off the ground and functional without paying through the nose if you forget to turn down some service is incredibly useful. Getting an unexpected $40 App Engine bill at the end of the month isn't fun.

I'm definitely going to check Cloud Run out as a go-to for future projects; it looks like a really good fit for my use case.

We're super excited to see what you build :D

Even silly things like https://howdelayedissfo.com (built over the weekend with Cloud Run and Firebase Hosting) are great ways to improve your skills without breaking the bank (projected costs are well within the free tier on both products).

Does cloud run have the same, better, or worse cold start latency compared to cloud functions?

Cloud Functions is single concurrency, meaning every new request causes a cold start (generally speaking).

Cloud Run can be single or multi-concurrency (defaults to 80), meaning one in 80 requests globally causes a cold start (again, generally speaking, assuming the container isn't CPU bound below that, etc.).

So, if both are running in single concurrency mode, you'll likely see similar cold start time; however, since you can go multi-concurrent in Run, you'll likely see better performance, especially when scaling up.

A nice side effect of this is that your costs will also drop, since you can pack more requests into your instance.

Edit: as noted in the other thread, because you control the container image, you can optimize that (e.g. use Alpine instead of Ubuntu) to reduce weight and decrease cold start time. But if you're using a language like Java, it's possible that starting the JVM and framework is going to be more expensive than the OS load time, so it might be a wash.
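The "one in 80 requests" intuition above can be sanity-checked with a rough back-of-envelope estimate (the traffic numbers here are illustrative, and this ignores CPU limits and burstiness):

```python
import math

def instances_needed(rps, latency_s, concurrency):
    # Rough steady-state estimate: requests in flight (rps * latency)
    # divided by how many requests each instance can serve at once.
    return math.ceil(rps * latency_s / concurrency)

# 400 req/s at 200 ms average latency = 80 requests in flight.
print(instances_needed(400, 0.2, 1))   # one request per instance -> 80
print(instances_needed(400, 0.2, 80))  # Cloud Run default of 80 -> 1
```

Each of those instances had to cold-start at some point, which is why higher per-instance concurrency generally means fewer cold starts (and lower cost) for the same traffic.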

Sorry, just asked a similar question in response to a different comment, but just saw this detailed response. Do you know if there are any plans to add 'warmup requests' to Cloud Run like in App Engine?

Replied there!

Can your running container instances mount a Filestore NFS endpoint?

Not yet, though I'm looking into both Filestore as well as GCSFuse.

The short answer is that read-only mounts are going to be super easy, but read-write is going to get really interesting (especially since there can be N instances all trying to write).

Any chance there will be cron jobs for functions? Right now, I'm using AppEngine just to schedule a function to run.

Check out Cloud Scheduler (https://cloud.google.com/scheduler/) which can target an arbitrary HTTP URL, and will support authenticated push to securely target Cloud Functions and Cloud Run (I think this is going to public beta this week).


You can also set up Cloud Scheduler to push to Pub/Sub, which can trigger your function. This is helpful if you don't want your function to be available via a public HTTPS endpoint.

Note to Firebase team: I use exactly this (a Cloud Scheduler job that pushes to a pub/sub topic, and then set a Firebase function that runs on a topic trigger) to schedule functions, but would be really nice if I could just create a triggered cron function like this:

functions.cron.schedule('0 0 * * *').onSchedule(...)

or something like that. Even if, behind the scenes, it just did exactly what I'm doing manually now. I see so many questions about cron-triggered functions that it would save a ton of people time spent searching for the best route to take.

We are working on this right now, so stay tuned :)

Does max. RAM usage need to be specified upfront, or what's billed is the live-allocated amount?

"You are billed only for the CPU and memory allocated while a request is active on a container instance, rounded up to the nearest 100 milliseconds."


You can configure the amount allocated: https://cloud.google.com/run/docs/configuring/memory-limits
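The rounding rule quoted above is easy to work through in a couple of lines (a sketch of the arithmetic, not billing code):

```python
import math

def billable_ms(active_ms):
    # Active request time is rounded UP to the nearest 100 ms.
    return math.ceil(active_ms / 100) * 100

print(billable_ms(30))    # a 30 ms request is billed as 100 ms
print(billable_ms(230))   # 230 ms rounds up to 300 ms
print(billable_ms(1000))  # exact multiples are unchanged: 1000 ms
```

So very short requests pay proportionally more for the rounding, which is another reason packing concurrent requests onto one instance helps costs.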

OK, thanks. I'm looking for a way to run a service with memory peaks: p50 is 200 MB and p99 is 2 GB. It looks like no current serverless solution would handle it without overpaying by allocating 2 GB for all executions.

Memory is a difficult one, especially in garbage-collected languages which have a habit of filling up the heap even when it's not used, so it's not always obvious how much memory is actually being used without having language/runtime specific signals. The mitigation in Cloud Run is both concurrency, and that you're only billed while a request is active.

(Disclosure: Google Cloud PM)

> Memory is a difficult one, especially in garbage-collected languages which have a habit of filling up the heap even when it's not used, so it's not always obvious how much memory is actually being used without having language/runtime specific signals.

I'm not sure what you mean here. Cloud Run uses Docker, which runs regular processes in a cgroup, so it's sufficient to check the cgroup memory usage, right? Yes, Java can always use large heaps, but we're running Python and C++, where a process's memory usage directly relates to what the program allocates (even PyPy with its GC has this property).

> The mitigation in Cloud Run is both concurrency, and that you're only billed while a request is active.

When there are memory peaks, larger deployments without container-level concurrency look better. For my example, 16 GB of RAM allows running 8 containers to give a 2 GB task a chance to complete, but on average 90% of the memory will be wasted. On a single 16 GB server I can run 48 tasks with 40% wasted and a high chance of the 2 GB tasks finishing. Yes, in this scenario I must handle tasks killed due to OOM, but the difference in throughput is so large that it's worth it.

> Cloud Run uses Docker which runs regular processes in a cgroup, so it's sufficient to check the cgroup memory usage, right?

Cloud Run uses gVisor as its container runtime.

Disclaimer: I work on GCP but not much on/with Cloud Run.

Cloud Run is built upon Knative, so it runs inside k8s (which uses the CRI API of a backend, Docker or any other alternative), which handles OOM via its internal resource manager, not cgroups.

You can use Cloud Run with GCP-managed infrastructure, and also inside your own GKE cluster.

The GCP-managed infrastructure exposes the Knative API, but doesn't actually run in GKE/k8s.

(I work on GCP but not much/on Cloud Run. Above is correct to my knowledge, but I'm not an expert.)

Good to know, thank you.

Yes, it needs to be specified upfront.

The default is 128 MB, and you can specify up to 2 GB.

Hi Mike, this looks great. I took it for a test-drive with some OpenFaaS functions. https://www.openfaas.com/blog/openfaas-cloudrun/

Can you compare Cloud Run to App Engine Flex? They sound similar.

Cloud Run uses gVisor for its sandbox, Flex runs your container in a dedicated VM.

Cloud Run has a limit of 80 concurrent requests to a copy of your app. Cloud Run has a limit of 2 GiB for memory.

Flex gives you more flexibility over VM shape (CPU, memory), so it is better suited to apps that have a higher load. Flex does not scale to zero.

Cloud Run deployments should be faster (Flex provisions a load balancer for each deployment, which can be slow).

Both support deploying directly from container image.

Disclaimer: I work on GCP but have only worked a tiny bit on/with Cloud Run.

Thanks for all the helpful comments here! We run a Rails app. If we want to replace Heroku with a Google Cloud service, would Cloud Run be the best one?

I'm doing a talk at Next tomorrow about just that. For now you are probably better off using Ruby on App Engine, assuming you need to use Redis as a cache. Another new feature that we launched today is VPC Connectors, which allow Cloud Functions and App Engine to connect to servers/services in a VPC.

Cloud Run will support VPC Connectors soon (it is supported, we just haven't wired up the API/UI). After that, it's your choice; they run on similar infrastructure, so you just need to decide if you want to live in container-land or source-code-land.

I'll post the example source code here tomorrow after the talk (still tweaking it).

OK, thanks! How about persisting files? (That actually isn't possible on Heroku, but we would like to persist files on disk for simplicity, so we're debating just using AWS.)

We provide an in-memory writable filesystem (e.g. you want to do file transformations) but it's not persistent across all instances. We're looking into Filestore integration for a mountable NFS product, though as mentioned in another comment about that, reading is easy but writing is hard.

Why is writing hard? We already use Filestore across multiple VMs in read-write mode. We ensure that writes are unique, i.e., no two machines will use the same filename. There are also no file modifications.
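That write-uniqueness convention is simple to implement; a tiny Python sketch (the mount path and extension are illustrative):

```python
import os
import uuid

def unique_path(base_dir):
    # Embed a random UUID so no two instances (or machines) ever pick
    # the same filename, avoiding write conflicts on the shared mount.
    return os.path.join(base_dir, f"{uuid.uuid4().hex}.dat")

# Two calls, even from different machines, will not collide.
print(unique_path("/mnt/filestore") != unique_path("/mnt/filestore"))
```

Combined with never modifying files after creation (as described above), this sidesteps most of the coordination problems of multi-writer NFS.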

Is the Knative Build API supported as well?

My understanding is that we're working on having Cloud Build support Tekton [1,2] (which is where the Knative Build API now lives)

[1] https://tekton.dev/ [2] https://cloud.google.com/tekton/

From a developer perspective, deploying a docker container with a plain http server is much more appealing than the hoops you have to jump through to use the lambda custom runtime stuff. I hope this gets AWS to provide a docker on lambda option.

> AWS to provide a docker on lambda option

Isn't that AWS Fargate?

Yes and no. Fargate is a Docker container which runs on an ECS cluster that you don't have to manage, sure, but it doesn't scale down to zero. As far as I know, there's no support for running an HTTP endpoint and then having the container start in response to a request coming in. You could build that yourself (although I suspect it would require running some long-lived infrastructure, defeating the purpose of scaling the Fargate service down to zero), but I think the cold start times for a Fargate container would be prohibitive. Maybe it gets better once you've already scaled up, but in my exploration I've seen Fargate take 45-70 seconds to run a new container. I suspect this is due to Fargate running in your VPC and therefore probably requiring a network interface to be created before the container can be ready.

The exciting part of Cloud Run for me is not that I don't have to manage a Kubernetes cluster, but that I don't have to pay for it when my service is sitting idle.

Looks like AWS Fargate pricing is based on the duration of tasks.

> Pricing is per second with a 1-minute minimum. Duration is calculated from the time you start to download your container image (docker pull) until the Task terminates, rounded up to the nearest second.


A "task" in ECS means a container. In Fargate, in order to be ready to receive requests 24/7, you have to pay to have a container running 24/7. In that sense, it feels a lot closer to EC2 than it does to Lambda.

No. Lambda charges you only for when your application is actually handling requests (or other events). That isn't an option with Fargate.


> Details the contract your container must meet to run on the service and limitations you might run into

Quite a nice document to be fair.

Would be interested in hearing what use cases people would use this for, over say Cloud Functions or AWS Lambda. I'd imagine the flexibility of being able to run anything that supports HTTP is quite attractive.

One thing I did spot, though, is that the container instance is limited to one vCPU; I wonder if people will hit performance ceilings? For Lambda, CPU allocation is abstracted based on the memory tiers; I'm not sure if this is the same here.

Oh! It's great documentation: succinct, clear, to the point...

FYI - thread from yesterday around the docs and concept: https://news.ycombinator.com/item?id=19610830

This appears to be the accompanying blog post, so it's not a dupe, and it explains the higher-level goal here. The post from yesterday was before the product was actually publicly released, if that gives more context as to why this is news.

FYI - I just posted a review video where I walk through Cloud Run doing some demos. Check it out at: https://sysadmincasts.com/episodes/69-cloud-run-with-knative

Serverless with gRPC would be awesome. There's grpc-gateway, but native support would be even better.

Surely it won't take long - the product itself has a gRPC API: https://cloud.google.com/run/docs/reference/rpc/

Depending on your use case, there is a way to proxy gRPC to Cloud Run in a slightly hacky way leveraging the fact that outbound gRPC works.

You can run a gRPC server in GCE that, whenever it gets a gRPC request, temporarily stores the gRPC message and associates it with a session ID. It then sends an HTTP request to Cloud Run with that session ID. Your Cloud Run instance then takes that session ID and makes a gRPC connection back to your gRPC server in GCE. The GCE instance uses the session ID to retrieve the stored gRPC request and forward it to the Cloud Run instance.

This is admittedly hacky, but depending on your use case, may be good enough.
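The core of that handoff is just a pending-request table keyed by session ID. A minimal in-memory Python sketch (names are illustrative; a real proxy would need actual gRPC plumbing, timeouts, and cleanup):

```python
import uuid

# The GCE proxy's pending-request table: session ID -> serialized gRPC message.
PENDING = {}

def store_request(payload: bytes) -> str:
    # Proxy side: stash the inbound gRPC message and hand the session ID
    # to Cloud Run via a plain HTTP request.
    session_id = uuid.uuid4().hex
    PENDING[session_id] = payload
    return session_id

def retrieve_request(session_id: str) -> bytes:
    # Cloud Run side: dial back to the proxy over outbound gRPC (which
    # does work) and fetch the original message by session ID.
    return PENDING.pop(session_id)

sid = store_request(b"grpc-message-bytes")
print(retrieve_request(sid) == b"grpc-message-bytes")
```

The extra round trip adds latency, which is part of why this only makes sense as a stopgap until inbound gRPC is supported natively.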

Streaming HTTP and gRPC support is on our roadmap, but it's still a ways out :(

On the comment about how the control plane APIs support gRPC: the serving infrastructure for the data plane is pretty different from the control plane, so it's unfortunately not a direct map to supporting gRPC on the data plane.

Another possible solution is to run an API gateway that does gRPC to JSON conversion and have that invoke your service.

Unary gRPC calls would cover all of my use cases that would benefit from this product. Is that any closer on the roadmap?

My understanding is that the major issue we have is with streaming (sticky sessions to backends that are autoscaling is tricky), so I assume that unary would be easier. I admit that I haven't dived too deeply in to this though.

EDIT: looks like we could do unary gRPC if gRPC weren't only supported on HTTP/2. So the current implementation basically requires that we solve sticky sessions/streaming.

No gRPC support means no server-side Firebase integration, right?

We don't support inbound gRPC/streaming, but we do support outbound gRPC/streaming, so things like Firestore (or other streaming gRPC products) will work fine.

Oh, nice. Thanks for the clarification.

What does "Cloud Run on Google Kubernetes Engine" mean? Is there anything beyond the idea that a plain docker container can run on Cloud Run or on GKE?

Product manager for Cloud Run on GKE here.

Cloud Run on GKE brings a managed experience for Knative/serving and Istio that aligns with Cloud Run. We install and manage the Knative version in your cluster and keep it running for you. Above what you get with base Kubernetes ability to deploy a container, Cloud Run on GKE gives you request-based auto-scaling of container instances, network programming with Istio, and Stackdriver integration for logging, monitoring, and metrics. You also get the Cloud Run UI and CLI to deploy, manage, and update your services.

One of our core goals with Cloud Run is to enable serverless portability; not just for workloads, but tooling and developer experiences. By offering Cloud Run in both a hosted and Kubernetes platform, both enabled by Knative, you can leverage compatible serverless tooling and knowledge across the range of platforms from hosted to GKE to k8s anywhere.

Does this mean that by using Cloud Run GKE I can also leverage the primitives provided by Istio and Knative Serving?

Cloud Run on GKE is Knative (which relies on Istio) installed on your GKE cluster, and updated by us. So yes, you can go and muck around with lower level Istio/K8s primitives, but only to a point (there are places where you can break Cloud Run on GKE if you configure things particular ways). Over time we're working on making those actions more clear and ensuring that you get all the benefits of the Istio mesh built in, plus the ability to reconfigure to suit your needs (e.g. adding Istio RBAC like we have with Cloud IAM on Cloud Run).

Gotcha! So, things like adding Knative eventing which uses serving should be fine while using Cloud Run?


I am so glad somebody asked. Five seconds into the page I had to Google what this product is all about. It seems a new meaning gets fitted onto the words Cloud and Serverless every five months, and the page reads like it was written by the marketing team.

This post [1] explains it a lot better. Quoted from the doc (why they didn't include it on the page is beyond me):

"Cloud Run is a managed compute platform that enables you to run stateless containers that are invocable via HTTP requests. Cloud Run is serverless: it abstracts away all infrastructure management, so you can focus on what matters most — building great applications. It is built from Knative, letting you choose to run your containers either fully managed with Cloud Run, or in your Google Kubernetes Engine cluster with Cloud Run on GKE."

So, like Lambda, but with containers. Pretty much what I have wanted since I first started learning about Docker.

[1] https://news.ycombinator.com/item?id=19611194
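To make "Lambda but with containers" concrete: a Cloud Run service is just any container that serves HTTP on the port given in the `PORT` environment variable. A minimal sketch using only the Python standard library (the greeting text and handler name are illustrative, not from any Google sample):

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Respond to any GET with a plain-text greeting.
        body = b"Hello from a container!\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    # Cloud Run injects the port to listen on via $PORT (default 8080).
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("", port), HelloHandler).serve_forever()


if __name__ == "__main__":
    main()
```

Package that with any base image in a Dockerfile and the same container runs unmodified on your laptop, on Cloud Run, or on a Knative-enabled cluster.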

It means that we support the same toolchain across Cloud Run (running on our fully managed infrastructure) as well as Cloud Run on GKE (running on your k8s cluster). For example:

```
# deploy to the fully managed product
gcloud beta run deploy --image gcr.io/foo/bar --region us-central1

# deploy to a GKE cluster
gcloud beta run deploy --image gcr.io/foo/bar --cluster myCluster --cluster_location us-central1-a
```

Cloud Run on GKE specifically adds the Knative CRDs (e.g. Knative Serving) to your GKE cluster, giving you additional autoscaling and different rollout primitives.

Thanks Google cloud team this looks really great. Quick question: Is Cloud Run a zonal or regional resource and is there any way to put this behind the global load balancer? Thanks.

It's regional (and currently only in `us-central1`, though more regions are on the way). We're working on GCLB integration.

Great good to know. Thanks.

We at Cloudrun [0] are excited to try Cloud Run for scaling our platform! :)

[0] https://cloudrun.co/

While listening to Cloud Run on Spotify https://open.spotify.com/artist/4gCabDwZ2AdhrBYYWQT9t2?si=tL...

Is there info on cold start times for various languages, or configurations? Lambdas, for example, are pretty slow starting if they need connectivity into a VPC, or use Java.

I’m also curious about cold start performance and latency.

Wanted to drop in and say that I've seen this and am looking for any data that we're willing/able to share on our internal benchmarks.

For hosting a Firebase app, with the hosting rewrites (like: https://firebase.google.com/docs/hosting/cloud-run), what regions is this available in?

Is it just us-central?

Currently just `us-central1` since that's the only region Cloud Run is supported in.

Can Cloud Run work with GCP Memorystore (managed cache)? Also, one limitation of GCP Cloud Functions was the inability to communicate directly with a VM on a private network, although a solution is in beta, I think. Does Cloud Run have this limitation?

We'll be adding VPC Connectors to Cloud Run shortly, which will let it talk to Memorystore. As you mentioned, they went Beta for GCF and GAE today.

I have so many batch jobs that run on GCE. Can I run some of them on Cloud Run? It looks like it can only run HTTP services right now. My use case is very similar to AWS Batch.

Currently, I use GCE with create-with-container and Cloud scheduler to manage batch loads.

Cloud Scheduler can trigger a Cloud Run service. And this Cloud Run service can run anything, including bash scripts. Take a look at the "Shell" tab in https://cloud.google.com/run/docs/quickstarts/build-and-depl...
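A rough sketch of that pattern: wrap the batch work behind an HTTP handler, and point a Cloud Scheduler job at the service's URL with a POST. The shell command and function names below are placeholders, not from Google's docs; note this approach assumes the job finishes within the request timeout:

```python
import os
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_batch_job():
    # Placeholder for the real batch work; here we just run a shell command
    # the same way a container could invoke any script it ships with.
    result = subprocess.run(["echo", "batch job done"],
                            capture_output=True, text=True)
    return result.stdout


class JobHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Cloud Scheduler is configured to POST here on a cron schedule.
        body = run_batch_job().encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("", port), JobHandler).serve_forever()
```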

It would be really really amazing if it allowed a large number of CPU cores. Like 64 cores. That would make it interesting for ML inference workloads.

For 64 cores, I recommend leveraging custom machine types with Cloud Run on GKE. We are working on more CPU sizes for Cloud Run, but more than 2 vCPUs is not coming in the near future.

But users want the per-second billing and fast cold start times too.

I understand this is kind of a hard CS problem and is basically rooted in needing to move a lot of data around very quickly, all while making the software low latency as well.

But solving these is kind of the point of a FaaS.

Otherwise we could just run containers on VMs and autoscale ourselves. With terrible cost and cold start times.

Maybe look at how Jelastic does it for a unique take on this. https://jelastic.com/public-cloud-pricing/

This is probably the model that would make users the happiest.

Having poked a bit at how Jelastic does it, it seems they drop a pool of users on a 32-core machine and load balance the users' containers, live-migrating them to less utilized machines.

Imagine GCP doing something similar. Drop a big pool of users onto 96 vCPU instances. If an instance starts to get overutilized, live migrate some user containers to a new instance.

Same type of thinking that is probably behind AWS Lambda: how do you satisfy bursty workloads for a lot of users cost effectively? Pool the users. Assume they won't all burst at once.

That has got to be one of the biggest benefits to large public cloud computing.

Pooling already happens, AFAIK. On AWS at least, you have to pay extra to not be pooled with other customers (i.e. for exclusive use of a physical machine, regardless of your VM type).

Not just cores; memory is currently capped at 2 GB as well.

Is there support or integration for the new GCP web application firewall/security product?

I might have missed a product announcement today, but is this Cloud Armor (https://cloud.google.com/armor/)?

If so, no, though we're working on supporting Cloud Armor, CDN, Load Balancing, etc.

Yes, thanks, Cloud Armor is what I meant.

Finally a serverless technology that is actually serverless to the developer ($0 when idle).

Any ETA on being able to access Cloud SQL from Cloud Run containers?

Curious if websockets, sticky sessions are supported?

Not on Cloud Run, though you can do this on Cloud Run on GKE.

There are some interesting billing implications of persistent connections (e.g. we don't know what's going on in that session [if there's traffic or not]), so we'll likely have to bill you the entire time the connection is open.

"Traditional serverless offerings come with challenges such as constrained runtime support and vendor lock-in. [...]"

I'm quite aware that there's a technology hype cycle in web development, quickly replacing last year's fad. But whoever came up with the idea of putting the words `traditional` ("following or belonging to the customs or ways of behaving that have continued in a group of people or society for a long time without changing") and `serverless` together is taking this to a new level of ridiculousness.

Five years is a long time in a world where front end frameworks seem to change every six months ;)

Kidding aside, how would you suggest we re-phrase this to make it clear that we think that arbitrary Docker containers + conformance to the Knative spec (which you can run anywhere) is a clear differentiator?

There's a whole section on "Enabling portability with Knative" and even though I only skimmed the page, I got the message that this is an open standard just like Kubernetes.

As for running regular Docker/OCI containers, I would put more emphasis on how awesome this is for developers:

It's just regular HTTP in Docker containers. You can debug and develop this locally without mocking anything, and it's the same code that runs in production!

And it's all open source. No need to install a Lambda emulator that is almost, but not quite, like the real thing.

I never even considered using AWS Lambda due to the lock in and how annoying it is to work with, but this is something I'll evaluate (by trying it on my local k8s cluster).

I would have been probably less irritated by something simple like "current serverless solutions". But I'm not a web developer and don't understand the differentiation details you mention, so I'm not in the target demographic here.

From my outsider's perspective, serverless is something that only recently came into mainstream usage. The people this product is made for probably see it differently and thus wouldn't have a problem with the phrasing.

Language pedantry aside, good luck with your product launch :)

"First generation"? Archetypal?

I think "the entire time this thing has existed" is plenty long.
