Extending per-second billing in Google Cloud (googleblog.com)
154 points by boulos 9 months ago | 82 comments

The primary driver that's going to get me to switch to Google Cloud from AWS is their simplified pricing. AWS's reserved pricing is fraught with inefficiency and prone to manual errors where it looks like they've intentionally made it hard to determine what's covered by reserved pricing.

Everything else in AWS looks to be automated except for their pricing. I'm assuming they're avoiding an automated sustained usage model so their prices appear cheaper than they are, taking advantage of the manual inefficiencies of trying to map reserved pricing to the services used.

If you have time, please elaborate on what you'd like to see.

A "no-touch" effort-free pricing model similar to GCP's sustained usage where the hourly rate automatically drops based on continuous usage for each month.

Reserved pricing is an unnecessary waste of time and effort that turns us into manual bookkeepers who have to carefully match our reserved pricing plan to each service we use. It's time consuming and frustrating when you purchase the wrong reserved pricing SKU, or aren't notified that you have instances not covered by a reservation, so you end up paying the highest rate despite the instance running 24/7.

It's frustrating enough that I've started moving servers that don't require access to other AWS services to Hetzner.

...and EC2 reservations are still a Godsend compared to some other offerings out there, I mean at least they're interchangeable by node class:

We've got Redshift reserved instances and, due to growth, we need to upgrade our node type soon (even in the same class!!). But because you can't reuse, recycle or sell the reservations between node types, we have to let all of our small node reservations expire first, which is complicated by the fact that they were bought in several layers.

There's no reserved instance market for Redshift (none of this looks very well thought out from the beginning), and it took us hours to finally come up with a semi-good plan to do this (which will still cost thousands of dollars in lost expired instances). All of this was completely unproductive time with regard to our product, so the antithesis of what AWS stands for.

Managing financial matters on AWS is such a royal PITA, I'm so glad we switched 90% of our stack to Google.

They even give you a big fat yellow warning sign directly in the console: "Want to SAVE money? One click here and we'll resize your instance online instantly." Leave it on for a month and get nice discounts automatically. It's heaven.

(Yes, BigQuery. I'd LOVE to, however there's a lot of very good reasons we're still on Redshift...)

An exact copy of Google's compute billing system would be a step in the right direction. It is predictable, straightforward, and customer centric. Frankly what you would expect if Amazon retail designed it.

Update: I realize that their ability to offer this pricing schedule also has to do with their specific technical infrastructure and migration capabilities; but it's clearly not impossible to make these features a reality since it's been done already.

IMO it can be as simple as choosing a different pricing tier based on the number of instance hours billed.
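For illustration, here is a minimal sketch of what "a different pricing tier based on instance hours billed" could look like. The tier boundaries and multipliers below are assumptions made up for the example, not any provider's published price list, and `monthly_charge` is a hypothetical helper.

```python
# Hypothetical tiers: (fraction of the month covered, multiplier on the base hourly rate).
# These numbers are illustrative assumptions, not a published price list.
TIERS = [
    (0.25, 1.00),
    (0.25, 0.80),
    (0.25, 0.60),
    (0.25, 0.40),
]


def monthly_charge(hours_used, hours_in_month, base_rate):
    """Charge each successive slice of usage at its tier's multiplier."""
    remaining = min(hours_used, hours_in_month)
    total = 0.0
    for fraction, multiplier in TIERS:
        tier_hours = min(remaining, fraction * hours_in_month)
        total += tier_hours * base_rate * multiplier
        remaining -= tier_hours
        if remaining <= 0:
            break
    return total


# An instance that runs the whole month versus one that runs a quarter of it.
full_month = monthly_charge(720, 720, base_rate=1.0)
quarter_month = monthly_charge(180, 720, base_rate=1.0)
```

With these made-up tiers, an instance that runs all month pays 70% of the undiscounted total, and nobody has to buy or match a reservation to get there.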

I think dropping the minimum charge to one minute will turn out to have a bigger effect than the per-second billing. I build app images with Packer, and these generally take a few minutes, but I'm charged for the full ten. It's nice to know I only pay for exactly what I'm using.
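To put rough numbers on that, a small sketch of how the minimum charge and the billing increment interact. The 200-second build time is a made-up example, `billed_seconds` is a hypothetical helper, and the assumption that the old scheme rounded up to whole minutes after the ten-minute minimum is mine.

```python
import math


def billed_seconds(runtime_s, minimum_s, increment_s):
    """Seconds charged: at least the minimum, then rounded up to the increment."""
    charged = max(runtime_s, minimum_s)
    return math.ceil(charged / increment_s) * increment_s


build = 3 * 60 + 20  # a hypothetical 200-second Packer build

# Old scheme (assumed): ten-minute minimum, per-minute increments afterwards.
old = billed_seconds(build, minimum_s=600, increment_s=60)  # 600

# New scheme: one-minute minimum, per-second increments afterwards.
new = billed_seconds(build, minimum_s=60, increment_s=1)  # 200
```

For a build this short, almost all of the saving comes from the lower minimum rather than from the finer increment.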

It also opens up opportunities for spinning up a fresh VM for every CI build. GCP startup latency is pretty good, so this might be doable for some CI systems.

Even though a GCE instance can be made to boot pretty quickly, it's a bit sad that it takes around 40-50 seconds before any network connectivity can be established.

I asked a question about it here a while ago https://serverfault.com/questions/845298/no-network-connecti...

Checked again just now: approx. 40 seconds between pressing START in the console UI and being able to ping, using the serial console on Debian Stretch.

If anyone has a clue how to improve this I'd be most happy to hear about it. I'm using this for CI builds.

We're working on it. It's a major initiative for us, because as you see we get the damn thing "booting" in a handful of seconds and then reachability to/from the internet is the long pole. Fwiw, we at least got to/from *.googleapis.com way down, so if you need to say fetch something from GCS, that should be a bit faster.

Disclosure: I work on Google Cloud.

Curious (if you can go into it) what the current technical hurdles are in getting the connectivity times down, and/or what you did to get the internal connectivity going quickly.

At a high level, it's the result of having global, flat Networks and not wanting to declare the network "up" until you've "programmed" all the routes. So if you have 1000 VMs distributed globally, you get to make sure that they're not "connected" until your new VM in asia-east1-a can talk to all other VMs in your Network (and vice versa).

With the to/from API path this routing is much simpler since you don't get the N^2 behavior.
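A back-of-the-envelope way to see the difference, as a simplified counting model only. This is an assumption for illustration and says nothing about how the actual control plane programs routes; all three function names are hypothetical.

```python
def full_mesh_paths(vm_count):
    """Directed VM-to-VM paths in a flat network where every VM can reach every other."""
    return vm_count * (vm_count - 1)


def new_vm_paths(existing_vms):
    """Directed paths needed before one new VM can talk to, and hear from, every peer."""
    return 2 * existing_vms


def api_paths(vm_count):
    """Directed paths when each VM only needs to reach one API front end and back."""
    return 2 * vm_count


mesh = full_mesh_paths(1000)   # grows with the square of the VM count
one_more = new_vm_paths(1000)  # work triggered by adding a single VM
api_only = api_paths(1000)     # grows linearly with the VM count
```

In this toy model the mesh holds 999,000 directed paths for 1000 VMs, while the API-only case needs 2,000.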

Again, Disclosure: I work on Google Cloud (but I've never contributed engineering-wise to Cloud networking).

Would that connectivity happen regardless of firewall rules and such (thinking network tags specifically)? Like if I have 1000 VMs as in your example, and I spin up a new box but have a rule that says only one of those 1000 can talk to it, would the route programming still go through and assert that connectivity to all 1000? Or would it be smart enough to realize that only one of those needed to connect and finish more quickly? If the latter, what happens if I then remove that firewall rule immediately after boot?

Today's control plane is fairly smart about which routes are required for a given config change event (there's a lot one could speculate about here, especially around the word "required") -- fwiw, I also don't work on the control plane -- I have spent a lot of time on the GCE hypervisor network dataplane. So a fair amount of hand waving follows -- just assume details are missing because I don't know what they are :)

We aim for global convergence of network state as sort of an ongoing goal, but it's a distributed system with failure domain isolation, so that goal is necessarily flexible. There are different rates of convergence, and Solomon is certainly right that routes to first party services are some of the easiest to converge. Internet connectivity is to a certain extent the hardest thing to converge, as in our premium tier (which up until recently was our only tier) we aim to keep data on Google's network for as much of its journey as possible. At the extreme, this means a lot of edge nodes learning that a given external IP belongs to your VM.

Part of it is also just the mundane business of reconciling what a given configuration event means and propagating that to interested parties. With respect to your firewall example, it's really just another config change event with some set of implications for routes that are added or removed.

Anyway, that's my hand wavy explanation. Hopefully it's helpful!

OT: boulos/jsolson, thanks for being so open about the platform. Getting information like this has become rare, but it makes a very interesting read.


I love what I do, and I love talking about it to anyone who will listen. I find Google's infrastructure is fantastically exciting -- sometimes too exciting, rarely boring. Most of the time I wish I could share more. Always happy to hear someone has found what I can share interesting!

I'm not sure how well this would go over internally, but blog posts outlining cool stuff in the Google infrastructure would be really helpful to others, especially GCP users, to know how things work under the hood. I'd read that. And not even just success stories. War stories and "we tried this but it didn't work because X" stories are really valuable to others without Google's scale. Just a thought :)

> Fwiw, we at least got to/from *.googleapis.com way down, so if you need to say fetch something from GCS, that should be a bit faster.

Not used to GCE, what sort of timeframe is this expected in?

Kinda wondering what the expected time would be for a process that's something like:

Run container

Download something from GCS

null op

Upload to GCS


I know there's a lot of variables there, but does anyone have a quick idea of what this should be?

Wow really? Why does the stack take that long to establish end to end?

(Feel free to answer me internally :) )

Thanks, Very happy to hear this :-)

Take a look at the Cloud Builder if you haven't already. It's pretty nice, even if you don't build containers. There's a whole rack of undocumented stuff, but the people in the GCP Slack were able to point us towards things as we needed them. It has the property you describe, where it'll spin up a new VM for you.

What's the GCP Slack?

Google Cloud Platform. Sorry for not defining my acronyms!


I have a workable setup with GitLab CI - my g1-small runner starts a preemptible VM on GCE with docker+machine, and using node:alpine & CoreOS it's about 5-10s before it's ready and running the CI job.

Especially interesting given other similar recent announcements, Google's per-second billing also covers:

| "premium operating system images including Windows Server, Red Hat Enterprise Linux (RHEL), and SUSE Enterprise Linux Server"

(disclosure: I work for GCP)

I'm completely unsure why that is worth special mention?

(disclosure: I don't consider what I do to be work)

The AWS announcement left the licensed things as hourly granularity.

Disclosure: I helped write this blogpost ;). (And I consider what Danny does to be work)

That's the only helpful use of the disclosure line I've seen in a few years .. thanks.

Many employers request that employees provide a disclosure when discussing their products on social media.

We don't, but generally do it out of an abundance of caution.

(disclosure: I work with the above two)

I think it's a pretty good policy.

Fake reviews/astroturfing comments can hurt a brand quite severely. The riskiest situation of course is talking about a competitor's product (which we kind of are here).

It doesn't have to sound like a legal disclaimer, but it's nice to make clear that your relationship to the product may carry (even subconscious) bias.

Hey fhoffa, got a second for a GCP question? Am I calculating App Engine Flex pricing wrong? From what I can tell, it starts at ~$40/mo (1 vCPU, 512 MB RAM), which is 5x a comparable Heroku hobby dyno instance?

No you are not wrong, and we are working on it. It's going to be a lot more powerful (3.7 GB, 1 full core) than a hobby Dyno, but definitely overkill for testing.

Until then, here is my shameless plug: https://medium.com/google-cloud/three-simple-steps-to-save-c...

Thanks for the update! Very excited to get going on App Engine.

Don't they have a free tier for hobbyists?

There's one for App Engine Standard, but not for Flex.

(Disclaimer: I work at GCP.)

What bugs me most about App Engine Standard is that under Java I need a custom ThreadFactory when I want to use standard threads and call a Google API. I need to copy the Environment into every new thread I create. Can't they fix that somehow? (Currently, if you want to call App Engine APIs (com.google.appengine.api.*), you must call those APIs from a request thread or from a thread created using the ThreadManager API.) (https://cloud.google.com/appengine/docs/standard/java/runtim...)

Basically the line is even incorrect since I can use my own ThreadFactory if I'm on a Request/Backend Thread and clone the Environment and copy it into each Thread.

Really enjoying this price and feature war between Amazon and Google. Now who wants to drop their egress pricing to be reasonable?

Working on it! We released our new "Standard" networking tier last month and switched to src x dst pricing that saved some people ~10-20% depending on the src x dst pair.

Disclosure: I work on Google Cloud.

Are you ever going to be competitive with transit? I can buy gigabit from a number of providers for $400/month, which saturated the entire time is 330TB or $0.0013/GB. I can also see 10Gbps for $999, which is $0.00030/GB.

There are so many things I'd love to use Compute Engine for but storage is obscenely expensive for my needs and egress pricing makes it impossible to get data out after I send it in.

Although the difference isn't as big as you think: when you buy from an ISP, you typically pay for a router (and space/power to host it), a $250/mo cross-connect fee, two ports if you want reliability, and you have a >2x unit price increase just from running a port at <50% average utilization. But yeah, it's still cheaper if you have >1Gbps of traffic.
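The arithmetic in this exchange can be checked with a short sketch. It assumes a 30-day month and decimal gigabytes, and it ignores the router, cross-connect and second-port costs mentioned above; `gb_per_month` and `cost_per_gb` are hypothetical helpers.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def gb_per_month(gbps, utilization=1.0):
    """Gigabytes moved in a month on a port of `gbps` gigabits/s at an average utilization."""
    return gbps / 8 * SECONDS_PER_MONTH * utilization


def cost_per_gb(monthly_price, gbps, utilization=1.0):
    """Effective price per gigabyte for a flat-rate port."""
    return monthly_price / gb_per_month(gbps, utilization)


saturated_1g = cost_per_gb(400, 1)        # flat $400/month, port fully saturated
saturated_10g = cost_per_gb(999, 10)      # flat $999/month, port fully saturated
half_used_1g = cost_per_gb(400, 1, 0.5)   # same port at 50% average utilization
```

Under these assumptions a saturated gigabit port moves about 324 TB a month, which works out to roughly $0.0012/GB, close to the figure quoted. Halving the average utilization exactly doubles the effective per-GB price, which is the ">2x" effect described for ports running below 50%.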

One thing of note is now both Google Cloud and AWS charge at least one minute per instance (AWS used to charge a full hour, and GCP ten minutes).

Where'd you see 10 minutes? The very first sentence of the post says "one minute minimum."

He said it used to be 10 minutes.

(disclosure: The cake is a lie)

AWS wasn't first with per second billing. As the GCP blog states, GCP Persistent Disks had it since 2013.

(I work for Google)

AWS is miles ahead in IPv6 adoption though.

Why do GCP employees always post in HN articles about how much "better" GCP is than AWS or how GCP "did it first"?

Our company literally can't use GCP, full stop, because Google is completely ignoring IPv6 support. I heard load balancers are now in beta, which just terminates IPv6 and connects to the back end with IPv4.

When am I going to be able to make an outbound IPv6 connection from GCP? Specifically the Compute Engine VMs.

The IPv6 termination is now GA, but I agree with your sentiment (we're behind on IPv6). That said, it's a very binary thing: you either have IPv6 throughout the virtual networking stack or you don't. We're not ignoring it, particularly since Google cares so deeply about IPv6 worldwide, but we are behind.

Disclosure: I work on Google Cloud.

Edit: I'm not sure why you're being downvoted, it's a fair criticism.

Okay, sure, but those two things aren't related.

(Disclosure: I'm supposed to be working right now)

Can you ELI5 what we ppl who are deploying to any cloud should have as First Principles WRT IPv6 and why we should be architecting around it?

Disclosure: I like turtles.

Apple's App Store review has a hard requirement that everything works on a pure IPv6 network

The only thing that you have to do to satisfy this requirement is don’t hardcode IPv4 addresses in your app. If you are using domain names for everything then it doesn’t matter if those are only resolvable to v4 addresses.

Correct, using DNS on an IPv6-only network is enough, since the point is to use DNS64/NAT64 to handle the transition until services are IPv6-native.
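For anyone wondering why DNS names are enough, here is a rough sketch of the address synthesis a DNS64 resolver performs for a name that only has an IPv4 record. It uses the well-known 64:ff9b::/96 prefix from RFC 6052; a real network may use a different prefix, and `synthesize_aaaa` is a hypothetical helper name.

```python
import ipaddress

# Well-known NAT64 prefix from RFC 6052; real networks may use their own prefix.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(ipv4_text, prefix=WELL_KNOWN_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    v4 = ipaddress.IPv4Address(ipv4_text)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


# 192.0.2.33 is a documentation address, used here only as an example.
print(synthesize_aaaa("192.0.2.33"))  # 64:ff9b::c000:221
```

The app connects to the synthesized IPv6 address and the NAT64 gateway translates to the embedded IPv4 address. A hardcoded IPv4 literal never goes through DNS, so it never gets this treatment.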

Please unpack this comment...

You don't necessarily need individual VMs to be IPv6 addressable to do this. That is, since our global load balancer already supports IPv6 termination, your app's API or website or whatever will be served as IPv6.

Take a look at our announcement from last week for more info: https://cloudplatform.googleblog.com/2017/09/announcing-ipv6...

Disclosure: I work on Google Cloud (so I indirectly benefit from you using our services)


I actually enjoy the extra information they give, and the clear excitement about what they work on. I've seen how the sausage is made at AWS and the people I've met that worked on it were extremely detached. This very much shows in the product.

GCP is behind on features, but I find the features work very well and are super nice to work with.

Seems like parent merely stated a pretty uncontroversial fact. And they are part of this community as much as anyone else is -- I at least appreciate the stakeholder participation and honest disclosure.

> Seems like parent merely stated a pretty uncontroversial fact.

Exactly. Here's one scenario:

> "I think this is a really fun movie, and I thought the ending was very smart (disclosure, I did some FX on the sequences with the Shnorp, and I have an affair with the producer's sister)"

here's another:

> "You are both right, actually: the running time for the US release is 94 minutes, whereas the Swedish contains the Shnorp nipple scenes which increase the total length to 96 minutes. (full disclosure, I was a roadie on the set)"

Guess what I find obnoxious and see many discussions filled with?

If it's in a dictionary, thanks for the fact, no I don't care who you work for or how "excited" you are and how "awesome" your team is and whatnot. It's so incredibly needy and tacky and seems the norm now; where the person calling it out actually gets piled on, while BS like this https://news.ycombinator.com/item?id=15342721 is fine. It is what it is.

Circle the wagons and just clap it out.

This is not bragging. It's an important piece of information that tells you whether someone commenting might be biased or self-promoting in a news item that features their company.

We disclose our employer to be clear that while we are super excited about these releases, we're also trying to be as open and clear that our opinions come with a natural bias that others seem to rarely disclose.

There's no coordination or conniving to this, people just like rad stuff.

Also, I /wish/ I was a Compute Engineer.

I don't really have a problem with it. So long as they're on topic and relevant (i.e. not trying to sell GSuite in an AWS thread), it's fine. Certainly better than not hearing from them at all.

(disclosure: I also work for Google Cloud.)

I think putting the disclosure up front just makes it clearer that we are not bragging. You already know where our butts sit, and you can judge our words from a purely technical perspective.

Otherwise, people may say Google hires a bunch of ghost accounts just for promotional purposes.

I personally prefer the former.

I personally prefer to weave the fact that I work at Google right in the middle of the point that I am trying to make.

There are eleven words in your post to the left of 'Google', but thirteen to the right (like this post) so your post is lies.

my plot's been revealed!

Is it okay to have the comment section of a HN post be just one giant ad? The majority of the posts seem to be Google employees.

There is basically no discussion. Unless we are counting misleading comparisons between GCP products and their competitors as discussion. :-/

Disappointingly, there haven't been any material questions or comments (perhaps because it's straightforward). I think it would actually be inappropriate for us to have attempted to lead the conversation too much, so I prefer to respond to questions.

Disclosure: I work on Google Cloud.

I’m delighted their employees are active in the community. I wish more companies that are discussed here would engage with their audience.

This feels very much like copying the person who copied you. I am not sure that I like the lack of notice though (unless I missed something)

What is the issue with the lack of notice? While the net effect of this is pretty small for most use cases, it's only going to be cheaper, never more expensive?

(disclosure: he works for Amazon)


Genuine question: As far as I'm aware, there are no cases where moving to per-second billing would increase your bills, so what would be the point of providing notice?

He has Google stock and is concerned about cloud profits impacting next quarterly.

...why does it matter who copied who? The most common complaint about any provider is that they don't have something that a competitor does.

Also what lack of notice are you referring to? This blog post is the notice.

I assume he means that we enabled it immediately instead of charging more for a while...

Minimum 60 seconds = per minute billing.

Not per second billing.

Nice try, Google.

Must be downvoted by Google employees.

It's a valid criticism.


we are pleased to announce that we're extending per-second billing across our platform.

we were first to do it.

it's actually useless (but we were first!).

let me show you why: ...

but we are forced to do it because of <unnamed competitor>

bunch of ads of why we are cool and doing "real innovation", compared to that stupid other competitor that forced us to enable per second billing.

we are still better than competitor so..


(disclosure: I don't work for Google)

Your move AWS. Google improves to per-second with a 1 minute minimum.

They announced last week :).

Disclosure: I work on Google Cloud.

Yes I know about AWS per second, I was being somewhat playful. Huge GCP fan. Keep it up boulos and the rest of the GCP team.
