
Imagine for a moment that they fudged the numbers a bit on the storage capacity of every box going into your home. They advertise a usable capacity of 1 or 2TB. In reality, the box has 30-200% more storage that is not usable by the customer.

They would have just built a very fast, low-latency, distributed mesh CDN where the customer pays Google for the connectivity AND covers the electrical bill.

Isn't the point of a CDN to push content out close to consumers because the last mile "pipe" is significantly slower than a content-provider's network? If the consumer is already on the content-provider's network, and operating at the same speed as the backbone, you don't need a CDN.

The point of a CDN, at least a significant one, is to reduce latency: to put your content closer to your customers. It also lets you handle larger load, is more resilient to DDoS, etc., but, at least in my view, those are often secondary... the main issue is latency, and perhaps capacity.

Gigabit to your house won't remove latency, other than perhaps to apps hosted by Google (including Google itself, obviously).

A CDN also provides reduced latency and massive load balancing.

They don't really save anything on the connectivity, as everything would still be routed through Google, so it would use exactly the same amount of bandwidth as if the storage were sitting inside a data center.

As for electricity, the cost of packaging up and sending drive units to the customer + upkeep would dwarf the costs of electricity for the total lifetime of the drives.

I don't think this really makes sense.

The idea of a CDN is that you don't route everything through Google's data centers, you just pretend to and fetch from a nearby mirror. This really does save you bandwidth - a lot of bandwidth.

Having some of the mirrors you use for this be in people's homes is an interesting twist that I had not thought of.

Why stick the mirror in the customer's home, though, where you have to mirror it once per customer? It basically becomes the same as a browser cache then. The logical place for it would be the local switching office (or whatever the equivalent is for fiber technology) - no idea if that's how it actually works, but that's where my naive knows-little-about-networking-besides-the-TCP/IP-stack mind would put it.

The idea is that one customer's request could be served off of a mirror sitting at another customer's home.

The saving compared with putting it at a local switching office is that you don't need to buy a set of special machines and hard drives that sit at that office. Instead you're leveraging underutilized machines that people have already paid you for that are sitting at their houses.

As far as the rest of the network is concerned, the traffic doesn't need to go to them so they are happy.

That seems highly inefficient, though. Now the request has to go up to the switching office and back down to another customer's home, instead of just up to the switching office. Why not just stick the cache at the switching office and eliminate one of those hops, cutting out some latency too? Disk and even memory are cheap...hell, as Patrick's preso pointed out, one of the main reasons for this project is that bandwidth performance is not increasing nearly as fast as CPU, disk, and memory.

It is less efficient on time. But the speed-of-light delays back and forth over your local network are still much smaller than the time it takes to go anywhere interesting, like a data center. So it is still a win for a consumer. (Albeit less of one than having a separate set of equipment in your office just for caching stuff.)
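A rough back-of-the-envelope sketch of that argument. The distances are made up for illustration (2 km from a home to the switching office, 1000 km to a data center), and it only counts propagation delay in fiber at roughly two thirds of the speed of light, ignoring queuing and equipment delays:

```python
# Propagation delay only; light in fiber travels at roughly 2/3 c.
SPEED_IN_FIBER_KM_PER_S = 200_000.0


def round_trip_ms(one_way_km: float) -> float:
    """Round-trip propagation delay in milliseconds for a one-way distance."""
    return 2 * one_way_km / SPEED_IN_FIBER_KM_PER_S * 1000.0


# Hypothetical distances, chosen only to show the orders of magnitude.
HOME_TO_OFFICE_KM = 2.0
OFFICE_TO_DATACENTER_KM = 1000.0

# Cache at the switching office: home -> office and back.
office_cache_ms = round_trip_ms(HOME_TO_OFFICE_KM)

# Cache in a neighbor's box: home -> office -> neighbor and back.
neighbor_cache_ms = round_trip_ms(2 * HOME_TO_OFFICE_KM)

# No local cache: home -> office -> data center and back.
datacenter_ms = round_trip_ms(HOME_TO_OFFICE_KM + OFFICE_TO_DATACENTER_KM)

print(f"office cache:   {office_cache_ms:.3f} ms")
print(f"neighbor cache: {neighbor_cache_ms:.3f} ms")
print(f"data center:    {datacenter_ms:.3f} ms")
```

Under these assumptions the extra hop to a neighbor doubles a delay that is already tiny, while the trip to the data center is a couple of orders of magnitude longer.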

But that's not how fiber works. At some point, it still has to go back to a central office. You can't just connect to your neighbors directly, it all ultimately goes through a switch/router somewhere.

The point is that it doesn't save overall bandwidth used; it saves bandwidth on shared/contended resources. Say you have a switch with 10 1Gb ports[1] and one 1Gb uplink, and 4 of those ports are doing something intensive enough to saturate that uplink. If someone then requests, say, a full download of the Gmail client, the request could go strictly across the switch to one of the other 6 local Google boxes that has it cached, at lower latency and with less impact on other people than going through the uplink to the nearest cache.
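To put toy numbers on that, here is a minimal sketch. It assumes the uplink is shared equally among active flows, that a local port-to-port transfer runs at full port speed, and a hypothetical 50 MB download; none of these figures come from a real deployment:

```python
def fair_share_mbps(link_mbps: float, flows: int) -> float:
    """Per-flow rate if the link is split equally among the flows."""
    return link_mbps / flows


def transfer_seconds(size_mbytes: float, rate_mbps: float) -> float:
    """Time to move size_mbytes (megabytes) at rate_mbps (megabits/s)."""
    return size_mbytes * 8 / rate_mbps


UPLINK_MBPS = 1000.0   # the single 1Gb uplink
PORT_MBPS = 1000.0     # each 1Gb customer port
HEAVY_USERS = 4        # ports already saturating the uplink
OBJECT_MBYTES = 50.0   # hypothetical download size

# Each heavy user's share of the uplink before the new request arrives.
heavy_rate_before = fair_share_mbps(UPLINK_MBPS, HEAVY_USERS)

# Option 1: fetch over the uplink, which adds a fifth flow to it.
uplink_rate = fair_share_mbps(UPLINK_MBPS, HEAVY_USERS + 1)
uplink_time = transfer_seconds(OBJECT_MBYTES, uplink_rate)

# Option 2: fetch from a neighbor's box across the switch backplane,
# leaving the uplink and the heavy users untouched.
local_time = transfer_seconds(OBJECT_MBYTES, PORT_MBPS)

print(f"heavy users before: {heavy_rate_before:.0f} Mb/s each")
print(f"via uplink: {uplink_time:.1f} s, heavy users drop to {uplink_rate:.0f} Mb/s")
print(f"via local box: {local_time:.1f} s, heavy users unaffected")
```

The total number of bits moved is the same either way; what changes is whether they cross the contended uplink.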

Now, this could also be done in the switch closet, you are right. However, since this would also have to go through either the uplink, or every switch would need a port dedicated to a cache network/box, it would start getting expensive at switching points. Each would start looking like a mini (micro? nano?) data center. At that point, you could just eat that cost, or ask "what are alternatives that cost the same or less in capex and opex?" Perhaps with Google's network-fu, they have solved similar problems in data centers already, and said "we can use our caching/routing stuff here, and put a small capex increase on each customer box, which we also need no matter what, and decrease switching point capex, and since it is a simpler network, reduce opex too".

Essentially, it is a similar problem to the one BitTorrent solves, just at a different scale/locality. It also starts to look like solutions some vendors/ISPs looked into at one point for BitTorrent: instead of stopping BitTorrent, keep a map of local people seeding segments and reroute requests for those segments to the local network rather than across the uplink.
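A minimal sketch of what such a map might look like. Everything here (the `LocalSegmentIndex` class, the `route` helper, the box and segment names) is hypothetical and only illustrates the idea; it is not how any ISP or Google actually does it:

```python
from collections import defaultdict
from typing import Optional


class LocalSegmentIndex:
    """Tracks which local boxes hold a copy of which content segment."""

    def __init__(self) -> None:
        self._holders: defaultdict[str, set[str]] = defaultdict(set)

    def announce(self, box: str, segment_id: str) -> None:
        """Record that a box on the local network has this segment."""
        self._holders[segment_id].add(box)

    def withdraw(self, box: str, segment_id: str) -> None:
        """Record that a box no longer has this segment."""
        self._holders[segment_id].discard(box)

    def locate(self, requester: str, segment_id: str) -> Optional[str]:
        """Return a local box (other than the requester) holding the segment."""
        candidates = self._holders.get(segment_id, set()) - {requester}
        # min() just makes the choice deterministic for this example.
        return min(candidates) if candidates else None


def route(index: LocalSegmentIndex, requester: str, segment_id: str) -> str:
    """Prefer a local holder; otherwise fall back to the uplink."""
    box = index.locate(requester, segment_id)
    return f"local:{box}" if box is not None else "uplink"


index = LocalSegmentIndex()
index.announce("box-a", "seg-1")

print(route(index, "box-b", "seg-1"))  # served by box-a across the switch
print(route(index, "box-b", "seg-2"))  # nobody local has it, use the uplink
```

A real version would also need to deal with boxes going offline, stale entries, and picking among several holders by load.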

[1] assume a decent switch with a full mesh backplane. Also assume real switches will be used with real numbers, not my exemplary ones - the analysis will be the same, but the numbers will of course be different.

And with an uplink speed that equals your downlink, this peer-to-peer CDN becomes even more practical.

The thing is, the route to any "mirror" that sat in a person's home would be through Google's network. This wouldn't save them any money.

It does save them money, which is why every large ISP does it.

The fact that makes it work is that not all routes through Google's network are created equal.

Routes that go to and from data centers go a longer distance, through more pieces of equipment, and include busy backbones that you do not want to get overloaded. Routes that stay in a local neighborhood go a short distance and put load on one router which should be able to take it, and totally skip the critical backbone.

From the point of view of the network operator, going to a data center is slow and expensive. Keeping traffic inside a local neighborhood is fast and cheap. Thus they want as much traffic as possible to go the fast and cheap route.

CDNs cache data on local mirrors and route traffic to them whenever possible, because that is faster and cheaper than going all the way to a data center. Every large ISP does this, and it would be shocking if Google didn't follow suit.

But actually caching data on hardware that is sitting at customers' houses is an interesting twist.

Yes, and putting a CDN node where the fiber terminates seems much much simpler (they might already do it for TV feeds anyway).

Also, regarding the electricity, would that not be considered theft, if nothing else?

I really don't see how.

If they don't disclose that the box they gave you purportedly to record TV shows is actually also drawing power from your line and using it to run part of their datacenter? I would think that that could be considered theft by deception. It's certainly very shady.

I'll admit that the line is a bit blurry.

If they were to actually do this, they would of course disclose that fact. It would be written in small print, but still disclosed.

It's recording TV shows, it's just also serving them to your neighbors. It's the neighborly thing to do.

I don't think the fiber runs all the way to the TV box (someone correct me if I'm wrong). I have FiOS where I live and the way it works is Verizon installed an ONT (Optical Network Terminal) in my basement and then plugged the coax cabling into that box. It would be very expensive to rewire the home with fiber or even ethernet. Presumably if Google was building a CDN they wouldn't want to be reliant on standard wiring in their customers' homes. Also that sleek-looking TV box would probably have to be much larger if it doubled as an ONT (the one in my basement is huge). Seems a little impractical if my assumptions are correct.

Edit: I don't mean to dismiss your idea about the value of Google reserving some space on the disk for their own purposes. Cable & satellite operators already do that today. Technology exists for example that allows the operators to cache household-targeted TV ads on the disk. These deployments are still small scale but I think it's highly likely Google is thinking about such things as a way of monetizing their new network. If you're curious about this topic do a quick search on Google's investment in Invidi and on their partnership with Echostar.

That is an incredible idea.

That's a really interesting thought.
