Hacker News

Shower thought: what if html/http/browsers supported, as a primitive, the concept of "fetch this asset from url A, or if that doesn't work, B, or if that doesn't work, C ..."?

If video is being served via HLS (which it probably is in 2018), then the manifests support redundant streams, where multiple hosts can be specified for each stream. [0]

hls.js supports this, as do many other clients. IME it works nicely for providing some client-side switching in case one of your hosts/CDNs goes down.

[0] https://developer.apple.com/library/archive/documentation/Ne...

Both HLS and DASH support redundant streams (a redundant variant URL in the HLS playlist, multiple BaseURL elements in the DASH manifest). It's indeed the simplest way to get a client-side fallback. If you use it, make sure that the player supports it and that the retry mechanisms are correctly configured (for instance, all the MaxRetry config params in hls.js: https://github.com/video-dev/hls.js/blob/master/docs/API.md#... )
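For example, a master playlist with a redundant variant looks roughly like this (hostnames are illustrative): the client moves on to the next variant with the same BANDWIDTH if the first one fails.

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1280000,RESOLUTION=640x360
https://cdn-a.example.com/360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1280000,RESOLUTION=640x360
https://cdn-b.example.com/360p/index.m3u8
```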

"If that doesn't work" isn't the problem.

As a silly limiting example, imagine that you host Netflix on your dial-up connection as url A.

It works.

Oh, okay, right, let's set a timeout then, if it takes more than 1 second to load, we try url B.

That works, but now we've got a 1 second delay on everything. Okay, we'll update the default to be url B.

Conditions are changing all the time as a result of bottlenecks in the infrastructure moving about.

What I think you'd actually need to do is something like this - initially, fetch from multiple endpoints simultaneously with an early-cancel (so you don't waste bandwidth on the slower ones).

For N seconds you just use the fastest one (perhaps with an 'if it doesn't work' mechanism, sure).

Every N seconds you re-evaluate the fastest endpoint using the multi-fetch.

And so on and so forth.

There are better algorithms, this is back of the envelope stuff.
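The initial multi-fetch with early-cancel could be sketched like this, assuming endpoint functions that accept an abort signal (the names are illustrative):

```javascript
// Race several endpoints; keep the first response, cancel the rest.
// Each entry in `endpoints` is a function (signal) => Promise<result>,
// e.g. (signal) => fetch(url, { signal }) in a browser.
async function fetchFastest(endpoints) {
  const controllers = endpoints.map(() => new AbortController());
  try {
    // Promise.any resolves with the first endpoint to succeed,
    // tolerating failures from the others.
    return await Promise.any(
      endpoints.map((ep, i) =>
        ep(controllers[i].signal).then((value) => ({ index: i, value }))
      )
    );
  } finally {
    // Early-cancel: abort everything still in flight so we don't
    // keep downloading from the slower endpoints.
    controllers.forEach((c) => c.abort());
  }
}
```

Re-running this every N seconds, and using the winning index for ordinary requests in between, gives the periodic re-evaluation described above.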

Your solution to bandwidth congestion is for everyone to use 3x+ more bandwidth than they need?

Is this a bot?

Firstly, I'm not solving anything. I'm explaining why fallback URLs are not equivalent to CDNs.

You don't use a CDN because your site doesn't work, you use it because it's faster.

Secondly, no, doing an occasional speed test, using data you'd be downloading anyway, then selecting an endpoint between speedtests does not increase bandwidth usage by 3x.


Some people use CDNs as regular static-site web hosts, or hosts of an SPA client JS blob when they have an otherwise-"serverless" architecture. CDNs are not always about serving large media assets.

They're not consuming 3x bandwidth if they're bailing out after downloading less than a kilobyte of a video that's tens or hundreds of megabytes in size.

That extra bandwidth is a rounding error in the grand scheme of things.

It could be important, though, for the client to signal the server to close the connection. Theoretically the connection would drop after several seconds and the server would stop transmitting, but I could imagine some middleware cheerfully downloading the whole stream and throwing it away.

The problem with this approach is that you're only considering time to first byte. That's part of the equation, especially for smaller files like scripts, but for larger files like video segments, throughput matters more. If you only wait for 1 kB to download, you're essentially measuring time to first byte.

Also, the instruction to stop the download is not instantaneous: by the time you realize on the client side that you have downloaded 1 kB, the server might already have sent the whole video segment on the other side. So this is not the way to go if you want to optimize for congestion.

For video you can fetch different chunks from different endpoints simultaneously, not the same chunk, therefore not wasting bandwidth at all.

This is more or less what we do with our client-side switching solution at Streamroot: we first make sure the user has enough video segments in their buffer, and then we try to get the next video segments from different CDNs, so we're able to compare the download speeds, latency and RTT between the different CDNs without adding any overhead. You don't necessarily download the segments at the same time, but with some estimation and smoothing algorithms you're able to compute meaningful scores for each CDN. The problems here are very close to the problem of bandwidth estimation for HTTP adaptive bitrate streaming formats like HLS & DASH: you have an unstable network, and you can only estimate the bandwidth from discrete segment measurements. [0]
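One common smoothing approach (a sketch of the general idea, not Streamroot's actual algorithm) is an exponentially weighted moving average of throughput per CDN, fed by discrete segment measurements:

```javascript
// Per-CDN throughput score as an exponentially weighted moving average.
// alpha controls how quickly the score reacts to new measurements.
class CdnThroughput {
  constructor(alpha = 0.3) {
    this.alpha = alpha;
    this.estimate = null; // kilobits per second
  }
  // Record one segment download: size in bytes, duration in milliseconds.
  addSample(bytes, millis) {
    const kbps = (bytes * 8) / millis; // bits per ms == kilobits per second
    this.estimate =
      this.estimate === null
        ? kbps
        : this.alpha * kbps + (1 - this.alpha) * this.estimate;
    return this.estimate;
  }
}
```

A player can then keep one such score per CDN and prefer the highest, while occasionally sampling the others to keep their scores fresh.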

If you want to do it at the sub-asset level (a video segment, image or JS file), it's possible with byte-range requests (ask for bytes 0-100 from CDN A and 101-200 from CDN B), but then you still add some overhead for establishing the TCP connections, and since you need the whole asset before you can use it, you end up limiting the download speed to the slower of the two.
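For illustration, splitting an asset across CDNs could look like this (splitRanges is a hypothetical helper; the Range header itself is standard HTTP):

```javascript
// Split a totalBytes-long asset into `parts` contiguous byte ranges,
// one per CDN. Returns inclusive [start, end] pairs, as Range headers use.
function splitRanges(totalBytes, parts) {
  const size = Math.ceil(totalBytes / parts);
  const ranges = [];
  for (let start = 0; start < totalBytes; start += size) {
    ranges.push([start, Math.min(start + size, totalBytes) - 1]);
  }
  return ranges;
}

// Each range then becomes a request like
//   fetch(url, { headers: { Range: `bytes=${start}-${end}` } })
// against a different CDN, with the pieces concatenated afterwards.
```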


I think that could be very beneficial. If it were a built-in feature of HTTP (or at a broader level, maybe TCP/IP), it would not only save people the hassle of reinventing the wheel, it would also be easier to ensure it's on by default for all static resources and thus get the benefit across the board.

Perhaps it could be done in a flexible, extensible way as well. Create a limited language (no loops or dangerous stuff) to express policy, search order, etc. And design it so the client side doesn't necessarily have carte blanche and the server side can maintain some control if necessary.
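Purely as a sketch of what such a declarative policy might look like (this format is entirely invented for illustration):

```
{
  "sources": [
    { "url": "https://cdn-a.example.com/app.js", "priority": 1 },
    { "url": "https://cdn-b.example.com/app.js", "priority": 2 }
  ],
  "timeoutMs": 1000,
  "integrity": "sha256-..."
}
```

Keeping it data-only (no loops, no expressions) is what makes it safe for servers to hand to arbitrary clients.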

Internal browser support for local caching based on a hash versus "where it came from" would be helpful as well.

Yes, that would be great. Imagine a git-like web, where browsers could fetch just the changed chunks of a cached file when there are changes on the server side.

Doesn't work for streaming live events (pay-per-view), which is the main use case for multi-CDN.

Their list of redundancy, agility, and cost doesn't seem exclusive to video, though it's perhaps more compelling there given the time-sensitive nature and the amount of bandwidth involved.

The problem is that that leaks information about your viewing habits on one site to another.

Can't you do that with DNS records, where there are multiple IPs on an A record?

Basically, we're already doing this for fault tolerance and load balancing within a single CDN. Except that currently we randomize the IPs. To enforce priorities, you'd want the IPs in the A record at least partially ordered by provider.
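In zone-file terms that's just multiple A records for one name (addresses here are from the documentation ranges):

```
cdn.example.com.  300  IN  A  192.0.2.10     ; provider 1
cdn.example.com.  300  IN  A  198.51.100.20  ; provider 2
```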

Multiple IPs on an A record work to some extent; most (many?) browsers will silently retry another IP from the list if some of the IPs don't accept a connection. I don't know if they'll try another IP on timeout, though.

But you can't actually expect any ordering to make it through to the client. Your authoritative server may reorder the records, their recursive server may reorder the records, and the client resolution library may also reorder the records. There's actually an RFC advocating reordering records in client libraries; it's fairly misguided, but it exists in the wild. Reordering is also likely to happen in OS dns caches where those are used.

For reference, RFC 3484 [1] is the misguided RFC that tells people they should sort their DNS responses to maximize the common prefix between the source and destination address. This is probably helpful when the common prefix is meaningful, but when numbering agencies give out neighboring /24's to unrelated networks, and related networks often have IPs widely distributed across the overall address space, it's not actually useful.

[1] https://www.ietf.org/rfc/rfc3484.txt

Thanks for clarifying!

That's not quite what an anycast to those addresses does. It's more like an approximation of the nearest server.

They do for some limited items, e.g. <object> does nested fallback.

That's not sufficient for something like CDN selection though, you want a fallback in case of failure but you first want to select based on various criteria.

Combine with SRI and some convention to just ask one of several hosts for it based on the hash(es), and we have content-addressable loading.
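A sketch of what that convention might look like in a browser, using the Web Crypto API to verify an SRI-style SHA-256 digest (fetchByHash and the host list are hypothetical):

```javascript
// Try each host in turn; accept the first response whose SHA-256
// digest matches the expected one. The content is what's addressed,
// so which host happened to serve it doesn't matter.
async function fetchByHash(urls, expectedSha256Hex) {
  for (const url of urls) {
    try {
      const buf = await (await fetch(url)).arrayBuffer();
      const digest = await crypto.subtle.digest('SHA-256', buf);
      const hex = [...new Uint8Array(digest)]
        .map((b) => b.toString(16).padStart(2, '0'))
        .join('');
      if (hex === expectedSha256Hex) return buf;
    } catch (_) {
      // Network error: fall through and try the next host.
    }
  }
  throw new Error('no host served content matching the hash');
}
```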


IPFS would be a particular implementation of those more general concepts. SRI + multiple HTTP sources would be another more incremental approach.

I'm still waiting for a browser to figure out I mean "com" when I typed "cim," and you're thinking CDN retries would work?

Are you thinking of something like BitTorrent? Why ask for the whole file from a list of hosts when you could ask for any bit of the file they might have?

Then people would create browser plugins or greasemonkey scripts to always optimise for things the viewer cares about (time to start, likelihood of switching providers mid-stream, likelihood of getting full resolution for the longest subset of the video, ...) and disregard the prioritisation set by the provider (which might care about costs, which depend on contracted minimums, overage tariffs etc.).

Then providers would need to combat this by dropping the most expensive CDNs, causing a race to the bottom in which everyone loses: users have worse streaming experience, providers lose customers, good CDNs make less money, margins for bad CDNs are squeezed.

The number of people who would install that kind of add-on is so small that it would have essentially zero effect on the provider's costs …
