I have 10ms ping to news.ycombinator.com, and 100ms ping to www.amazon.com. Yet time to first byte is 20% faster to www.amazon.com. What actually happens is that my PC connects to Cloudflare, which in turn connects to HN. This is an unnecessary step, and is highly over-rated.
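For anyone who wants to reproduce the numbers, here's a minimal sketch of measuring time-to-first-byte yourself, assuming plain HTTP on port 80 (HTTPS adds extra handshake round trips on top, and most real sites will just answer with a redirect to HTTPS, which still counts as a first byte):

```python
import socket
import time

def ttfb(host, port=80, path="/"):
    """Rough time-to-first-byte: TCP connect, send a minimal GET,
    then wait for the first byte of the response."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=10) as sock:
        connect_done = time.monotonic()
        request = (f"GET {path} HTTP/1.1\r\n"
                   f"Host: {host}\r\n"
                   "Connection: close\r\n\r\n")
        sock.sendall(request.encode())
        sock.recv(1)  # block until the first response byte arrives
        first_byte = time.monotonic()
    return connect_done - start, first_byte - start

for host in ("news.ycombinator.com", "www.amazon.com"):
    connect, total = ttfb(host)
    print(f"{host}: connect {connect*1000:.0f} ms, TTFB {total*1000:.0f} ms")
```

Ping measures one round trip; TTFB includes the connect plus the time the far end spends before answering, which is why the two can rank differently.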
It's needed because CDNs are presently 'protection rackets' for the Internet.
Instead of having a mechanism by which a website under attack (or simply under un-monitored heavy load) from a host or number of hosts can direct the ISPs of those hosts not to send it traffic, the CDNs instead use the present lopsided nature of peering agreements to simply sink the hits.
In the above-mentioned solution, either the direct ISP would filter out the requests before they hit a higher-tier ISP/backbone, or an ISP that does not provide said filtering would itself be filtered upstream (possibly by blacklisting the entire client ISP for sufficiently bad behavior).
> Instead of having a mechanism by which a website under attack ... from a host or number of hosts can direct the ISPs of those hosts to not send it traffic
How is this any different? You're putting the responsibility of being internet police on the ISPs, which have shown they either do not want that responsibility or will abuse it.
Not to mention, this would go squarely against the notion of ISPs as "dumb pipes".
I'd like to see ISPs doing better policing of their customers: disconnecting (and prosecuting) spammers, blocking users infected with malware until they get clean, and so on. I'd like a world in which I can reasonably expect to react to spam by tracking down where it came from and getting the spammer removed from the Internet.
abuse@ ought to function and produce a rapid response.
(That doesn't mean we shouldn't have anonymous, untrackable services as well, but using such services means you have to put up with spam too. When you have a service like email, with all the tracking information readily available in the headers, it shouldn't be as near-impossible as it currently is to get something done with that information.)
Spam comes from two sources. The first is large networks that tolerate spam, and those are irrelevant because their IP blocks are already on every blacklist in the world.
The second is compromised machines, which are the real problem because they're ephemeral. Spammers can compromise a hundred new machines a week. You can't fix them or block them faster than they compromise new ones. The only solution to that is to improve computer security so they don't get compromised to begin with, which has nothing to do with ISPs.
Removing people from the internet is never the answer, both because it's too broad (should the spammers not be able to go to the government's website to pay their taxes?), and because the nominal current source of the spam isn't actually where the spammer connects to the internet anyway.
The better solution if you can actually find a real life spammer is to impose a fine on them that exceeds the profits of spamming.
Wait, why can't you block spammers as fast as they are created? If your computer is compromised, it seems fine to disconnect it, or at least severely rate limit uploading until you fix the problem. It's the responsibility of the internet user to keep their computer in working order.
Really, if your computer is spamming someone, even if you aren't aware of it, you are harming them, and ignorance shouldn't be protection. Maybe fines and jail are a bit too severe, but rate limiting and possible disconnection are more than fair.
So if Joe Shmo's router is compromised, then the ISP turns it off. How then is Joe supposed to look up symptoms to learn that he's compromised, or download an antivirus to clean it up?
It's a pretty solution to the problem in theory, but when you actually apply it, it doesn't quite work out as well.
My ISP actually did once cut me off when one of the machines on the network was compromised by malware. I called up their support line and they told me what had happened, I fixed it, and all was well.
The ISP routes any HTTP request from you to a page that says "your computer has a virus, call your ISP support line to restore access". Of course, that in turn opens up the possibility of someone else taking over your modem and telling you to call this 900 number to fix the problem... :)
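The mechanics are simple enough to sketch. Here's a toy Python stand-in for that ISP-side warning page: every HTTP request gets the same notice regardless of what the customer was trying to reach (the real thing would be a DNS or routing redirect into a box like this, and the wording here is made up):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

WARNING_PAGE = b"""<html><body>
<h1>Your connection has been restricted</h1>
<p>A device on your network appears to be infected with malware.
Call your ISP's support line to restore full access.</p>
</body></html>"""

class CaptivePortalHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every path gets the same warning page, regardless of the
        # site the customer was actually trying to reach.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(WARNING_PAGE)))
        self.end_headers()
        self.wfile.write(WARNING_PAGE)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), CaptivePortalHandler).serve_forever()
```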
> Wait, why can't you block spammers as fast as they are created?
They can literally compromise new machines faster than you can identify existing compromised machines. Imposing anything on the machine owner accomplishes nothing -- as soon as you identify the machine you can blacklist it, and as soon as you blacklist it they stop using that one.
The problem is in the time it takes you to identify a compromised machine, they can compromise many more additional machines.
So you work from home and your 12yr old discovers SET and sends a few phishing emails to their friends, so now the ISP cuts you off from the Internet. Now what?
(1) There is no point in complete cutoff -- partial filtering should be enough. So if email spam is sent, outgoing SMTP port is blocked; if there is a virus spreading over random ports, all outgoing ports except http/https are blocked; if malware leaves spam on webpages, then all http/https connections are redirected to captive portal explaining how to fix.
(2) Same with spam blacklists: you get de-listed either automatically, if a few weeks pass without incident, or faster by manual request.
(3) It should not matter why bad traffic was sent. A much more realistic situation is: "So you work from home and your computer gets a trojan which starts to send phishing emails, so now the ISP cuts you off from the Internet. Now what?"
In either case the answer is: prove to the humans @ ISP that you are no longer a danger to the internet and ask them nicely to unblock you. If working from home is that critical, have a separate internet source (say a 3G modem which you activate with a phone call).
You call them and get it turned on again. Also, the shutoff doesn't have to be immediate; it could come after a few days of intense spam and a week of upload-constrained connectivity. In other words, you can find a sensible policy.
So you work from home and your 12yr old discovers a loaded revolver in the closet and shoots a few friends, so now the police come and arrest you. Now what?
> That's exactly what I'd like to see. But that does depend on cooperative ISPs.
How does that help? The ISP can tell you where the compromised machine is, but it will be a different compromised machine in an hour so that's no help. It can't tell you where the actual spammer is because they're behind five proxies in six countries.
Often it's more feasible to apply costs to a higher level of responsibility than the party directly responsible. E.g., sanctioning a country which, say, harbours pirates or highly-polluting industries.
There's some really interesting work (some of it recently from Cloudflare) on tying reputation to reliably anonymous traffic. Fair Anonymity and FAUST are the two approaches I'm familiar with, though AFAIU each is a very preliminary academic proof of concept, not anything production-ready.
There might be other ways of managing traffic, though it would require more pipe smarts.
Generally, I'm in favour of ASNs -- autonomous systems -- taking responsibility as you describe. After all, they present a single administrative boundary of control, and ought to take responsibility for their traffic.
The Routeviews Project (asn.routeviews.org) offers both DNS lookups and downloadable zonefiles for mapping individual IPs to ASNs.
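As a rough illustration of the DNS lookup side, here's a sketch using dnspython. These IP-to-ASN services take the IP's octets as labels under a lookup zone and answer with a TXT record containing the origin ASN and prefix; I don't recall offhand whether asn.routeviews.org wants the octets in forward or reversed order, so this tries both (check the Routeviews docs before relying on it):

```python
import dns.resolver  # pip install dnspython

def asn_lookup(ip: str, zone: str = "asn.routeviews.org"):
    octets = ip.split(".")
    # Try both forward and reversed octet orderings, since different
    # DNS-based ASN services use different conventions.
    for labels in (octets, list(reversed(octets))):
        name = ".".join(labels + [zone])
        try:
            answer = dns.resolver.resolve(name, "TXT")
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            continue
        # TXT strings typically look something like: "15169" "8.8.8.0" "24"
        return [s.decode() for rr in answer for s in rr.strings]
    return None

print(asn_lookup("8.8.8.8"))
```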
Of late, the problem I'm finding is that far too many services are hosted on Amazon, though firewalling off a wide swath of a few AWS AZs might be an interesting approach.
One of the major issues of malware/spam/botnets is a direct result of the hardware ISPs provide to their customers. They should at minimum be taking on the responsibility of security-upgrading their rootable hardware. If abuse@ could be triggered by honeypots that quickly identify hacked routers, and ISPs would replace/upgrade them accordingly, at least one ugly sliver of the IoT would be addressed.
Netflow data is extremely helpful, although not on its own (it will also identify legitimate customers running their own mail servers). It's a good start, though.
Analogies fail in this area because the INTER-network part of the Internet is not similar to real world examples.
In the real world making large volumes of things appear at one specific location takes effort. The real world is also fairly good about having ways to track down and prosecute those who are unusually heinous about some activity. There are costs, resources, and attachments of presence involved.
In this context the closest example would be the PSTN (Public Switched Telephone Network). That analogy still fails because while the cost is low there is still SOME cost, and a lot of documentation about who might be 'making someone's phone blow up'. However, from the perspective of the solution it still holds true.
On many hard-phone-lines it is possible to dial a number sequence to block calls from the number which previously called the phone. In this case the provider hired by the victim will deny the connection request before it ever reaches the victim.
Something similar can currently be done by 'blackholing' the victim, but this is victim blaming and allows the attacker to win. What is necessary is to instead push the blocking far enough back up the chain that the victim sees no bill and is completely insulated from the actions of bad actors. Logically, this means blocking it at the backbone level or lower, and still charging the sending party for the data.
Postal-service analogues suggest themselves. Mailstorms (chain letters and the like) were enough of a problem that there are specific statutes against them.
When you do have cooperative ISPs filtering DDoS traffic, it's usually a null route, so the IP under attack goes offline, but without taking the rest of the customers on your ISP offline.
That's not quite how routing works -- null routing the attackers affects RETURN traffic to the attackers. By then the damage is done; the target host is already overwhelmed with attack traffic.
Picking Amazon isn't really a fair comparison. It's Amazon, the people behind AWS, the same people who say that "every 100ms of latency costs them 1% of profit." Of course amazon.com and its servers are ultra-optimized to make everything as fast as possible.
A fair comparison would be Hacker News with Cloudflare vs. Hacker News without it.
Cloudflare does not cache HTML pages. Images might load faster, but the time to first byte for the main HTML document is still much higher with Cloudflare.
There is actually weird magic around keeping TCP sessions open, so you could be faster if you're close to a Cloudflare node and distant from the server, too, even if the content isn't cached.
The issue with CloudFlare here is that they never really cared to be a good "normal CDN", which would make this performance boost possible by using a combination of intelligent caching and an extremely large number of POPs using good trunks; see Akamai or CDNetworks for typical examples of trying to actually play the game of a normal CDN.
Instead, CloudFlare relies on crazy content transforms that started as essentially "what if we run mod_pagespeed for you in the cloud as a service" and then expanded in scope from there. I mean, even when they cache things, as I understand it their cache hit ratio is extremely poor (whether due to restrictive cache sizes or limited retention windows, I don't even venture to guess).
This means that if your content isn't inherently slow (using poorly compressed images and tons of fragmented JavaScript and CSS files with broken ETags being consumed by HTML with script tags in "the wrong places", none of which is minified), CloudFlare is going to make your site slower.
Really: the two things they ended up getting stuck in the ecosystem for were 1) being free or at a fixed low price for supposedly infinite bandwidth (though people report that when you actually use a ton, CloudFlare claims you are being attacked and need to upgrade to their protection racket, which leads us to), and 2) performing DoS attack protection by using a ton of annoying heuristics and IP filters and JavaScript delays and even captchas that you should notice no "real" site behind a normal CDN ever seems to need, and yet somehow is a hallmark of the experience of "this site is using CloudFlare".
I've done a fair bit of testing, and while their cache rate wasn't as high as Akamai's, it wasn't as bad as you're making it sound: something like 50% vs. 60%.
One of the challenges here is popularity: if you're a major site, yes, Akamai will have content cached at most of those 140k POPs, but sites with different blends of cacheability and global traffic distribution might see a better cache hit ratio with fewer POPs, since things stay hot in the cache longer.
These are complicated services, so the results are going to vary widely depending on what you need (e.g. having an optimized site means that every optimizer service I've tried made performance worse), and how much you can adjust your application to work with the CDN and tech changes like HTTP/2. The other reason why blanket statements aren't worth the time needed to make them is that these are not free services, and you can easily find cases where service A is better than service B but not enough so to justify the price.
(As for the DoS tools, note that that's pronounced “feature” by most people, and at least personally I never see them except when testing through a commercial hosting company's network. Tor users whine a lot about them, but that's just entitlement and unwillingness to think critically about what a site owner sees from their peers.)
FWIW, I'm used to seeing >95% from working with CDNetworks (and I believe that's even in their literature somewhere), which has a kind of (to borrow a term from a different kind of service) "supernode"-like model for their cache layer, proactively distributes cached resources, and does revalidation as a separate path from fetches.
As for their DoS "features", yes: they have a bunch of features because it makes them look like they are doing something of value. But I am going to again remind you that somehow you never see websites that are using normal CDNs suddenly give up and show you an Akamai-branded captcha, and yet somehow they do not crumble under the weight of denial of service attacks. It just isn't the case that using CloudFlare is the only way to be safe from DDoS, and there's something weird about how the experience of using a CloudFlare website is somehow noticeable.
Wait, are you arguing that using a CDN is overrated? For everything, or just for HN?
A CDN is extremely valuable for a website that has cacheable content, since it allows a site to scale dynamically with increased load. This is the same reason people use AWS or any cloud provider; a website has to have the capacity to serve their highest peak traffic, but it is needlessly expensive to maintain that much infrastructure around the clock.
Helps a ton if you're terminating SSL at the edge, due to the number of roundtrips you need before the server can even start generating that first byte.
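Rough back-of-the-envelope, using the 10ms/100ms figures from the top of the thread and assuming a classic TLS 1.2 full handshake (1 RTT for TCP, 2 for TLS, 1 for the request itself; TLS 1.3 and session resumption shave this down):

```python
# Round trips a client needs before the first response byte can arrive.
TCP_RTTS = 1   # SYN / SYN-ACK
TLS_RTTS = 2   # classic TLS 1.2 full handshake (TLS 1.3 needs only 1)
REQ_RTTS = 1   # the HTTP request/response itself

def ttfb_estimate(rtt_ms):
    return (TCP_RTTS + TLS_RTTS + REQ_RTTS) * rtt_ms

# Terminating TLS at a 100 ms-away origin vs. at a 10 ms-away edge node
# that keeps a warm connection back to the origin, so only the request
# itself crosses the long path.
origin_only = ttfb_estimate(100)        # ~400 ms
edge = ttfb_estimate(10) + 100          # ~140 ms
print(origin_only, edge)
```

That gap is roughly what the top comment is seeing, even before any caching enters the picture.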
> This is an unnecessary step, and is highly over-rated
Given the volume of traffic HN sees, and given their desire to not be DDoS'ed, it seems to be a necessary step that is not over-rated.
To be clear, you'd see similar results with any CDN sitting in front of HN... The benefits really start to show (from an end-user's perspective) when you're on poor connections or are far away from their (HN's, in this case) server/cluster. HN sees immediate benefits for the things above, plus it saves them a lot of bandwidth.
Yeah, absolutely this. They spin everything they do as some kind of heroic "for the people!" decision even when it's just about cutting costs or not having to solve "hard" problems. One example is DNS "ANY" queries: Cloudflare just decided to toss standards out because they aren't up to conforming to them. As far as I'm concerned, this Cloudbleed thing is karma, and nobody should believe anything Cloudflare says about itself.
To be honest though, ANY is mostly used for reflection/amplification attacks or to scrape domain records. Sure, there are legitimate uses for it, but I can't think of many that need to happen over the internet vs. only allowing it from specific trusted parties. And yes, I'm aware that it can be useful as a debugging tool. There are also ISP recursors that don't allow ANY queries through for similar reasons, so relying on it will cause trouble for some, and there are other broken implementations in the wild. Though ANY can be useful, it shouldn't be assumed that it'll work.
What I'm thinking they could've done in the case of ANY is to respond with a truncated answer and switch to TCP, at which point it becomes much harder to do the spoofing dance and (ab)use ANY as a DNS amplification attack to DDoS a target. Unfortunately that does put an extra cost on the DNS server, which might also be undesirable. However, that would've probably been a worthwhile tradeoff.
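For concreteness, here's roughly what the TC-bit idea looks like, as a sketch with dnspython. The server answers ANY-over-UDP with an empty, truncated response, so legitimate resolvers retry over TCP (where source spoofing is impractical) while spoofed-source amplification gets nothing useful. Not production code; the bind address and port are placeholders:

```python
import socket
import dns.exception
import dns.flags
import dns.message
import dns.rdatatype

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5353))  # placeholder port; a real server listens on 53

while True:
    wire, addr = sock.recvfrom(512)
    try:
        query = dns.message.from_wire(wire)
    except dns.exception.DNSException:
        continue  # ignore malformed packets
    response = dns.message.make_response(query)
    if query.question and query.question[0].rdtype == dns.rdatatype.ANY:
        # Empty answer with the TC bit set: real clients retry over TCP,
        # and the tiny UDP reply is useless for amplification.
        response.flags |= dns.flags.TC
    # (A real server would populate answers for non-ANY queries here.)
    sock.sendto(response.to_wire(), addr)
```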
Well, two out of the three people proposing that document work for Cloudflare. The other works for Dyn. It's clear what their motivation is. And this is not the first time they've filed a document just like that, and it will expire this year just like it has in the past. That document is bullshit. It's trying to change the standard now, after they've already broken their DNS implementations.
Your comment doesn't address ANY of the arguments you're replying to. Are ANY queries typically used outside of attacks, scraping, or diagnosis? Is there a reason they need to be served over UDP?
> Well, two out of the three people proposing that document work for Cloudflare. The other works for Dyn.
Two companies that have some experience with dealing with DNS attacks. Their motivation may be self-serving, but they're not the ones doing the attacking.
Lovely bit of debugging in this article, I really enjoyed it! Somehow a task that would be grueling to do myself is so much more enjoyable when read about.
We need to switch dumb pipes to be dumb content-addressable p2p pipes with maidsafe, ipfs, dat, or anything like that, and most of the problems that CDNs are trying to solve would disappear.
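The core idea is simple: content is fetched by the hash of its bytes rather than by the server it happens to live on, so any peer holding the bytes can serve them and the client can verify what it got. A toy Python version of content addressing (the real systems add chunking, DHTs, and multihash encodings on top):

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: the key IS the SHA-256 of the data."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Any untrusted peer could have supplied this; verify it matches.
        assert hashlib.sha256(data).hexdigest() == address
        return data

store = ContentStore()
addr = store.put(b"hello, dumb content-addressable pipes")
print(addr, store.get(addr))
```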
So, there was (I hope not "is"... I am having a difficult time remembering the sequence of events involving this bug and what I might have done on the WebKit side) definitely a bug in the browser. But just to be explicit: there was also a bug in the JavaScript (the array returned as an object with no length, causing the code to loop), and managing to deploy a change that DoS'ed tons of deployed browsers, and then not noticing and not really caring afterwards, was extremely careless.