The internet is held together with spit and baling wire (krebsonsecurity.com)
374 points by picture 56 days ago | 164 comments

It's Anti-Fragile. If it breaks all the time, everybody is highly experienced at patching together new workarounds, mechanisms for failover are in place and regularly tested, and there are whole classes of corner-case bugs that get flushed out to be stomped (or nurtured as cherished pets) instead of breeding in the dark and jumping out at you all at once.

How can the "Internet routes around failure" be trusted without testing? Everything needs regular exercise or it atrophies.

"The internet routes around failure" hasn't been true for a long time. It refers to the original topography which has been replaced with a hub and spoke model. Remove a few hubs and you have disabled a large portion of the internet.

Back in the 90s when I got on, so much of the traffic was exchanged at MAE-West or MAE-East, and a backhoe in Iowa could make nearly all the cross-US traffic go through Europe and Asia instead.

These days, there are lively public internet exchanges up and down both coasts, in Texas and Chicago and elsewhere. A well-placed backhoe can still make a big mess, many 'redundant fibers' are in the same strand, and the last mile is fragile, but if my ISP's network is up to the local internet exchange, there are many distinct routes to the other coast and a fiber cut is unlikely to route my traffic around the world.

I wonder if I can nerdsnipe anyone into figuring out the fewest BGP hijacks it would take to force traffic on the Internet from the West Coast of the US to take the long way around to the East Coast of the US. Never mind the latency, that pipe isn't big enough for all of that traffic, so it'll effectively be a netsplit.

I bet it's a lower number than anyone's actually comfortable with, though it would be rather difficult to pull off.

I doubt that you could force all such traffic to be so routed, given that there are coast-to-coast transit networks that would not be affected by a BGP hijack (IGP routes are often preferred to EGP routes). In all likelihood the situation would not last very long as network operators would quickly respond to the anomalous traffic pattern and deploy various route filters to correct the problem.

> Never mind the latency, that pipe isn't big enough for all of that traffic, so it'll effectively be a netsplit.

We've seen this one before... BGP doesn't (usually) look at link utilization, just link status and maybe that the BGP connection doesn't time out, but even a massively overloaded connection tends to have room for BGP pings, so it's less netsplit and more blackhole.

I don't think BGP hijacks are really the way to achieve this one.
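For anyone wondering why a hijack can redirect traffic without touching link status at all: routers forward on the most specific matching prefix, so announcing a longer prefix captures the traffic outright. A minimal sketch of longest-prefix matching, using documentation prefixes and hypothetical AS numbers:

```python
import ipaddress

def best_route(dst, routes):
    """Pick the route with the longest (most specific) matching prefix."""
    dst = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(net), origin)
               for net, origin in routes
               if dst in ipaddress.ip_network(net)]
    return max(matches, key=lambda m: m[0].prefixlen)

routes = [
    ("203.0.113.0/24", "AS64500 (legitimate origin)"),
    ("203.0.113.0/25", "AS64666 (hijacker, more specific)"),
]
net, origin = best_route("203.0.113.10", routes)
print(net, origin)  # the /25 hijack wins despite the legitimate /24
```

This is also why sub-prefix hijacks are nastier than same-length ones: the more specific route wins everywhere it propagates, regardless of AS-path length.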

God, this would happen all the time in the mid-90s. You joke about a backhoe in Iowa, but I'm pretty certain it was a backhoe in Iowa one time. Then the whole of the UK's traffic to the USA West Coast would be routed through an ISDN connection in Korea for three days. Your goat porn downloads from Usenet would drop from their high of 0.5Kb/sec and you'd be left with your dick in your hand. Or something.

Kids today will never know how we suffered in the Before Times

boobs.jpg loading one line at a time and then greying out completely before reaching the bottom

Good thing I'm into legs... They loaded first.

only if it was BMP

This is where you'd choose the .gif version hoping it was interlaced.

Ofc they will, people are recording the history of porn downlo... err, the internet just as diligently as Napoleon's conquests!

They will never experience buying a bootleg disk for gif porn.

edit: disk is for diskette

This is why I always pre-download pr0n well-ahead of jack0ff time.

Call it BBP Protocol... "Blue Ball Prevention"

This happened with regular phone service as recently as 2017 in Canada (1).

(1) https://atlantic.ctvnews.ca/mobile/many-atlantic-canadians-l...

This is still an issue AFAIK, a disproportionate chunk of internet traffic goes through AWS-East...

> Remove a few hubs and you have disabled a large portion of the internet.

"A few hubs"? Just Hurricane Electric has presence in many, many IXPs, and they're not even the largest transit provider:

* https://bgp.he.net/AS6939#_ix

AT&T, the largest transit provider, has a ridiculous number of peers:

* https://bgp.he.net/AS7018#_peers

And there are several different major global providers:

* https://en.wikipedia.org/wiki/Tier_1_network

A lot of the Big Tech companies are also building their own private fibre networks so they don't even have to worry about sharing infrastructure with the telcos:

* https://www.submarinecablemap.com

Knock out this set of buildings in London and you can kiss the entirety of the UK offline; most of Europe won't recover either: https://en.wikipedia.org/wiki/Telehouse_Europe

It took exactly one (1) spanning tree misconfiguration on a single BlackDiamond to cripple LINX for hours, a couple of decades ago.

Source: I was sitting next to the person who fixed it.

You’re wrong. If say One Wilshire[1] or one of the very few other carrier hotels in its class abruptly ceased existing the Internet would be wrecked and rebuilding would be a matter of months at the very best. It doesn’t matter how many peers a big telco has when the supermajority of backbone peerings are in a handful of buildings.

[1] https://one-wilshire.com/

I used to have a security badge to get in there back in 2003 or so. When I was shown around someone said they could cause a minor recession in 5 minutes with an axe. Also they had me grab a fibre that carried all of Japan’s internet traffic. If you pulled that it would route everything the other way around the world through Europe and the east coast.

I had a long lost picture when they were doing some street construction outside and marked all the buried cables with spray paint. Lines going everywhere…

First off, I'm on the east coast-ish, so I'm not sure how much traffic I get from APAC.

Next, their front page says:

> Carrier-Neutral with Access to 200+ Carriers

You're telling me these 200+ carriers have no other POPs?

Going to their "Connectivity" page, they show four trans-Pacific cables that land there:

* https://one-wilshire.com/connectivity/

Meanwhile there are a whole bunch of other trans-Pacific cables landing on the US West Coast:

* https://www.submarinecablemap.com

As someone who lives in Toronto, I certainly worry about how concentrated TORIX is, but even it has three physical locations (IIRC), and most folks connect elsewhere as well from the spelunking I've done in BGP.

So if one of these major carrier hotels does have a problem, I don't doubt that there would be repercussions, but I'm not worried about "the Internet" as a whole.

...maybe. Pretty sure the "wreck" would be local to the region around LA, and maybe a few of the countries on the other end of those lines. The immediate effect of an unexpected outage (e.g. the physical destruction of the building) would probably be a lot of rerouted traffic overwhelming systems "nearby" (in the topology of the Internet) for a few hours. It would not take months to recover even in the absolute worst case.

As a network person who's worked at multiple major global transit networks: lol no.

That building gives me anxiety attacks; just thinking about how vulnerable it is.

Not the whole internet, but a whole country for sure: google "Las Toninas", a beachside town in Argentina. Someone over there frequently trips over the wire and it's goodbye, rest of the world.

I didn't RTFM but I could see the argument that the SPOFs of the Internet are AWS, Cloudflare and Google, not the cables and routers in an IXP.

You already listed 3 single points of failure. I get what you mean: the entry-point search engine, the global cache, and the global backend/storage/compute.

But look, I live in Hong Kong, and I don't feel that way: there are other backends, we could survive without the caching for a while, and Google is forbidden for 1.4bn people who get on with it very well...

Depends what you call the internet. Yes, Facebook and WhatsApp are gone the minute one of those 3 companies screws up.

OK, you think 1.4 billion people without the freedom to search is ok? LOL.

I wouldn't say Google is search, per se. I haven't used Google for search in a few years.

If Google search went away for more than 24 hours, Bing/DDG/others would quickly get overwhelmed as people learned about alternatives and switched.

Bing might be able to auto-scale using Azure infrastructure but there might not be enough hardware even there.

WWW != Internet

Yes, for the majority of users the two may be the same, but "The Internet" is the IPv4 and IPv6 network on which the web runs. The equivalent companies to what you named are the transit network operators, companies like Hurricane Electric, AT&T, CenturyLink, etc. The list of transit networks is very long, much longer than the list of major web hosting companies.

Now, that said, it is absolutely true that there are ways where major outages can be the result of just one failure. Last year there was a major outage on the US east coast after a Verizon fiber optic line was severed in NYC; it turns out that this line carried many supposedly redundant links, all bundled into a single cable. The failure of that line then caused a cascade of failures as rerouted traffic began overwhelming other systems, but in the end the outage was contained in a relatively small geographic region, albeit one with many Internet users.

The article is about BGP and Internet Routing Registry (IRR): routing.

Those are not single points.

back in the 90s all the Internet traffic in Italy was routed around two big hubs, hosted in two public universities (it was mainly one, in Rome).

Internet is much more reliable now.


I studied CS in Rome at "La Sapienza" and I was merely 10 steps away from INFN (national institute for nuclear physics), where GARR physically resided.

I spent more time there than in my class.

That's how I got involved in this new "internet thing".

Bob Metcalfe repeatedly predicted a "gigalapse", which he defined as a loss of one billion hours of Internet connectivity. Bob would say: look at how fragile this all is. His point wasn't "the Internet will fail" (although at the time it was too often portrayed that way), because Bob isn't actually an idiot, but it still shows a substantial lack of foresight. Of course ~everybody will have the Network, the Internet is the Network, and so normal sleep is an order of magnitude worse than a "gigalapse", yet it happens every day and we don't even care.

The "gigalapse" seemed like a big deal because proportionally a billion user hours in 1995 is everything doesn't work for days, or most things don't work for several weeks... but today of course it could also be everybody in the world is inconvenienced for about long enough to make breakfast. Oh no.

If a Gigalapse is about 12 minutes these days (given 4.66 billion people on the Internet), a Teralapse would be about 9 days of no Internet for all of us.
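The arithmetic above checks out; a quick back-of-envelope, taking the comment's 4.66 billion Internet users as given:

```python
# A "lapse" unit is lost user-hours of connectivity, so divide by the
# number of people affected to get the wall-clock duration if it hit
# everyone at once.
users = 4.66e9
gigalapse_min = 1e9 / users * 60    # ~13 minutes for everyone
teralapse_days = 1e12 / users / 24  # ~9 days for everyone
print(f"gigalapse ≈ {gigalapse_min:.0f} min, teralapse ≈ {teralapse_days:.1f} days")
```

Of course real outages are never uniform: a teralapse could equally be a total blackout for a tenth of the world lasting three months.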

What a fascinating experiment that would be. I'd imagine the majority of people would have no idea what to do without the internet for days. Does my remote job even exist? It would be surreal.

Disabled? Throttled down to half speed? Yes. But to disconnect whole regions you have to commit conscious sabotage, and even Mubarak did not manage to switch off Egypt when he wanted to and gave the orders to.

If by “remove a few hubs” you mean level a few major colo’s then ok, but as far as I can tell there is not any single strand of glass or single switch or single server that can take everything down with it.

You don't need to level the colos, you just need someone to make a typo in a router config that gets deployed live so the whole colo is unreachable. How many times have we seen an AWS/CloudFlare/otherLargeProvider have this happen to them?

The DNS root is pretty well distributed. There are 13 different IPs run by several different organizations, and AFAIK they're all running anycast these days. And anyway, the root zone is tiny, changes infrequently, and could really be AXFRed a couple times a month and you wouldn't miss much.

The larger tlds aren't quite as diversely hosted and certainly aren't amenable to long term caching, but it should take a major f up to break those too.

Some of the minor tlds, even the more popular ones do screw up from time to time though.

> As of 11/26/2021 10:53 p.m., the root server system consists of 1477 instances operated by the 12 independent root server operators.

* https://root-servers.org

* https://en.wikipedia.org/wiki/Root_name_server

And a hub here could refer to an entity like Level3... One day, no more money. Just turn off the power on everything and start liquidating... Or someone isn't there anymore to fix stuff when it breaks or changes are needed...

What original topology are you talking about? There was never a time when your end consumer device could be used as even a semi-reliable web server.

> There was never a time when your end consumer device could be used as even a semi-reliable web server.

I dunno man, my used T420 laptop is serving several sites over a residential symmetrical fiber connection just fine.

My first self-hosted web server was running on a 486 and ran for a few years hosting four or so sites, including a php-based forum!

Built-in UPS and KVM, too. Keep it cool and stock up on PSUs and that'll last you a long time.

> There was never a time when your end consumer device could be used as even a semi-reliable web server.

Since about 1999 I have never NOT been running a server of some kind off my home connection. I wouldn't run a business off of it that way, but it's reliable enough to count on, which has to meet any sane definition of semi-reliable. The two biggest problems I've had have been essentially unrelated to the internet: times I was violating the ISP's TOS, and places where the power was unreliable.


From yesterday. Note the bit about being on a Pi. Stood up quite admirably to an HN hug.

From 1998 to 2004 I served a fairly large amount of traffic out of my homebuilt pentium in a trailer out in the woods with an ISDN line. We stood up to several media mentions OK.

> Stood up quite admirably to an HN hug.

It was down when I tried it yesterday.

If you have a cable connection today, you can serve reliably just fine. Throughput isn’t the best and there are other minor issues, but it’s reliable enough for most people’s purposes.

Coaxial cable connections in the US have such meager upload bandwidth that cable ISPs do not even bother advertising or specifying a minimum upload bandwidth.

Yeah, but I’ve been able to stream music from my desktop to my phone while driving and run a web server with reasonable performance on one.

I consider multiple HD video streams to be reasonable performance. I have a family of 4 to 6 which at any point in time may be FaceTiming, video calling for work, video gaming, backing up to iCloud, streaming HD video from home NAS, and 5+ security cameras uploading.

It's more than adequate for personal use. I've also been serving myself and a few friends for years.

Download being 300Mbps but upload 20Mbps IS kind of irritating though.
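To put that asymmetry in perspective, a rough back-of-envelope, assuming a hypothetical 50 GB backup and the 300/20 Mbps split above:

```python
def transfer_hours(size_gb, mbps):
    """Hours to move size_gb gigabytes (decimal GB) at mbps megabits/sec."""
    return size_gb * 8e9 / (mbps * 1e6) / 3600

backup_gb = 50  # hypothetical photo/NAS backup
print(f"download at 300 Mbps: {transfer_hours(backup_gb, 300):.2f} h")  # ~0.37 h
print(f"upload   at  20 Mbps: {transfer_hours(backup_gb, 20):.1f} h")   # ~5.6 h
```

And that ignores protocol overhead and the fact that the upload link is shared with every video call and camera stream in the house.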

When did 100Mbps become popular for home LANs even?

Here in Canada all the modem/router combo units from large providers are gigabit for LAN. I'm pretty sure that it's been that way for at least 5 years.

I'm pretty sure new PCs and laptops have had gigabit standard for probably about 10 years.

Enthusiast and prosumer motherboards are now coming with 2.5 gigabit networking.

Covid lockdowns helped with video calls, but ever since FaceTime came out, lack of upload became way more noticeable

Also, smartphone proliferation made streaming from home NAS very convenient, as well as home security cameras and smart home features.

I disagree that current upload capacities are adequate. With 1Gbps+ upload connections standard, we might actually see privacy-forward solutions that do not require us to depend on cloud services.

I worked at a small ISP/web host in the late 90s. A very common web server at the time was a cheap white-box PC running Windows NT and IIS. We had many of these systems on this interesting "rack" that was more like a multi-level table or 3D chess board.

These systems could handle a ton of traffic very reliably. Consider that there was very little dynamic content, and what there was barely taxed the CPU at all (e.g. a Perl CGI script to query and display some tiny amount of data from (LOL!) an Access database).

If it works well and works reliably, even when it needs to be changed quickly - it's not a bad solution, unless it costs too much.

More P2P between research centers in the time of ARPANET; though it wasn't widely used or available back then.

This has big "can you call it broken if it never actually worked in the first place?" energy.

EDIT: I have a soft spot for the person who is so over it that they sit around pontificating on the meaning of the word broken instead of actually getting up and helping in the midst of an emergency. Because half the time that's me. When prod is on fire every day, eventually that becomes the norm.

It's much worse than that: it becomes philosophy.

Beyond some level of complexity a system isn't one, anymore, it is an emergent phenomenon that has many of the aspects we ascribe to "life." Our inputs affect the systems, surely, but not in a predictable or even repeatable way.

A cow is an "inefficient means" of turning grass into tasty beef, by some accounting. But the cow is, as far as it is concerned, fully efficient at being alive. Our notion of "what is the internet" fell out of sync with the realities sometime around '94, I think.

I would call the Internet resilient, in that lower level failures are often reliably re-routed, healed, etc., but at higher levels the same kinds of failures happen over and over.

Anti-Fragile would imply that each doh-inducing failure resulted in internet wide improvements. Something that famously is not happening.

The Internet: "It's Anti-Fragile!" (looking up)

The Internet: "It's Duct Tape all the Way Down" (seen from above)

It's almost alive; it'll wake up one day and start asking questions :D

Hah, that's a good way of seeing it.


O'Brien: I'm afraid to touch anything. It's all cross-circuited and patched together - I can't make head nor tails of it. Bashir: Sounds like one of your repair jobs.

Seriously though, it seems like every form of infrastructure we rely on is held together in such a fragile manner. I hate to think of the chaos should there be a major Internet and physical infra failure in close proximity, time-wise.

Unless you exist in a state of wild over abundance and are very conscientious, most working infrastructure is always at a state of almost-disrepair. If I operate a machine shop and I have 4 partially-stripped screwdrivers, I might get a new one, but inevitably I will use it until it is as stripped as all the rest, and I don’t dare throw away a tool that is currently even in partially working condition since I might need it later if another driver breaks. This is true of every single thing in my shop, and the end result is a system that is robust to absolute failure but prone to constantly needing to be patched up.

As far as I can tell this is a truism across all fields: farming, skilled trades, build systems, transportation infrastructure, housing…

A corollary to this might be "a new part on the shelf is not truly 'better' than a 90% worn out one still in service."

Having ready spares is great but I'm sure the HN crowd knows how much crib-death there is on new, replacement bits of all kinds.

Until it's actually been run-in for a while how do you really know it's going to work when you unbox it to replace a truly dead piece?

The upshot of this is that 'RAID-6'-type designs are the most reliable in the real world, since when one part fails you are at least leaning on other parts that have been run-in and are past the leading edge of the bathtub curve.

> Having ready spares is great but I'm sure the HN crowd knows how much crib-death there is on new, replacement bits of all kinds.

> Until it's actually been run-in for a while how do you really know it's going to work when you unbox it to replace a truly dead piece?

We had the unfortunate experience of installing 4 new 16-bay chassis with brand new drives (10+ years ago now). We designated one of the 16 as hot spare for each chassis, plus had 2 cold spares for each as well. 72 brand new drives in total, all from the same batch from the manufacturer. Set them all up on a Friday and configured all for RAID5 (pre-RAID6 availability). The plan was to let them build and have some burn-in time over the weekend for possible Monday availability. Monday provided us with multiple drive failures in each chassis. The drive manufacturer confirmed a bad batch from whichever plant, replaced all, and delivered larger sizes for the replacements. Luckily, they failed during burn-in rather than 1 week after deploying.

Are your mills in a similar state? I’m no machinist, but I would think it hard to stay in business when the means of production are constantly down for unplanned service.

Screwdrivers are more of a consumable than a capital good.

They said “If I operate a machine shop”.

So presumably just another programmer making up an analogy with insufficient background knowledge so that we can nitpick the details of it.

Presumably. Those statements would be insulting to most professional machinists, carpenters and others who take good care of the tools/infrastructure that provide their job and keep their digits in place. Not to speak of an actual industrial facility producing high-quality or high-volume items.

In fact, so much care is put into infrastructure that most people/shops have lots of tools they have designed and built themselves at great expense to streamline their operations.

I don't know why you'd make arguments by analogy, which are flawed by default, on a topic you are not versed in.

machine shops are full of old mills. It's a capital item, you use it as long as you can to extract value, and they last a looong time.

Imagine you use that last screwdriver, and a job needs to get done, and your car won't start so you can't get a new one at a hardware store outside of walking or cycling distance. And now you're unable to repair a vital system.

As I was writing that, I was just thinking of the recent Surfside collapse, and what would have happened if the region's data networks had gone down simultaneously (by chance). During a major event two decades ago, cellular networks were overwhelmed and calls could not be made. I dare say we're more reliant on those networks today, as well as the Internet.

Not that I would expect it to happen, but it was just a thought.

if you have a file and a piece of round stock you have a screwdriver in whatever shape you need it to be.

You could also break something with your dull tools.

This is why you can't just drive a car until it breaks, you get it checked out.

Then again, cars are a private good. When it's your property vs our property you have more of an incentive to take care of it.

Machine shops are also private goods and their owners have a lot of incentive to keep them working, so I don't think this example is very accurate. In any case the chance of breaking anything with a partially stripped screwdriver is pretty minimal.

If you don't know what you're doing you can strip a screw.

Getting it out won't be fun.

if you're at home, with no other tools, it could be painful.

but in a shop, unless you can't take it out using other means (pliers, for example, or a nut splitter), you can simply weld a bolt onto the screw's head and use a wrench to unscrew it.

or drill a hole through it after removing the head and use another screw to push the stub out from the other side.

Bad screws are more common than bad screwdrivers and even a brand new screwdriver could lead to the same result.

as the original post said "If it breaks all the time, everybody is highly experienced at patching together new workarounds"

Let's just keep going with this hypothetical: it would still be a better idea to have working screwdrivers.

All the steps you outlined risk damaging the part you're working on, or wasting time. Say you value your time as a mechanic at only $30 an hour: if getting that screw out takes you another 20 or 30 minutes, you would have been better off buying a $5 screw bit.

not hypothetically speaking, it takes a couple of minutes

but what the original post said is not that you shouldn't buy a new screwdriver, but that if you know something will break, you learn how to fix it.

if your brand new screwdriver ends up stripping that screw anyway and you don't know what to do next, because you thought it would never happen with a working screwdriver, well, you're screwed :)

sometimes there are no good solutions, only bad solutions

that's what keeps things working.

if everyone believed that spit and baling wire were a no-go because they are bad solutions and there's a better one - there always is - (i.e. fixing the root cause, after having identified it), there would be no internet as we know it today.

Software too, unless every small piece of it has a dedicated full-time maintainer. If there's an infinite stream of tasks incoming, why go improve something that fits the current use well?

We have a pretty good track record keeping it running though; The Internet has never gone down!

I'd be far more concerned with agricultural logistics, though we've never starved to extinction, either.

Perhaps we've reached a point of positive no return: We can no longer cease to exist!

In this context it's a very important point that there isn't one Internet and there isn't one "we." Unless you strictly mean the human race has yet to go extinct due to starvation.

A lot of starvation events have of course occurred across many different nations, peoples, civilizations. Europe, as one example, was numerous times ravaged by extreme starvation events that collectively killed millions of people across the 19th and 20th centuries. I think your agriculture concerns are well placed.

The various Internets have gone down routinely for all sorts of reasons.

There’s a strong probabilistic argument that says otherwise: https://en.wikipedia.org/wiki/Doomsday_argument

Note that the argument is not dependent on any sort of cause, such as climate change or whatever. It is entirely probabilistic.

> “LEVEL 3 is the last IRR operator which allows the use of this method, although they have discouraged its use since at least 2012,” Korab told KrebsOnSecurity. “Other IRR operators have fully deprecated MAIL-FROM.”

I'd prefer if we kept deprecated and removed as two different terms. It sounds like level3 deprecated it, and everyone else removed it. To me (and most definitions I can find) deprecated basically means "don't start using it, if you are using it stop using it, we will remove it soon but have not done so yet for compatibility reasons"
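The distinction matters in code, too. A sketch of the convention (the function name here is hypothetical, loosely echoing the MAIL-FROM method from the article): a deprecated path still works but warns callers to migrate, whereas a removed path simply no longer exists:

```python
import warnings

def mail_from_check(sender):
    """Deprecated: still functional, but warns callers to migrate."""
    warnings.warn("MAIL-FROM validation is deprecated; migrate to a stronger "
                  "auth method", DeprecationWarning, stacklevel=2)
    return "@" in sender

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = mail_from_check("noc@example.net")

# Deprecated code still runs and still returns a result...
print(result, [w.category.__name__ for w in caught])
# ...whereas calling a *removed* function would raise NameError immediately.
```

By that reading, Level3 is at the "warn but still accept" stage while the other IRR operators have actually deleted the code path.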

Has no one here seen:


There are 1000s of these pictures.

The entire world's IT infrastructure has been held together with spaghetti-noodle-cabling and bodged patches for over three decades now. I'm even guilty of it. Most IT guys are guilty of it.

Don't even get me started talking about how bodged together corporate codebases are. Banks are perhaps the worst offenders. Old hardware running with bandaids and bubblegum. Software that more than 5 different teams have had their fingers in, mucking about, and some codebases have orders of magnitude more than that.

Anyone that didn't know this doesn't pay attention, or just started internetting.

If you haven't read it https://www.stilldrinking.org/programming-sucks beautifully describes this and has both made me want to quit and start a farm and want to stay and improve things depending on the day.

If you (the collective you, everyone reading this) haven't, read it.

I had a friend who worked in IT in a major Canadian bank some years back. He said they’d lost the source code (Cobol, I would imagine) for certain jobs and had taken to editing the binaries directly with a hex editor.

Have literally seen major things taken out because somebody tripped over the $5 power strip to the sole authoritative nameserver and nobody realized/cared for years that the authoritative ns2/ns3 weren't really receiving zone transfers.

> Don't even get me started talking about how bodged together corporate codebases are. Banks are perhaps the worst offenders. Old hardware running with bandaids and bubblegum. Software that more than 5 different teams have had their fingers in, mucking about, and some codebases have orders of magnitude more than that.

And yet, perhaps counterintuitively, the main downside of such systems is that they are slow and expensive to change, not that they are unreliable.

> And yet, perhaps counterintuitively, the main downside of such systems is that they are slow and expensive to change, not that they are unreliable.

They are slow and expensive to change because they are maintained because they have been, at great expense and cost in both failures and remediation efforts, made tolerably reliable (but still extremely fragile) so long as things are exactly within certain expectations (which often have been narrowed from the intended design based on observed bugs that have been deemed too expensive to fix), and it is inordinately difficult to modify them without causing them to revert to a state of intolerable unreliability.

They are systems that generations have been spent reshaping business operations around their bugs to make them “reliable”.

How "pretty" your cable routing is doesn't effect performance / reliability.

Pretty in this context mostly means predictable, which does have an effect. If you are called to fix a broken connection and you get a rats nest it will take a lot longer and might impact other users.

So is society. When you’re a kid, it seems like everything is incredibly well organized and orchestrated.

As an adult, you realize the world is a patchwork of semi-functioning systems and it’s a miracle the whole thing works as well as it does.

I've been thinking along these lines a lot recently. When young I thought everything was all carefully architected, thought out and planned. I've come to realize it's all chaos, interactions between the disparate desires of various individuals and groups randomly acting constructively and destructively with eachother creating random results.

Krebs is not wrong but he also doesn't get into how many things are "verified" through an even older mechanism, the LOA (letter of authorization/letter of authority), which is literally just an actual letter on one corporate entity's letterhead, specifying some CIDR-notation prefixes/ranges that are permitted to be announced by that entity's upstreams, signed by a corporate officer. Often in the form of a raster scanned PDF.

BGP4 in general dates back to a time when everybody trusted each other and the global routing table/network engineering community was a much smaller thing. We've been trying to glue better things on top of it for 25+ years.
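The main "better thing" being glued on these days is RPKI route-origin validation: a signed ROA binds a prefix (plus a maximum length) to an origin AS, and routers classify each announcement as valid, invalid, or not-found. A simplified sketch of the RFC 6811 decision logic, with hypothetical ROA data and documentation prefixes:

```python
import ipaddress

def validate_origin(prefix, origin_asn, roas):
    """Classify a BGP announcement against (prefix, maxLength, ASN) ROAs."""
    pfx = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, asn in roas:
        if pfx.subnet_of(ipaddress.ip_network(roa_prefix)):
            covered = True  # some ROA covers this address space
            if asn == origin_asn and pfx.prefixlen <= max_len:
                return "valid"
    # Covered but no matching ROA: likely a hijack or a misconfiguration.
    return "invalid" if covered else "not-found"

roas = [("203.0.113.0/24", 24, 64500)]
print(validate_origin("203.0.113.0/24", 64500, roas))   # valid
print(validate_origin("203.0.113.0/24", 64666, roas))   # invalid: wrong origin
print(validate_origin("198.51.100.0/24", 64500, roas))  # not-found: no ROA
```

Note that "invalid" also catches more-specific announcements beyond the ROA's maxLength, which is exactly the sub-prefix hijack case that an emailed LOA can't protect against.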

I have kinda wondered how robust the Internet would truly be if the proverbial stuff really hit the fan. I mean proper nuclear wars, or economies collapsing, with large ISPs going down and admins dying or being fired left and right. Especially with all the talk of global crypto and so on...

As I understand it, it is a continuously, almost hand-tuned machine, and with enough misconfiguration or entities simply ceasing to exist, much of the transit capacity would be gone. It works well enough when it is not touched and there is someone on call to fix it, but what if the latter were gone? How robust and self-correcting is it really? My guess is probably not at all...

It wouldn’t work. Just look at what happens during bad storms.

I've never known the internet to go down in a storm. Does that happen? I mean, I've had the wifi router stop if there's a power cut and had to use cell data, but that's about as bad as it's gotten for me.

I've been leaning on my app's users to do what they need to do on their end to implement offline/local-first use of the app, and they just do not get it. For them, the only issues they've had were connection issues with their service providers, so they don't feel this is an issue of concern.

But I read stuff like this, and in this case it's Krebs, so I have to expect these kinds of issues will pop up. The article mentions the FB outage and most everyone on my FB feed was freaking out over not being able to access it, and for the most part it's not a critical service. And when they came back online some of the conspiracies they were sharing about what/why it happened were way over the top.

From my perspective it feels like everything on the internet is just one missed tap on a keyboard from breaking.

I am not a security expert, but this article seems to just be saying that the Internet is held together by trust, convention, and an ever-evolving set of technologies. In the case of Level3 (now Lumen), it seems they did not deprecate an insecure method that others already deprecated. And it seems that better technologies are on the horizon (RPKI) but not yet fully in use. To me this doesn't feel as bad as "spit and baling wire". We could be more secure by holding everyone to a stricter standard on adopting newer, more secure technologies. But is it really as broken as the title suggests? I don't think so.
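For reference, the RPKI origin validation mentioned above reduces to checking each BGP announcement (a prefix plus its origin AS) against cryptographically signed ROAs. A simplified, hypothetical Python sketch of just the validity logic that real validators implement (the crypto, fetching, and caching are omitted; the ROA data is invented):

```python
import ipaddress

# Hypothetical ROAs: (covering prefix, max allowed prefix length, authorized origin ASN).
ROAS = [
    (ipaddress.ip_network("198.51.100.0/22"), 24, 64500),
]

def validate(prefix: str, origin_asn: int) -> str:
    """Return the RPKI validity state of an announcement: valid/invalid/unknown."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_len, asn in ROAS:
        if net.subnet_of(roa_net):
            covered = True  # some ROA covers this prefix
            if net.prefixlen <= max_len and asn == origin_asn:
                return "valid"
    # Covered by a ROA but matching none -> invalid (likely hijack or leak);
    # covered by no ROA at all -> unknown (RPKI has nothing to say).
    return "invalid" if covered else "unknown"
```

An announcement from the wrong origin AS, or one that is more specific than the ROA allows, comes back "invalid" — which is exactly the class of accidental and malicious announcements the IRR/LOA machinery struggles to catch.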

You know what? I get the feeling that _everything_ is held together with only spit and baling wire.

I get the same feeling too. And maybe that's fine. We are led to believe that we are over-reliant on a particular things like the internet or the shipping supply chain but we probably are not.

Most modern, massive systems seem to be held together with only spit and baling wire. This might not be true for infrastructure like roads, bridges, and dams. And there are things, buildings, and structures that are tens or hundreds of years old that still work. Power plants and cars, for example.

Held together with spit and baling wire as it is, the fact that it mostly works proves that the overall architecture is robust.

> proves that the overall architecture is robust.

Not really. What it shows is the stark difference between two ideologies. The first camp contains people who believe in Postel's Law, "be conservative in what you do, be liberal in what you accept from others". The second camp has people who recognize that the current world is not a cooperative network of researchers: "all input is untrusted".

Krebs is absolutely in the second camp.

But the two philosophies aren't really in contention. Proper adherence to Postel's law also includes accepting malicious traffic (and then doing something reasonable with it, like black-holing it).

The "liberal in what you accept" part is mostly honest acceptance of the reality of the network: you cannot control the information sent to your service, only how you respond to it.

These are not ideologies, but compatible principles from separate concerns. One relates to protocol errors, the other to information validation & verification.

If Alan Kay is to be believed (which I hope he is!) then the internet was originally inspired by multicellular lifeforms.

I'd say the internet has some sort of a "biological" architecture. Robust in the sense that organisms are robust; extremely messy, sensical from a high-level view, chaotic from a low-level view.

That's an interesting point about how the Internet has a kind of "biological architecture", whose chaotic and self-organizing nature makes it robust and resilient.

It's the same quality that's claimed by proponents of "decentralized autonomous organizations" (DAO), though for that I'm not yet convinced of its practicality.

Someone on this thread mentioned the book, Antifragile: Things That Gain from Disorder.

> Just as human bones get stronger when subjected to stress and tension, and rumors or riots intensify when someone tries to repress them, many things in life benefit from stress, disorder, volatility, and turmoil. What Taleb has identified and calls “antifragile” is that category of things that not only gain from chaos but need it in order to survive and flourish.

This reminds me of the concept of "emergence" in living systems.

> the arising of novel and coherent structures, patterns and properties during the process of self-organization in complex systems


Every complex system - including the human body, the global economy, the ecosystem, and large software systems - grew one functional hack at a time

You’d be surprised what goes on in Level 0… who works there, how they work, the hour-by-hour. Even at a big company that handles most of the world’s credit card transactions at their biggest data centre, for example. It would absolutely blow your mind. It’s amazing the internet works as reliably as it does!

Is there a shortage of spit and/or baling wire?

What are some other/better proposals for how to organize a world-wide network? This is one area where I have not seen many articles. But admittedly, I'm probably not looking in the right places. Any suggestions?

You could ask Cloudflare or Google and they would probably say that they should run it all.

Which funny enough, are the only parts that really go down all at once.

When the internet has troubles, it's usually a mistake by a giant centralized service.

Most of the time only that service is affected by their own mistakes, but sometimes that service hosts a lot of others, or is so massive that they cause DDoS attacks like when Facebook went down and their clients spammed DNS servers.
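The retry-storm dynamic described above — every client hammering DNS when the service vanished — is conventionally mitigated with exponential backoff plus jitter, so failed clients spread their retries out instead of synchronizing. A minimal sketch of the "full jitter" variant, not tied to any particular resolver library (function name is mine):

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 60.0) -> float:
    """'Full jitter' exponential backoff: pick a random delay in
    [0, min(cap, base * 2**attempt)] before retry number `attempt`."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Each failed lookup waits a randomized, growing interval instead of
# immediately re-querying, which keeps a mass outage from turning into
# a self-inflicted DDoS on the resolvers.
for attempt in range(5):
    delay = backoff_delay(attempt)
    print(f"attempt {attempt}: wait {delay:.2f}s")
```

The Facebook outage showed what happens when huge client populations skip this: the recursive resolvers absorbed the storm instead.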

Seems the internet is fine. Centralized services not so much.

blockchains are one

Whenever people are predicting the end of the world because of some political or cultural upheaval, I think about the internet, or airport security. There are really simple ways that any idiot could totally fuck up either of them and cause catastrophic problems. But it doesn't happen. What that shows is that the potential for catastrophe has nothing to do with catastrophe actually happening. Even if the world could fall apart around you at any moment, it's probably not going to.

Valueless comment, but... I've always had a thought in the back of my head that it will only take some catastrophe for us to be back in the dark ages. I hope I'm wrong.

As a less all-or-nothing scenario, one relevant concept comes from Joseph Tainter as a theorist of collapse. He proposes that eventually complexity reaches a point where it is not sustainable (either in general or at the margin), and a society collapses back to a configuration of less complexity.

> However, the additional integrity RPKI brings also comes with a fair amount of added complexity and cost, the researchers found.

I wish our government spent billions on funding open source software to reduce that complexity and cost, instead of on introducing/hoarding vulnerabilities.

This is an appropriate task for government funding. (even better yet, multi-lateral/international funding). Nobody else is going to provide the resources to get it done. Clearly.

reminds me of the old email interface run by network solutions when they were the sole registrar for everything that wasn't government.

Some of this has to do with IPv4 and massive carrier NAT schemes. If technical people could get an ASN easier and route with each other on IPv6 it theoretically could provide backup paths for neighborhood level traffic. I guess the transit/ISP would have to enforce something like BGPsec with these little peers to prevent mistakes or malicious ASNs.

Come on, who in 2021 actively drops routes because of an entry in the Level3 IRR? I mean, IMHO, it's not close to impacting the whole Internet.

Could the MAIL-FROM requests somehow be tunneled through a VPN, so that they are only accepted through that VPN and not the wider internet?

Yes, but why did he only realise this on November 26th, 2021 AD?

I'm a bit shocked that MAIL-FROM auth was ever accepted, let alone until 2012. Even the other auth methods via email seem somewhat dangerous, though I sincerely hope these registries follow extremely strict policies for key management.
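To illustrate why MAIL-FROM auth was so weak: the registry's robot accepted a change if the message's sender address matched the maintainer's registered contact, but that address is an arbitrary, unauthenticated string chosen by whoever sends the mail. A hypothetical forged update sketched in Python (all addresses and object fields here are invented):

```python
from email.message import EmailMessage

# Hypothetical forged IRR update. Nothing in the message cryptographically
# ties the From address to the party who actually sent it.
msg = EmailMessage()
msg["From"] = "noc@victim-isp.example"   # attacker simply claims this address
msg["To"] = "auto-dbm@irr.example"       # hypothetical registry robot address
msg["Subject"] = "route object update"
msg.set_content(
    "route: 198.51.100.0/24\n"
    "origin: AS64500\n"
    "mnt-by: MAINT-VICTIM\n"
)

# Under MAIL-FROM auth, matching the registered address above was the
# entire "authentication" step.
```

Stronger IRR auth methods (PGP-signed mail, CRYPT-PW over authenticated sessions) exist precisely because the envelope and header sender prove nothing.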

That's how validation of TLS certificates and domain registration still works.

Of course FidoNet is much better.

BGP is how the most critical infrastructure on the Net communicates and BGP4, the current version, is 15 years old.

Age, alone, doesn't make something bad or unreliable.

...and routers

And Google requiring (costly and/or time consuming) SSL certs to be applied on all sites to "ensure security" was also a big industry money making nightmare for many independent (non-income-driven) sites that is still playing out badly, and not providing much more security.

Two factor authentication and account verification is really an elaborate corporate sham to get people's phone numbers and PII for free. It doesn't do anything new or good for consumers in terms of security over time. There, I said it.

I prefer the old Internet. All these newfangled "fixes" are only making it worse, more expensive, and overly complicated. :/

Deprecating unencrypted HTTP is a big systemic improvement even though some individual sites may not benefit much. It's a network effect. (What's the money grab given free let's encrypt certs?)

Let's Encrypt, from what I understand, requires time-consuming updates every few months. My host provider also does not allow me to install them manually, further complicating the process, and conveniently they sell certs for $125 a year... per site. It's been a thorn in my side because we're too big to easily move now.

You are absolutely not meant to do the updates manually.

On one ISP that I host sites on, they restrict cert installs and don't allow SSH access. It's done in order to sell their cert services. I have too many sites on there to move easily... It's complicated. Eventually I'll bite the bullet and move to a new host. :/

Sorry about your service provider failing at their job and squeezing you for $$ cert services! But I'm not nearly convinced this is big enough to stop encrypting the web.

Setting up, monitoring and maintaining LE isn't free

It all happens automatically after a setup process that takes less than a minute.

I’ve spent hours fixing, debugging and upgrading LE clients this year. “I’m doing it wrong” I’m sure

Whether or not that's true, it's not a money grab.

But monitoring and maintenance are things someone needs to do if they operate a site, period.

But if you're independently running, paying for, and managing multiple sites, it's a HUGE burden. It also kills innovation for independent devs and startups, and dramatically raises the cost/investment threshold for this kind of innovation.

Pricing on cert services is also far too high when everyone's concern and agreement should be security as a basis for operations. It's not something that should be an upcharge or income opportunity.

You buy a door lock for your home once, and it works as long as you don't compromise the key. If you buy a house, door locks are expected to come with the house in most circumstances.

Having just replaced my door lock, I can assure you that they too wear out and need replacing. (one of the springs inside broke)

The pricing on Let's Encrypt is literally zero, and they provide (also free of charge) the `certbot` utility which you can run as a cronjob and which will automatically renew your certificates for you. The whole thing comes extremely well documented and with install scripts that take less than a minute to download, verify and run. If you think even that is too much of a burden I don't think any topic in programming is simple enough.
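To make concrete how little logic the "burden" amounts to: Let's Encrypt issues 90-day certificates, and certbot's default behavior is to renew once fewer than 30 days of validity remain. The decision itself is trivial, which is why it is meant to run unattended. A sketch of that check (the helper name is mine, not certbot's API):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RENEW_WINDOW = timedelta(days=30)  # certbot's default renewal threshold

def should_renew(not_after: datetime, now: Optional[datetime] = None) -> bool:
    """Renew when the certificate has fewer than 30 days of validity left."""
    now = now or datetime.now(timezone.utc)
    return not_after - now < RENEW_WINDOW

# A freshly issued 90-day cert is left alone; at day 61+ it gets renewed.
issued = datetime(2021, 11, 1, tzinfo=timezone.utc)
print(should_renew(issued + timedelta(days=90), now=issued))                      # False
print(should_renew(issued + timedelta(days=90), now=issued + timedelta(days=65))) # True
```

Run a check like this from cron (or just `certbot renew`, which wraps the same idea) and the "every few months" maintenance disappears.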

And, indeed, if you build your site via a service provider or platform, an SSL solution is usually provided.

Building a site from scratch in this day and age is a lot more analogous to building your house from scratch. Nobody to blame but yourself if you buy substandard locks and thieves get in. Only here the metaphor breaks down, because if you aren't encrypting your HTTP traffic and it is intercepted, it's your users who suffer, not the site owner.

I, too, pine for the days of simpler internet. But that was a function of the user base, not the technology. It was always insecure... it simply hadn't been exploited yet. Now that it has, and is, site administrators owe it to users to secure their connections.

Traefik takes care of all of this with about 5 lines of setup. It's so trivial I add it to every experimental nonsense service I set up, because it's one line of Nix config. I really don't understand the complaint.

Setting up certbot is easy, not a big burden for indie devs. Or if you want to know nothing about tls & certs, just get hosting that comes with tls.

I wasn't referring to the update process so much as the original installation of a cert.

On a house you own, you can change locks and keys any time you want to keep security up to date (for example).

No house in "move-in ready" condition comes without sufficiently keyed door locks of some kind (on day 1).

I’ll reply to this and some of your other comments in this reply.

In a lot of cases, SSL is not expensive or time consuming. It is a single line in cron. I appreciate that this is not the case for your hosting, but economic pressure is one of the main ways SSL can be more utilised. The fact that you’re considering moving away from them, suggests that their business will suffer in the long term, if they don’t make integrating SSL easier/less expensive. This is good economic pressure, and its likely the best pressure that can be applied right now, considering the glacial pace of technology laws in almost all countries. You seem to be generalising your situation and applying the blanket “it’s too expensive” argument to everyone, even though it’s mostly a non-issue for people who have better hosting providers or not as much legacy.

Arguably, building a website with a login is a LOT easier and cheaper now than it was 10 years ago, because Let’s Encrypt is such a well known option. If they wanted to do so 10 years ago, they would have most likely had to pay through the nose for an expensive certificate. You seem to also have forgotten about these people with your blanket statement about hosting websites being more expensive for everyone.

Is the security provided significant in simple sites? Probably not. However, having SSL be a default is good overall. It gives less chances for operators to screw up because non-HTTPS raises very user-visible alarm bells. If your site is small and non-revenue generating, then why does the security alert even matter? It doesn’t prevent anyone from accessing the website.

Your 2FA argument is wrong. Sure, there may be multiple reasons for mandating it, but for regular users, 2FA is good defense in depth, that offers protection against password compromise. Again, the average consumer doesn’t necessarily have strong passwords or unique passwords across services. 2FA is good protection for them.

Also, if mining user data was the main reason for 2FA, big tech wouldn’t support hardware security keys for 2FA. Mobile 2FA is a usability compromise because it targets a lowest common denominator that (almost) everyone has.

Sure, certificates can be time consuming at the moment but that will only get easier. Just like hosting the underlying website.

The number of sites that should have had SSL but didn't was laughable and justification enough for browsers to require SSL.

I don't know if you're being deliberately alarmist, but 2FA is a huge peace of mind when done correctly with one time codes. Those don't require phone numbers and is the properly secure method.

Sure the old internet was a bit more fun and carefree, but it became far less fun when you had your online accounts compromises because of weak or non existent security.

Honest question (IT/security noob) -- why does it not provide that much more security? I like verifying that my traffic is going where I want.

Encryption being applied to every site as a requirement is relatively new, since Google made it a requirement in Chrome.

Previously it was only required for secured transactions like purchases and working on health care records etc... And very rightfully so.

Now Google Chrome flags even simple (informational) sites for not being encrypted, and (quite possibly) rightfully so because of the potential for tracking/abuse. But adding encryption to a site is costly for independent sites (not hosted on social media or corporate platforms like blogs, etc...).

You shouldn't be required to encrypt a baking recipe site if you don't want to... Ultimately laws should discourage data abuse, and/or encryption should be inherently provided for every site/app uniformly by all web host providers (natively and inherently, and at a far lower price than it is now, generally speaking).

Too many people are running widely varying encryption measures, and implementing security in too many different ways to ensure that it is stable across the Internet. Security is best when it is uniform, fortified by rules and regulations, and updated ritually.

> Security is best when it is uniform

So when web traffic is uniformly not encrypted that's more secure than if it is encrypted by varying degrees and implementations?

Tbh your complaint reads along the lines of "Perfect is the enemy of good enough". SSL may not be perfect, but it sure as hell is better than running pretty much the whole web in the clear.

Particularly as pretty much all of your complaints do not really have anything to do with SSL itself, but rather with how Chrome surfaces the lack of an SSL certificate and how your hoster handles installing certificates.

Both of which are things you can personally change something about.

>Both of which are things you can personally change something about.

No, because "how Chrome surfaces a lack of SSL" is about your users' Chrome, not your own instance of Chrome. You cannot personally change the Chrome that is run by your users.

'spit'...nice way to put it I guess, Krebs still living with his mom init?

The term "spit and baling wire" is a well known (US/AU rural at least) expression to mean something that is working well enough but is fragile.

I never knew! Apologies for the terrible joke.
