
To be clear, you blame CrowdStrike and Windows (??), but not the companies who picked this software, configured it, and wrote their own internal risk policies around a kernel-level piece of software?



Most of the blame here falls on CrowdStrike: both from a software standpoint, in that a bad file can cause a BSOD so easily and the software cannot handle something like this happening, and for whatever process failure let that file get out in the first place.

Some minor blame falls on Windows for being able to BSOD as easily as it does.

As far as the companies go, it is a tricky situation. Many of them have CrowdStrike installed with automatic updates turned on to check some audit box, and they have to keep those updates going out regularly.

We are well past the point in tech where a company is solely responsible for its systems; external dependencies are the norm, whether through the shared security model of cloud services like AWS or a reliance on external APIs and servers. You have to trust that the vendor behind whatever critically important system you depend on is going to do their job. You could look back and say that maybe you chose the wrong vendor for a specific piece of software, but this could have happened with other vendors.

Something I am not entirely sure of is whether, for those audit and compliance requirements, companies can use an alternative update method. This would differ by framework, but to the best of my knowledge most compliance regimes for security software want you to have automatic updates.

If this had been a case of all of these servers going down because of a major AWS outage, would you really be saying the companies are to blame?


> Many of them have CrowdStrike installed with automatic updates turned on to check some audit box, and they have to keep those updates going out regularly.

While many companies probably do that, it's usually not required if you can argue for an alternative approach and how it fits your risk appetite better (e.g. progressive updates on a routine schedule).
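
To make that concrete, here is a minimal sketch in Python of a ring-based progressive rollout; the ring sizes, soak times, and function names are all illustrative assumptions, not any vendor's real API:

    import time

    # Hypothetical rollout rings: widen the blast radius only after each
    # ring soaks with no health regressions. Sizes and times are made up.
    RINGS = [
        {"name": "canary", "hosts": ["test-01"], "soak_secs": 4 * 3600},
        {"name": "early", "hosts": ["edge-01", "edge-02"], "soak_secs": 24 * 3600},
        {"name": "broad", "hosts": ["prod-01", "prod-02", "prod-03"], "soak_secs": 0},
    ]

    def apply_update(update_id, hosts):
        print(f"applying {update_id} to {hosts}")  # stand-in for the real deploy

    def hosts_healthy(hosts):
        # Stand-in for real telemetry: crash dumps, boot loops, missed heartbeats.
        return True

    def roll_out(update_id):
        for ring in RINGS:
            apply_update(update_id, ring["hosts"])
            time.sleep(ring["soak_secs"])  # soak before widening
            if not hosts_healthy(ring["hosts"]):
                raise RuntimeError("halting rollout at ring " + ring["name"])

    roll_out("definitions-2024-07-19")

A scheme like this still ships every update; it just trades a few hours of latency on the broad fleet for a bounded blast radius.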


> You have to trust that the vendor behind whatever critically important system you depend on is going to do their job

This is an absurd take, especially after an outage that took down 911 response centers and hospitals and has millions of passengers still stranded.

You trust no vendor and assume everything fails all the time.


At some point you have to. You will never control 100% of the system between your servers and whoever or whatever interacts with them, nor between your servers and whatever other services you have to work with.

There might be smaller parts of your system where you could say this, but only if your system is 100% air-gapped, all of the wiring and servers were put in place by you, and you are working on a LAN.

There are not many systems that fit that definition. As soon as you use the internet for communication, you are reliant on your ISP working. Maybe you can have a redundant connection, but then you have to assume both of those will do their job and that they don't share a dependency that could bring them both down.
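
To put rough numbers on that (all probabilities assumed, purely for illustration): redundancy only multiplies failure odds down when the two links fail independently, and a shared dependency dominates the math.

    # Toy arithmetic with assumed numbers: a shared dependency defeats redundancy.
    p_isp = 0.001      # assumed chance either ISP link is down at a given moment
    p_shared = 0.0005  # assumed chance a dependency common to both links fails

    p_both_independent = p_isp * p_isp                              # 1e-06
    p_both_with_shared = p_shared + (1 - p_shared) * p_isp * p_isp  # ~5.0e-04

    print(p_both_independent, p_both_with_shared)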

So no, it's not absurd unless your system never touches the internet. You have to make decisions about what your system relies on and what failures it can handle.

I fully understand what this brought down, but again there are plenty of other instances where you assume an outside company is going to do their job.

Looking back and saying, "well, maybe this was a bad idea because it's an external dependency" isn't helpful when we can point to any number of other external dependencies that may not have brought down as many systems but can just as easily bring down critical ones.


I still don't see your point. I am responsible for my systems, not other vendors.

- You need more than one ISP

- You need diverse Operating Systems and Databases

- You deploy in phases with canary releases (a sketch of the gate follows this list)

- You don't deploy on Fridays....
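
A minimal sketch of that canary gate, with the threshold and metric source as assumptions rather than any particular product's API:

    def canary_gate(canary_failures, canary_total, baseline_rate, max_ratio=2.0):
        # Promote the update only if the canary cohort's failure rate stays
        # within max_ratio times the stable fleet's baseline failure rate.
        canary_rate = canary_failures / max(canary_total, 1)
        return canary_rate <= baseline_rate * max_ratio

    # e.g. 3 BSODs across 50 canary hosts vs. a 0.1% baseline -> do not promote
    print(canary_gate(3, 50, baseline_rate=0.001))  # False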

How difficult can it be?


Let's be clear: this wasn't a new version of CrowdStrike. Admins can control version updates and have a policy of n-1. This was a channel update (similar to antivirus definitions), and AFAIK you cannot control channel updates.

This is entirely on CrowdStrike, or perhaps clown strike is more appropriate.


> - You need more than one ISP.

I addressed this in my previous response. It is still an external trust, even if you have redundancy.

> - You need diverse Operating Systems and Databases.

I have never once seen a company run the same server-side software deployed across multiple different operating systems.

> - You deploy in phases with canary releases.

As I mentioned in a previous post, there are going to be systems critical enough, and under serious enough threat of breach, that any wait is not worth the risk.

Also, as I have already mentioned, in many cases automatic updates are turned on for compliance reasons that may not allow what we think of as common sense for the vast majority of software.

> - You don't deploy on Fridays....

I agree, but to the best of my knowledge this was essentially a security definition update, not a code update. That is the kind of thing you push out as soon as you have it; otherwise your systems could be vulnerable over the weekend.


> there are going to be systems critical enough, and under serious enough threat of breach, that any wait is not worth the risk.

Disagree strongly. You are analyzing risk the wrong way. That is what I call "security by being on the latest patch."

Zero days occur every day, and many are ongoing right now. Your antivirus or OS vendor needs hours, days, or weeks to detect them, understand the attack, come up with a defense, test (hopefully...) the defensive patch, and deploy it in phases (hopefully). So you are always hours to days behind the latest threats before getting such protection.

The core idea here is "critical system."

If the system is critical, its security and robustness need to rely on its security architecture, not on "being on the latest patch." You will always be catching up to any threats.


How is "being on the latest patch" (security definitions), not part of the security architecture? No where am I implying that it is the only part of security.

Also, you are still ignoring that many of these companies may not have a choice, due to compliance requirements.

That being said: great, maybe we can avoid this particular issue. But maybe next time it will be, "Well, you run security software X, and when you were breached they had a protection out for this. Why were you not up to date?"

The fact remains that what happened yesterday was an extraordinary situation that I highly doubt anyone seriously considered a real risk, since most people would safely assume that a vendor pushing security updates would do basic testing.

Also, you are focusing on security when there are other dependencies that could bring down your system. That is my point here. We are focusing so much on how this one thing should have been done differently, and on the companies somehow being to blame, when this could have been any number of other things that would not have had as global an impact but could still bring down major systems.


You are completely ignoring the fact that some countries, some airlines, some 911 centers, and many hospitals were not taken down. The reason? The diversity and phased deployments I am arguing for.

> Also, you are still ignoring that many of these companies may not have a choice, due to compliance requirements.

They have a choice: they could run their systems properly. You are arguing from compliance... when this incident is a clear demonstration that being compliant has nothing to do with being secure and robust.


Welcome to the new generation of "cybersecurity" experts who just regurgitate buzzwords like "compliance" and "guardrails" while filling out risk-matrix spreadsheets.

It's all PaaS/SaaS now; old-school, properly engineered, isolated solutions require too much expensive staffing.

I'm waiting for a vendor like Zscaler to be hacked. What could go wrong with having thousands of companies do MITM SSL interception via a single vendor?

That's a nice juicy target for hackers if I ever saw one...



