SolarWinds: What Hit Us Could Hit Others (krebsonsecurity.com)
196 points by parsecs 4 days ago | 150 comments

Oh good. The attackers were only in their systems since 9/4/19 before being detected on 12/12/20, so only 15 months of infiltration into SolarWinds' systems before detection. At least the payload was only deployed 2/20/20, so their customers were only completely infiltrated without detection for 8 months. Assuming the attackers could only get a 10 MB/s channel total per target even though they probably infected thousands to tens of thousands of machines per target, at ~20 million seconds that would constitute ~200 TB exfiltrated per customer or ~19 years of 1080p video.
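For anyone who wants to check the back-of-the-envelope numbers above, here is a small sketch. The 10 MB/s channel and the ~20 million seconds are the figures from the comment; rather than assuming a 1080p bitrate, the code works out what bitrate the "~19 years of video" figure implies, which comes out plausible for compressed 1080p.

```python
# Figures taken from the comment above (decimal units: 1 MB = 10**6 bytes).
CHANNEL_BYTES_PER_SEC = 10 * 10**6   # assumed 10 MB/s exfiltration channel
DWELL_SECONDS = 20 * 10**6           # ~20 million seconds, as rounded in the comment

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

# Total data that fits through the channel over the dwell time.
exfiltrated_bytes = CHANNEL_BYTES_PER_SEC * DWELL_SECONDS
exfiltrated_tb = exfiltrated_bytes / 10**12

# Sanity check: how many days is ~20 million seconds?
dwell_days = DWELL_SECONDS / SECONDS_PER_DAY

# Video bitrate implied by calling that volume "~19 years of 1080p video".
VIDEO_YEARS = 19
implied_mbit_per_sec = exfiltrated_bytes * 8 / (VIDEO_YEARS * SECONDS_PER_YEAR) / 10**6

print(f"exfiltrated: {exfiltrated_tb:.0f} TB over {dwell_days:.0f} days")
print(f"implied video bitrate: {implied_mbit_per_sec:.2f} Mbit/s")
```

So 20 million seconds is about 231 days (roughly the 8 months quoted), and the video comparison holds if you assume a stream of a bit under 3 Mbit/s.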

If an attacker has just one day to root around and exfiltrate, they can easily get valuable information. If they are given 8 months, they have already had everything of value for months and are just waiting around for any new data to come in. Think how inadequate your systems must be to let an attacker sit around in them for 8 months; it is mind-boggling how unfit these systems are for the problems they face. And this is not just an indictment of SolarWinds. Just in case anybody forgets, it was the top-flight security company FireEye who discovered this breach after realizing they themselves were breached. A "best of the best" security company took 8 months to realize that they or any of their customers had been breached. This is what "top-class" security buys you.

The lesson isn’t that any particular victim sucks at security, it’s that well-resourced targeted attacks are generally unstoppable.

I think you can put it this way: modern baroque software stacks, with their effectively vast "attack surfaces", are not going to stand up to a well-financed, patient, skilled attacker.

I recall that many companies have switched from a perimeter model of defense where systems are secured from the outside to a "defense in depth" model where each system is secured on its own (plus the perimeter).

Perhaps folks should think about tightening the in-depth model and avoiding the consumer model of constant updates from a zillion providers. Or perhaps a single lab could verify updates of the zillion providers rather than leaving them on their own.

> I recall that many companies have switched from a perimeter model of defense where systems are secured from the outside to a "defense in depth" model where each system is secured on its own (plus the perimeter).

Yes, this is the current bleeding edge of cloud security, known as “zero trust.” Amongst other things, it usually involves provisioning mTLS identities in a secured manner to each service, with connections restricted to a whitelisted set of identities.

I found Evan Gilman and Doug Barth’s “Zero Trust Networks: Building Secure Systems in Untrusted Networks” [1] a pretty helpful read in understanding what modern/next-gen cloud security looks like.

Some modern implementations of varying depth and scale include SPIRE [2], Tailscale [3], and BeyondCorp [4].


[1]: https://www.amazon.com/Zero-Trust-Networks-Building-Untruste...

[2]: https://spiffe.io/

[3]: https://tailscale.com/

[4]: https://beyondcorp.com/

I really wish https://news.ycombinator.com/user?id=nickpsecurity was still around here to throw his weight in this thread. He'd often geek out in those kinds of discussions, talking about how in the '70s people were actually doing security in computing, vs. the current security theatre, and how many things are totally possible to do, but no big company wants to do them now (even though they loudly claim otherwise), because it is expensive, and it's more profitable to focus on profits than security. For some technical references from Nick, see e.g.: https://news.ycombinator.com/item?id=20946537

For example, quoting verbatim from the link above:

> [...] Modern OS's, routers, basic apps, etc aren't as secure as software designed in 1960's-1980's. People are defining secure as mitigates some specific things hackers are doing (they'll do something else) instead of properties the systems must maintain in all executions on all inputs. We have tools and development methods to do this but they're just not applied in general. Some still do, like INTEGRITY-178B and Muen Separation Kernel. [...]

This is looking at the past through rose-tinted glasses.

There have always been lots of techniques that people claimed as magic bullets for security. They were there in the 1960s-80s, and there still are.

And yet modern OSes, routers, and basic apps are much more secure than their equivalents then.

I mean, in the 1970s there was the Ken Thompson backdoor, and no one knew about it until he disclosed it in his ACM Turing Award lecture[1].

Nowadays there are provably secure kernels[2] in real-world deployments, and high-assurance kernels[3] in billions more.

[1] https://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html

[2] https://ts.data61.csiro.au/projects/seL4/

[3] https://en.wikipedia.org/wiki/L4_microkernel_family#Commerci...

I miss his comments as well.

Yeah, but if you're blindly installing a third party's binary blob, it's hard to call that "zero trust".

Edit: It seems like a serious extension of the zero trust concept would involve something like "only allow in source code into the system from sources we trust and then we compile it ourselves". Limit trust to trusted identities and don't allow binaries in any more than people.

> Yeah, but if you're blindly installing a third party's binary blob, it's hard to call that "zero trust".

No, "zero trust" means you install it (or anything) but assume that it is malicious, so you take proactive steps to limit its access within your network.

"Zero Trust" means assuming everything (including your compiler) might be compromised, so try to limit the damage any one incident can cause.

Of course, SDLC security and binary verification is also part of zero-trust systems.

But you have to keep in mind - the whole point of zero-trust systems is that even if a malicious third-party blob is present on some nodes, its blast radius is limited by the whitelist of clients that it can communicate with, which makes exfiltrating information particularly difficult unless you have additional exploits that let you move laterally.
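To make the blast-radius point concrete, here is a minimal sketch of an identity allowlist check. The service names and the `ALLOWED_PEERS` table are hypothetical, and the peer identity is assumed to have already been authenticated (for example from an mTLS client certificate). This is not how SPIRE or any particular product implements it, just the shape of the idea.

```python
# Hypothetical policy: for each destination service, the set of
# authenticated peer identities it accepts connections from.
ALLOWED_PEERS = {
    "billing-db": {"billing-api"},
    "billing-api": {"web-frontend"},
    "monitoring-agent": set(),  # e.g. a third-party blob: nothing may call into it
}


def is_allowed(source: str, destination: str) -> bool:
    """Return True only if `destination` explicitly allowlists `source`."""
    # Unknown destinations default to an empty allowlist (deny by default).
    return source in ALLOWED_PEERS.get(destination, set())


def reachable_from(compromised: str) -> set[str]:
    """Services a compromised identity can talk to directly (its blast radius)."""
    return {dst for dst in ALLOWED_PEERS if is_allowed(compromised, dst)}


print(reachable_from("monitoring-agent"))  # nothing allowlists it
print(reachable_from("web-frontend"))
```

The catch is that the allowlist is only as tight as the policy you write: a tool that legitimately needs to talk to everything gets a large blast radius by design.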

I am suggesting you could limit "binary verification" and just allow "sources you compile yourself", 'cause how much can a binary be verified? (Enough to know your blob was hacked at the source and not along the way, yeah, but as we see, that's not enough.) What you're describing just involves "we make sure our vendors are sort of trying to be secure" and that's about it, 'cause that's all you can do when someone hands you a binary.

And sure, there exist things like Linux drivers that are binary blobs in source code form, but at least this directs one's eyes to such things.

Edit: Also, even in these environments, some people and some software have to run the network itself, like a sys admin tool. Which makes sys admin tools something everyone should look at. And brings us to the present hack.

How much do you trust your compiler?

I've been asked to look into ZTN just today by a large government customer.

The problem is that network security is a tiny, tiny subset of security in general. You can throw SAML, mTLS, or IPsec at your connections all you like; it would have done exactly nothing to stop the type of attack that hit SolarWinds and their customers.

Let me reiterate this: ZTN would have achieved zero protection.

The problem in general is that there are conflicting interests between what the developers want to do to reduce their work effort and operational application security. Even large vendors are hopelessly bad at producing secure and securable software. The documented and recommended easy path is not the secure one. The products are incredibly hard to secure from the perspective of some poor ops person.

Some random examples:

- Most vendors document their firewall ports, which is great! But not their direction or what roles they apply to.

- Very few vendors provide machine-readable firewall definitions that can be imported into firewall systems automatically.

- Vendors like AWS and Azure that do publish JSON lists of their endpoint addresses and ports use unique, non-standard formats, so very little effort is saved for the end-users.

- Public cloud vendors especially end up "blending" multiple customers into pools of shared IP addresses for all of their PaaS or SaaS services. This is a disaster for security, because it makes it virtually impossible to put a WAF or a WAP in front of certain services.

- mTLS support is a shitshow. Manually uploaded certificates that expire and break everything annually are the norm. CRLs and OCSP are almost never checked. Many products support only a single certificate, and will even crash if talking to products that do support multiple certificates and are in the middle of a rollover. Alerting on and visibility into these certificates are poor. Secure certificate storage (i.e., in a TPM) works better on laptops for securing WiFi than it does for servers processing billions of dollars in transactions.

- The rise of Docker, NuGet, Cargo, and NPM is a security disaster. Incident after incident and warning after warning have been ignored. Why? Because these technologies are soooo very convenient for developers, and they're the ones making the decisions to use these package management platforms. The ops people are then left with the mess. Have you ever tried patching a vulnerable version of .NET Core in a Docker container from a vendor that doesn't even exist any more? Good luck with that.

- Similarly, the plug & play nature of the public cloud with the extensions, plugins, agents, and marketplaces isn't a disaster waiting to happen. It did happen. That's what this SolarWinds thing was all about.

- Pervasive telemetry is largely indistinguishable from data exfiltration. Not only does it pollute the logs, but vendors are deliberately obfuscating the traffic to work around customers blocking it. Did you know that Microsoft used "eu.vortex-win.data.microsft.com"? Did you spot the deliberate typo? It took me a while to realise why telemetry traffic was still getting through despite firewall blocks on "*.microsoft.com"!

- Back doors in products for "support" are another security disaster waiting to happen. Several SAN array vendors for example include these as standard now.

I could go on and on.

I'm watching some of our larger customers struggle with security, and to be honest it feels hopeless. These orgs have 10K+ endpoints, 2K+ servers, running tens of thousands of distinct pieces of software from at least a few hundred, maybe a few thousand orgs.

For every step forward in security, there's two steps back for someone else's convenience.

> The problem in general is that there are conflicting interests between what the developers want to do to reduce their work effort and operational application security.

The most secure work is the work that doesn’t get done.

Most enterprises actually use the typical 3-tier bullshit that Cisco put into their Campus network design for everyone's global operations. And then you have Accenture, McKinsey, Oracle and everyone else advocating for this kind of design. All the different departments poke holes everywhere because otherwise they actually can't work. They can't access compute, they can't access storage. It takes 3 months to upload basic data somewhere.

The whole thing is a cluster fuck. I keep getting hired as a person to help retain talent, but enterprises like to put all their cards on SolarWinds. When they can blame SolarWinds, their managers are still safe. There is no compartmentalization of access. You don't need to have all departments globally be in the same LAN to begin with.

This is the real issue.

We can hire all the "security experts" in the world, but if software and architecture are fatally flawed from the get-go, you can check all the checkboxes, and scan and produce all the pretty reports you want.

The end result is still papier-mâché.

The only way out of this mess I see is throwing everything away and starting from scratch, using provably (machine verifiable proofs) correct software.

I'm thinking of properties like "for any api input the reply is independent of $SECRET".
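As a rough illustration of what such a property says (not a proof, which is the hard part), here is a brute-force check over a tiny input domain. Both handlers and the domains are made up for this example; a machine-checked proof would have to establish the same thing for all inputs, not just the ones enumerated.

```python
def leaky_handler(request: int, secret: int) -> int:
    # The reply depends on the secret's parity: violates the property.
    return request + (secret % 2)


def safe_handler(request: int, secret: int) -> int:
    # The reply is a function of the request only.
    return request * 2


def reply_independent_of_secret(handler, requests, secrets) -> bool:
    """True if, for every request, all secrets yield the same reply."""
    for request in requests:
        replies = {handler(request, secret) for secret in secrets}
        if len(replies) > 1:
            return False  # two secrets were distinguishable from the reply
    return True


print(reply_independent_of_secret(safe_handler, range(8), range(8)))
print(reply_independent_of_secret(leaky_handler, range(8), range(8)))
```

Enumeration like this only works for toy domains, which is exactly why the real thing needs proofs and why it is so expensive.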

If you started today, 50 years from now you may be able to run vim on such a system, but not sooner.

Original Unix was created in a couple of years. While your point has merit, you’re overestimating by much.

Original Unix wasn't formally verified. seL4 took ~8 years from initial design to release (though apparently the verification was done after 3 years?), and it is a code base smaller than TypeScript. CompCert started in 2005 and it still can't compile all of C, and it doesn't go past the level of GCC's -O1 optimizations.

Formal verification is an extreme cost amplifier, and the cost is super-linear with the size of the code base. I don't think my estimate is unrealistic, depending on exactly what you would expect to have running before vi (I was assuming a complete OS including a kernel, device drivers, a C stdlib or equivalent, and a userspace similar to the GNU tools).

I'm always amazed that the top-tier security firms use... Linux. And the Militaries of the world use Windows (like come on). You'd think there would be some crazy million-dollar-per-license impenetrable operating system out there. The race to the bottom is palpable.

The consumer model got where it is because leaving software proven vulnerable in place is worse than the risk of a potential supply chain attack.

True, but it is important to quantify that inadequacy. SolarWinds was actually attacked by an actual threat and their defenses were comically outmatched by that real threat who found real value in attacking them. We are not talking 10%-20% or even 100%, we are talking systems that need to improve by 1,000%-10,000% to provide credible defense against real foes. And this is not just SolarWinds, FireEye, a major cybersecurity company, needs to improve their security by a factor of 100x to protect their customers against people who actually wish to attack them. The security is not merely inadequate, it is inadequate to a degree that is almost mind-boggling. Systems are being deployed that are not even 1% of the necessary level of effectiveness. These organizations are completely incapable of developing and deploying adequate defenses.

This ignores the secondary problem, which is that if the attacks being deployed are 100x stronger than the defenses, how hard is it to develop an attack that is merely 2x stronger than the defenses? If we lazily extrapolate this linearly, that would be 1/50th the resources to develop an attack that still outmatches the defenders. How many people do you think were on the SolarWinds attack? 10, 100, 1000? Even at 1000 that means you would only need 20 full-time people for a year to develop a credible attack against nearly every Fortune 500 company in the world and most of the US government. That should be a terrifying concept. Obviously, this is lazy extrapolation, but it is not so off as to be non-illustrative of this problem.

Given this immense difference between real problems and available solutions, the only reasonable assumption for companies and people is to assume that they are completely defenseless against such actors and likely even largely defenseless against even poorly-resourced attacks as demonstrated time and time again. It is imperative that they act accordingly and only make systems accessible where the harm of guaranteed breach is less than the benefit of making those systems accessible.

I don't understand how you are measuring "security" quantitatively to say "1,000% - 10,000%" and "100x".

But yeah, security is in a terrible spot, throughout the software industry.

It is not clear to me what the steps would be to improve it. It is not clear to me whether these steps would increase software costs (something that is quantitative) by "1,000% - 10,000%", and if so how society as a whole could possibly afford it, in this society where we have put (insecure) software everywhere.

Obviously it would depend on the situation, but in this case we could probably use a metric of something like mean-time-to-discovery which is probably a useful metric for evaluating intrusion detection. Making simplifying assumptions, we can just model this as a probability to discover per unit-time with percentage improvements reducing mean-time-to-discovery. In this case we have ~8 months to discovery or ~240 days.

We then need to determine what we deem adequate for intrusion detection. I contend that a competent intrusion extracts the vast majority of its value within 1 week. This is consistent with the results in the Verizon Data Breach Report from 2019 [1]. Given this, we would logically want to detect an intrusion in less than this time otherwise the vast majority of the damage has already been done. By this metric, 1 week or 7 days versus 240 days would require a 35x or 3,500% improvement in detection by FireEye to be able to detect the intrusion and materially impact the value gained from the intrusion.
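Here is a small sketch of that model, assuming a constant per-day detection probability (so time-to-discovery is exponentially distributed and the mean is one over the rate). The 240-day and 7-day figures are the ones above; the exponential form is a simplifying assumption, not something taken from the report.

```python
import math

OBSERVED_MTTD_DAYS = 240.0  # ~8 months to discovery, figure from above
TARGET_MTTD_DAYS = 7.0      # 1 week, the adequacy threshold argued above


def p_detect_within(days: float, mttd_days: float) -> float:
    """P(detected by `days`) under a constant detection rate of 1/mttd_days per day."""
    return 1.0 - math.exp(-days / mttd_days)


# Required improvement in mean-time-to-discovery.
improvement_factor = OBSERVED_MTTD_DAYS / TARGET_MTTD_DAYS

# Chance of catching the intrusion inside the first week, before and after.
p_now = p_detect_within(7, OBSERVED_MTTD_DAYS)
p_target = p_detect_within(7, TARGET_MTTD_DAYS)

print(f"improvement needed: {improvement_factor:.1f}x")
print(f"P(detect in week 1) at 240-day MTTD: {p_now:.1%}")
print(f"P(detect in week 1) at 7-day MTTD:   {p_target:.1%}")
```

The ratio comes out at about 34x, which I rounded to 35x. Under this model a 240-day mean gives only around a 3% chance of catching the intrusion in the first week, versus roughly 63% at a 7-day mean.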

In addition, this is not materially different from the standard in the industry, as seen in the other graphs on that same page, which show that detection normally takes on the order of months, far longer than the standard for exfiltration, which is days to weeks.

I do not contend that all software everywhere must be made secure against all attackers. However, it must be evaluated against actual expected attacks with a proper understanding as to what it can actually defend against otherwise it is impossible to make an accurate cost-benefit analysis. It is up to the creator/user/people to make an evaluation if the benefits outweigh the risks. That is how you define adequacy.

[1] https://www.nist.gov/system/files/documents/2019/10/16/1-2-d... Page 10

Or, this is a desperate ploy by SolarWinds to cover up and make excuses for their incompetence in the hopes that the market doesn't give their company the death penalty.

Like sure, securing yourself against a government is hard. But when a security researcher told SolarWinds an update server was accessible with the password 'solarwinds123' and they sat on that report for 3 days [1][2], my confidence that they were generally competent is not high.

> Security researcher Vinoth Kumar told Reuters that, last year, he alerted the company that anyone could access SolarWinds’ update server by using the password “solarwinds123” [1]


> When reached for comment by Newsweek, Kumar forwarded his email correspondence with SolarWinds. He first notified the company of the issue on November 19, 2019. SolarWinds' information security team responded a few days later on November 22, 2019. [2]


> Others - including Kyle Hanslovan, the cofounder of Maryland-based cybersecurity company Huntress - noticed that, days after SolarWinds realized their software had been compromised, the malicious updates were still available for download. [1]

Did the company with the above security practices get more important stuff right? Like sure... maybe? But probably not.

[1] https://www.reuters.com/article/global-cyber-solarwinds/hack...

[2] https://www.newsweek.com/solarwinds-update-server-could-acce...

May I propose a corollary: we all suck at security when confronted by a well-resourced adversary.

Very much so. The description of the staged attack suggests months of work by a sizeable development team.

No mention of using an "air gap" in any of the comments here.

Air-gapped networks are a pain to work with but are incredibly effective. I still haven't read anything on how the attackers actually reached the build server though, so it's hard to say if it would have worked in this specific case. I wonder if the lack of a clear statement of how they made the initial breach is due to:

- bad searching on my part

- they still don't know

- they're still trying to mitigate it

- they're still trying to word it in a PR-approved way

An alternative to full air-gapping is gateways that pass only the traffic they're supposed to.

One example: For many years before they were bought by HP, Compaq's mail system had a regular Internet-connected SMTP MTA that fed two back-to-back hardwired uucp MTAs that connected to another SMTP MTA that fed their internal network. This was the early days of the public Internet, and they didn't want to have to trust that their SMTP MTA was secure, so they just made sure nothing but mail could ever get through their mail pipeline.

It's awkward and a bit of a pain to set up and manage, but enterprise companies have the resources to do things like that. And it did actually provide state-of-the-art email security for its time.

(FWIW, I didn't work with Compaq, but compared notes with their network architect on an occasional basis...)

Whenever I read a post as strongly worded as yours, I wonder whether its author really is the most security-minded engineer out there, one who has never compromised on anything for a deadline or missed something because they simply didn't know it should be configured in a particular safe way.

This definitely doesn't look good and there were probably many failures along the way, but jeez.

You do not need to be the best civil engineer in the world to recognize that a new bridge design falling on its first day is a totally inadequate bridge. Similarly, you do not need to be a security expert to recognize that 15 months of active undetected infiltration demonstrates a nearly comical inadequacy. To provide real material mitigation you would likely need to be in the hour to day range at most to actually stop a double-digit percentage of value-weighted exfiltration. This even ignores the cases where they just want to damage your systems which would generally only take minutes.

Now maybe nobody can do that, maybe these systems are the best available, but even if that is true, that does not suddenly make them adequate. Adequacy is an objective evaluation of reaching a standard; if nobody can reach the standard, then nobody is adequate. We would not let somebody use a new untested material in a bridge if it can not support the bridge just because "nobody knows how to make a bridge that works with the new material". And, by any reasonable metric, an inability to recognize active infiltration for months indicates that against a threat actor similar to what attacked them, they are about as adequate as a piece of paper against a gun.

Terrible and dishonest analogy. The very reason this went undetected for 15 months is because the bridge _didn't_ fall down. There were no signs of a "break in" and it's wholly improper to compare a virtual system to a physical entity like that in the first place.

It was a catastrophic failure by the infosec engineers involved. (And by extension, their / our common processes.) Sure - the bridge metaphor doesn't fully land because a collapsed bridge is visible immediately. But the security failure is worse because of how long it took to become visible.

I think the people defending the engineers involved have a mistaken idea of what the responsibility of the security team is. Their job description is not "follow industry best practices" or "look for signs of a break in using their tools". Their job is to keep their company's and customers' data secure. At this job they failed.

I probably would have failed too, so I have some sympathy for everyone involved. There's an open question of how we engineer our systems to make sure this never happens again. But none of that changes the fact on the ground that the security teams involved failed their responsibility to their businesses.

> Their job description is not "follow industry best practices" or "look for signs of a break in using their tools". Their job is to keep their company's and customers' data secure. At this job they failed.

No. Their job is to use the resources they’ve been allocated to manage risk within the organization to the risk level senior management has agreed to accept.

Maybe that shouldn’t be their job, but it is. There isn’t a security team on the planet that gets to dictate security to the rest of the organization or unilaterally make decisions that influence the running of the business to achieve the level of risk mitigation that they themselves would prefer to attain.

That is putting aside the fact that defending against a determined nation-state adversary is a nigh-on impossible task that would require countermeasures like ‘don’t connect to the Internet at all’, ‘don’t hire anyone you haven’t personally known since childhood’, ‘hand-deliver your product to your customers’, and other equally impractical mitigations.

Nobody outside of SolarWinds knows if their security team succeeded or failed in the mission they were given.

I hear what you're saying and I appreciate that perspective. I agree, and there's also something slippery in that argument taken to its extreme that we need to be careful of.

Some people here on HN made the same argument when Equifax leaked personal information about millions of Americans. They said it was ultimately management's fault and not the engineers' fault for not allocating enough resources to security. And the same argument was used by the engineers who made the Therac-25 radiation therapy machine. In that case, software bugs resulted in a handful of deaths due to lethal radiation.

Upper management can't be responsible for everything that happens in their business. Engineering isn't their job or their expertise. That's why they hire software engineers and security engineers - to be the local experts. We need to bear responsibility for the decisions we make in our field. And engineers have a duty not just to the companies we work for, but also to society at large. If we leave our personal judgement at the door in the morning, we fail in our duty to society.

To go back to the bridge metaphor, if a bridge falls down, it's not good enough for the civil engineers involved to blame management for not giving them enough time / budget / whatever. They also bear some responsibility for the disaster. This has been enshrined in case law too, at the Nuremberg trials. "I was just doing my job" wasn't considered a good enough excuse for the guards in WW2 concentration camps. These are big examples, but I think the principle is fractally true.

And the inverse also holds. Praise and blame go together. The biomedical engineers in the labs also deserve praise for the COVID-19 vaccines they've invented, even if upper management told them to do it. We aren't management's slaves.

The difference, when it comes to civil engineers and building bridges, is we as a society have recognized their expertise and made it illegal to build a bridge except as designed by a civil engineer.

A more apt analogy, in my opinion, to the day to day realities of managing production applications and infrastructure is the regulation surrounding the maintenance of certified aircraft. There are minimum competency standards that are enforced by law, it is unlawful in almost all circumstances for a non-certified person to perform any maintenance or repair on a certified aircraft, and, crucially, an aircraft cannot return to service unless a certified mechanic signs off on the repair. Not the CEO of the company that owns the airplane, not some middle manager, only the expert (mechanic and, sometimes, inspector) can sign off on returning the plane to service.

Without that kind of legal cover, management can and will steamroll over anybody who is impeding their initiative of the day.

Sure; but politicians don’t know anything about technology. They usually don’t even decide what’s right and wrong. They take what culture has decided is right and wrong and codify it in law. The law is a trailing, not a leading indicator of ethical practice.

Do you think planes were falling out of the sky left and right before those air safety laws came into effect? No. The engineers at some companies pushed for sane, safe practices first. Later they were adopted by the industry and later still they were enshrined in law. Before those laws were passed, airlines still had a duty of care to their passengers, ethically and (I think) legally.

Likewise it’s up to us to decide what sane, secure software engineering looks like. Not politicians. Not management. It has to be us. Nobody else is qualified to make those choices. At some point those ideas might be codified in law; but we need to figure out what that looks like first. (And to be clear what you’re arguing for - imagine the reverse. Imagine if inventing security best practices was outsourced to politicians!)

The idea that management should feel free to steamroll over their own employees’ judgement for the sake of the initiative of the day is toxic. And that’s exactly the sort of work culture which creates global security issues like this one. Of course a balance has to be reached, but you don’t do anyone any favours by being management’s (and the law’s) highly paid keyboard.

> Do you think planes were falling out of the sky left and right before those air safety laws came into effect? No.

That is exactly what was happening. In 1924, prior to the introduction of the first federal aircraft safety regulations in 1926, there was 1 fatality per 13,500 miles for commercial flights. Between 2000 and 2010, the average was 0.2 fatalities per 10 billion passenger miles.


Thanks for those numbers!

Imagine yourself as an aeronautical engineer around that time. You have a sense of what good safety practices could look like - you’ve been to conferences and talked to your colleagues, and you have some thoughts yourself. But management at your airline doesn’t want to spend the money.

Would you argue for meekly going along with management’s choices, knowing those choices will kill people? I would say, if you did, you would have blood on your hands. We’re people first and employees second.

The stakes are lower and there’s a middle ground here. But you have a voice, and usually more power than you think. The siren song of dumping all responsibility for your actions onto upper management makes you into a victim and a child. It’s bad for society, usually bad for your company in the long term and bad for your psychological health and development. And a disaster for your professional development.

I don’t know if that lands with you, but it’s certainly a lesson I wish I could give to myself over a decade ago.

Keep in mind that although that regulatory certification approach does indeed increase security, it can also greatly constrain innovation. It's most appropriate once systems have matured a bit.

FWIW, there was an average of one steam boiler explosion EVERY WEEK in America (frequently with loss of life) when the ASME was founded to set standards for safe design and certification. So it can take considerable pain before efforts like that take off. The FAA had the advantage of having that kind of certification as an already established model, plus airlines were eager to have a stamp of safety approval.

It's hard to see how a "security certification" standard could really provide much assurance in today's world - witness the inadequacy of FIPS, SOC, the outright laughable HIPAA, etc. PCI is one of the only certs that really provides any kind of assurance, but it's driven by the banks that insist on it being there to protect themselves. And recent events have shown that we have way too much centralized control of electronic payments processing already...

Those regulations wouldn’t be needed if our industry could govern itself and act in healthy, responsible ways.

Unethical data collection leads to regulation, which leads to less innovation in the long term. Fight for ethical behaviour in your company and team and we can, en masse, delay the need for that.

And as for regulation, if it were up to me I’d make EULAs mostly unenforceable. Which would give leave for the people and companies affected by security breaches like this to sue anyone and everyone responsible. Which, by the way, is how the law is designed and how it works in every other facet of life. Sell a faulty ladder that kills someone? Get sued for negligence.

But senior management is responsible for establishing the resource budget and dictating the requirements. Given infinite resources, conceivably an engineering team could make an infinitely secure product. But nobody has that (except maybe the military). Like all other computer problems, this comes down to the constraints and requirements.

> "We would not let somebody use a new untested material in a bridge if it can not support the bridge just because "nobody knows how to make a bridge that works with the new material""

You might be interested to read about the Morandi Bridge collapse in Italy: https://www.engineering.com/story/italys-morandi-bridge-coll...

"When the Morandi Bridge was built, encasing a cable in concrete was innovative.", "the decision to pre-stress it was debatable", the concrete meant they couldn't check for cable rusting underneath, the concrete could have been supplied by the Mafia and under specification, "“We have used materials that are destined to deteriorate quickly, like those of the bridge in Genoa,” Bercich said in a post-collapse analysis".

"Settimo Martinello, director of bridge inspection company 4EMME Service, told CNN that there are “about 15 to 20 bridges collapsing every year” in Italy."

From another source: https://www.theguardian.com/cities/2019/feb/26/what-caused-t...

"In the 1960s little was known about the interaction of materials, or the effects of pollution and climate on corrosion.", "Morandi himself was surprised to see the structure age faster than he had anticipated. In 1979 he issued a report detailing a number of interventions to protect the structure against pollution from nearby factories"

It's not a collapse on its first day, but it undermines your suggestion that bridges are a solved problem, designed once with traditional methods, are well understood inside and out, and therefore never collapse.

Or if that's too old, how about the celebratory Millennium Bridge in London, a footbridge opened in June 2000 and closed two days later because it couldn't withstand the footsteps? https://en.wikipedia.org/wiki/Millennium_Bridge,_London#Open...

The Millennium Bridge was altered because the swaying was "alarming", but there's no particular reason, so far as I know, to think it would have caused structural failure. It's just not useful to have a pedestrian bridge that some fraction of pedestrians won't cross because they're scared; it isn't a theme park attraction.

I think a better analogy is the human analog of what this is - spying.

Not noticing a spy in your company, university, army headquarters for 15 months is hardly an indictment of one's counter-espionage - especially if as it seems they did very very little till Feb, and did it very carefully.

"Seems" is working hard here.

Apart from what the sibling comment mentioned, it's not a great analogy because we have a very limited set of things that can go wrong with a bridge. This is knowledge shared between all engineers - you analyse the design for multiple known forces and you're done.

Compare this to the CI systems designed to take unknown code and run it in a safeish way. In a bridge analogy it would be something like "one of the screws turned out to be a remote controlled drilling device which hollowed out parts of steel without visible changes" - of course nobody would notice that for some time.

...it's not a great analogy because we have a very limited set of things that can go wrong with a bridge.

Security engineering is still evolving.

But it seems reasonable to say that if you engineer a thing that may be subjected to unlimited, unknown stress, you engineered it wrong.

These many large companies gave themselves a single point of failure by installing the SolarWinds binary without knowing the contents of its phoning home and so forth, and that is a terrible security engineering solution. Sure, maybe they "couldn't have done better under the circumstances", but someone allowed the circumstances to happen too.

> But it seems reasonable to say that if you engineer a thing that may be subjected to unlimited, unknown stress, you engineered it wrong.

A computer is that by design. Anyone using computers is using a system subjected to unknown and unlimited inputs, especially when connected to the internet. (And even if it isn't connected, it can be exposed via employees.)

Any controls we have at the moment are "this will not happen (under the assumptions we're currently aware of)".

If those companies didn't run this monitoring solution, they would use a different one. Unless the system is perfectly covered with access policies (none are), the monitoring solution amounts to global access to the enterprise. You can mitigate the impact, shard things, etc. but there are still many cross-cutting concerns which will (suddenly?) turn out to be a single point of failure.

The problem is that Security Engineering is still treated as if it's a separate thing that is evolving.

It's systems engineering, where security is a necessary feature that's being grafted-on after-the-fact.

I think there ought to be a bit of a difference between e.g. an engineer who customizes your local car dealership's WordPress instance and one who works on a semi-security-related product that can become a prime hacking target and is deployed across nearly all Fortune 500 companies.

I think it's fair to say that these problems shouldn't be blamed on individuals, but systemic failure. In my imagination, someone should have seen this at some point. The system should have had safeguards to ensure that, regardless of any individual's personal failures.

That said, I agree it's a little harsh, since there's no evidence that anyone else could/has done better in this incident.

Why shouldn't it be blamed on the person directly responsible? In this case, the VP of Security and/or the CEO?

I'm not sure who would be to blame, but in my case, my former employer is still digging themselves out. 2 years ago, I was shown their SolarWinds console, and I thought it had neat pretty graphs. They asked me to set it up for an additional production database, and I told them how risky this was. As usual, I was deemed the paranoid one, and they said to do it anyway, and being the good worker I am, and NOT a security specialist, I did it. I left there 2 months later for a different job, and TBH this was one of the many reasons I didn't want to work there anymore, and I stated as much in my exit interview.

Now at the end of the day, the IT director signed off on our HIPAA certification. I'm not exactly sure which of the 800+ security controls SolarWinds violates off the top of my head. But for him to sign a document that says "we comply with all of these controls" when it wasn't true makes him LEGALLY culpable. And believe it or not, penalties for negligence under HIPAA are fucking hardcore. I doubt they'll be charged though.

> ... since 9/4/19 before being detected on 12/12/20.

Somewhat tangentially, I can highly recommend ISO 8601 to make life easier for all us non-Americans.
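
For what it's worth, converting the dates quoted above is mechanical once you assume they are month/day/two-digit-year (that ordering is exactly the assumption a non-American reader has to guess at). A minimal Python sketch, where `us_to_iso` is just an illustrative name:

```python
from datetime import datetime


def us_to_iso(us_date: str) -> str:
    """Convert an American m/d/yy date string to ISO 8601 (YYYY-MM-DD)."""
    # Assumption: the input is month/day/two-digit-year, as in the quoted comment.
    return datetime.strptime(us_date, "%m/%d/%y").date().isoformat()


# The three dates mentioned in the thread.
for quoted in ("9/4/19", "12/12/20", "2/20/20"):
    print(quoted, "->", us_to_iso(quoted))
```

The ISO form also sorts correctly as a plain string, which is a large part of why it makes life easier.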


I used to dislike the American format then I realized it's ordered in max(m) < max(d) < max(y) which made me like it more

, it or but how the hell does that make more usable likeable? Interesting

(For your convenience I've sorted that question by len(word))

The format OP used is how we naturally pronounce dates - “September 4th, 2019”. Nobody says “2019 September 4th”.

Which we do you mean?

In Australia and the UK (at least) the standard spoken form would be '4th of September, 2019'.

This aligns with traditional short-date formats in those countries (dd/mm/yy[yy]).

But that just emphasises why adoption of an international standard is needed.

"This is what "top-class" security buys you"

Having lived in SF for 15 years, don't leave things in your car. Just don't. If you do, people will break into it. If you don't people might break into it - but! much less of a target.

Same goes for systems.

Don't keep sensitive data. Don't keep your users private data. Delete your logs, delete your VMs, destroy everything you can.

Their VP of Security posted a blog post entitled "Does your vendor take security seriously?" one month before they announced the breach.

> that would constitute ~200 TB exfiltrated per customer or ~19 years of 1080p video.

Nitpick. That's a very strange metric you're trying to use. What codec/bitrate are you using to determine that amount of space used?

19 years = 599184000 seconds

200TB = 200 * 1000GB * 1000MB * 8bits = 1600000000 bits

roughly 375kbps for the bit rate

That's some crap video quality even for h.265. I for one would not keep a hard drive full of video that poorly encoded.

I'd have been more impressed with how many times the Library of Congress could be stored in that 200TB

Here is my math.

200TB = 200,000,000,000,000 B * 8 bits / B = 1,600,000,000,000,000 bits.

1,600,000,000,000,000 / 599,184,000 = 2,670,298 bits/s = 2.67 Mbps which I think is in the normal range, though maybe a little low. I think I originally computed from 20 MB/min which I just found by searching on the internet randomly.

Unfortunately, there does not appear to be a consistent number on the amount of data held by the Library of Congress, so how about Wikipedia instead? Wikipedia says Wikipedia is 18.9 GB compressed [1], so 200 TB / 18.9 GB = 10582 Wikipedias. Wikipedia says a Wikipedia would be ~2821 Encyclopedia Britannica volumes which are ~1000 pages each, so this would in total constitute ~30,000,000 volumes of Encyclopedia Britannica exfiltrated before discovery. Hardly a sizeable amount of data.

[1] https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
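
If anyone wants to re-run the arithmetic, here is a rough Python sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and 365-day years, and takes the 200 TB, 19 years and 18.9 GB figures from the comments above; the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap days


def average_bitrate_bps(total_bytes: float, years: float) -> float:
    """Average bit rate needed to fill `total_bytes` over `years` of video."""
    total_bits = total_bytes * 8
    return total_bits / (years * SECONDS_PER_YEAR)


total_bytes = 200e12  # 200 TB, decimal units (assumption)
bitrate = average_bitrate_bps(total_bytes, 19)
print(f"{bitrate / 1e6:.2f} Mbps")

# Same volume expressed in compressed Wikipedias of 18.9 GB each.
wikipedias = total_bytes / 18.9e9
print(f"{wikipedias:.0f} Wikipedias")
```

Note the direction of the division: bits go on top and seconds on the bottom, and every unit prefix has to be carried through to get the megabit-per-second figure.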

You forgot KB and mixed up your numerator and denominator.

What are the insurance protections for a security company? As in, there's professional liability available, but what are the consequences for SolarWinds/FireEye (if any) for this kind of breach? Can their clients take any meaningful action?

This is the key excerpt, and it's quite shocking:

    Crowdstrike said Sunspot was written to be able to detect when it was installed on a SolarWinds developer system, and to lie in wait until specific Orion source code files were accessed by developers. This allowed the intruders to “replace source code files during the build process, before compilation,” Crowdstrike wrote.

    The attackers also included safeguards to prevent the backdoor code lines from appearing in Orion software build logs, and checks to ensure that such tampering wouldn’t cause build errors.

    “The design of SUNSPOT suggests [the malware] developers invested a lot of effort to ensure the code was properly inserted and remained undetected, and prioritized operational security to avoid revealing their presence in the build environment to SolarWinds developers,” CrowdStrike wrote.

So how do we guard against this type of attack? How do we know this hasn't already happened to some of us? What is the potential fallout from this hack, it seems quite significant.

This must be why the Japanese intelligence agencies prefer paper over computer systems. The digitization of critical national security apparatus is the Achilles' heel that is being exploited successfully. One example is Japan's intelligence gathering capabilities in East Asia, especially China, which are second to none. Japan has a better linguistic understanding of the Chinese language (Kanji and all), and interestingly much of the PRC's massive public surveillance equipment, like CCTV cameras, is made in Japan.

Even if they hire Krebs, I believe that if it's digital, it can be hacked given a long enough time period, unlimited state-level backing, and a country head-hunting its geniuses to do its bidding. I wonder how the Biden-Harris administration will respond; it is very clear who the state actor is here. I'm very nervous about the ramifications of this hack.

We're all screwed. Predicted long ago.

See _Reflections On Trusting Trust_ [1]

[1] https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...

> So how do we guard against this type of attack?

One big issue with a lot of security and enterprise ops tooling is that it doesn't follow good practice around, well, security. For example, security code analysis software with hard-coded passwords that you hook into your build tooling, or in this case, ops software that instructs you to disable Microsoft's security tools so they don't flag problems with the agent.

In a similar vein, I've had BMC software want the highest levels of access to Oracle DBs just to do simple monitoring, and so on and so forth.

The other observation, which I heard Bruce Schneier make at a hacker con, is more profound, and will probably take a lot longer for national security actors to accept: the norms need to change. There is no longer a clear separation between "our stuff" and "their stuff" the way that there was a few decades ago, when espionage was more on "your telco network" or "my telco network". As we've moved to pervasive connectivity it's no longer possible to say, "oh that backdoor will only be used by our side", or "that hack will only affect their SCADA systems" or whatever.

I think this is the best answer out of all the replies.

>So how do we guard against this type of attack?

Only allow restricted CI servers to build and deploy production code.

A whitelist of software, plus security monitoring, for CI servers so malicious software is harder to install.

A whitelist of software that developer machines can run (i.e., no arbitrary third-party code). Restricted docker containers for all local testing. There's a reason various companies do not give developers admin access to their machines.
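
To make the whitelisting idea concrete, here is a minimal Python sketch that only treats a binary as allowed if its SHA-256 digest is on a pre-approved list. The function names and the idea of keying the list by digest are my assumptions; real endpoint tooling does far more (signatures, publisher rules, kernel enforcement).

```python
import hashlib
from pathlib import Path
from typing import Set


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_allowed(path: Path, allowlist: Set[str]) -> bool:
    """True only if the file's current contents match an approved digest."""
    # Hypothetical policy: anything not explicitly listed is denied.
    return sha256_of(path) in allowlist
```

Keying on the digest rather than the file name means a tampered binary with a familiar name fails the check, though it does nothing if the attacker can also edit the list.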

edit: This or related concepts (SDLC, audit logs reviewed weekly, security team sign offs on system changes, etc.) are in my experience pretty standard in large enterprise security reviews for vendors that were high risk. The issue is that everyone probably lies and doesn't actually do any of these things even if they say they do. The incentives are set up to certify vendors rather than fail them.

> So how do we guard against this type of attack?

You can't ever prevent it, but you can raise the attack cost/complexity and make detection much more likely.

Go for immutable infra and transient CI/CD systems. Provision the build nodes on demand, from trusted images, and wipe them out completely after just a few hours. The attacker will have to re-infect the nodes as they come online, and risk leaving more network traces. Anything they deploy on the systems goes away when the node goes away.

The attack against SolarWinds worked so well because the CI system was persistent and it was upgraded over time. For a reliable and secure environment, the correct amount of host configuration is zero: if you need to modify anything at all after a system has been launched, that's a bad smell. Don't upgrade. Provision a completely new system with the more recent versions.

This kind of architecture requires the attacker to compromise the CI image build and/or storage instead. (Or the upstream, directly.) It's an extra step for the adversary, and a simpler point of control for the defender.

Recon. Compromise. Persist. Exfiltrate. -- As a defender you want to make every step as expensive and brittle as possible. You can't prevent a compromise, but you can make it less useful, and you can arrange things so that it must happen often and leave enough traces to increase the probability of getting caught.
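
As a rough illustration of the "provision from trusted images, wipe after a few hours" policy, here is a Python sketch of the decision a reaper job might make. Everything here is hypothetical: the `BuildNode` record, the four-hour limit and the digest strings are stand-ins, and actually terminating nodes is left to whatever orchestration you use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, List

# Assumed lifetime; the comment above only says "a few hours".
MAX_NODE_AGE = timedelta(hours=4)


@dataclass(frozen=True)
class BuildNode:
    name: str
    image_digest: str      # digest of the image the node was provisioned from
    launched_at: datetime  # timezone-aware launch time


def nodes_to_destroy(
    nodes: List[BuildNode], trusted_images: FrozenSet[str], now: datetime
) -> List[BuildNode]:
    """Select nodes that outlived the limit or came from an untrusted image."""
    doomed = []
    for node in nodes:
        too_old = now - node.launched_at > MAX_NODE_AGE
        untrusted = node.image_digest not in trusted_images
        if too_old or untrusted:
            doomed.append(node)
    return doomed


now = datetime(2021, 1, 15, 12, 0, tzinfo=timezone.utc)
fleet = [
    BuildNode("ci-1", "sha256:aaa", now - timedelta(hours=1)),
    BuildNode("ci-2", "sha256:aaa", now - timedelta(hours=9)),
    BuildNode("ci-3", "sha256:bbb", now - timedelta(minutes=5)),
]
for node in nodes_to_destroy(fleet, frozenset({"sha256:aaa"}), now):
    print("destroy", node.name)
```

The point is that nothing is ever repaired in place: a node is either young and from a known image, or it is gone.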

Besides all of these steps I think it's important to consider every convenience, every dependency, every package you add as another attack vector. This is even more relevant considering the product SolarWinds sold.

Another thing that might have made this hard - if Solar Winds were distributed as source code and each client built it themselves, with their own options (though the old back-doored c-compiler "thought experiment" may not be as much of a thought experiment anymore).

Moreover, achieving the hack was likely costly given the effort and the benefit of the hack appeared once the Solar Winds binary was distributed. You can reduce the benefit of such a hack by not having information critical enterprises all running the same binary blob.

> You can reduce the benefit of such a hack by not having information critical enterprises all running the same binary blob.

That assumes that you actually inspect the source, right?

If SolarWinds was distributed as source to hundreds of companies, maybe many would not bother diffing the source against the previous version, but it seems plausible that at least a few would look at the diffs, especially given that we are talking about corporations who follow deployment procedures.

The build process itself could spit out the diffs at the end, for example.
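
A minimal sketch of what that could look like, using Python's standard `difflib`. The file name and contents are invented for the example; a real build would walk the whole source tree of the previous and current release.

```python
import difflib


def source_diff(old: str, new: str, name: str) -> str:
    """Return a unified diff between two versions of one source file."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"previous/{name}",
        tofile=f"current/{name}",
    )
    return "".join(lines)


# Hypothetical file contents, purely for illustration.
previous = "def refresh():\n    load_inventory()\n"
current = "def refresh():\n    load_inventory()\n    phone_home()\n"
print(source_diff(previous, current, "inventory.py"))
```

An empty diff for a file means nothing changed between releases, so a reviewer only has to read what is actually new.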

well, you still have a 'tragedy of the commons' situation if say 10,000 users are all counting on at least 1 user to inspect the source - but with no commercial incentive to do so, nobody does.

How does a user of open source ever know or measure exactly how well the source code has been scrutinized?

I'm getting this "open source doesn't guarantee there's no problem" claim. And I'd agree open source doesn't guarantee there's no problem. But it seems clear that routes which allow malware to be pushed automatically are a fundamental, guaranteed problem, whereas malware in source code is a potential problem with an obvious answer.

Part of me wonders if they'd have been better with GitHub cloud CI/CD in Actions, with immutable build infra (e.g. Ubuntu base images).

But, with the wild npm dependency community, where a small app can have 5K transitive dependencies, I feel we're going to be even more susceptible to these attacks going forward.
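
To see how quickly that number grows, here is a small Python sketch that walks a dependency graph breadth-first and collects everything reachable from an app. The graph is a made-up toy; for a real project you would have to build it from the lockfile, which this sketch does not attempt.

```python
from collections import deque
from typing import Dict, List, Set


def transitive_dependencies(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Return every package reachable from `root`, excluding `root` itself."""
    seen: Set[str] = set()
    queue = deque(graph.get(root, []))
    while queue:
        package = queue.popleft()
        if package in seen:
            continue  # already visited; also guards against cycles
        seen.add(package)
        queue.extend(graph.get(package, []))
    seen.discard(root)  # a cycle back to the app is not a dependency of itself
    return seen


# Hypothetical graph: two direct dependencies pull in three more.
toy_graph = {
    "app": ["web-framework", "logger"],
    "web-framework": ["router", "left-pad-ish"],
    "logger": ["left-pad-ish", "colors-ish"],
}
print(len(transitive_dependencies(toy_graph, "app")), "packages to trust")
```

Every name in that set is code, written by someone else, that runs with the app's privileges.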

One of the big reasons why some shops refuse to use JavaScript on the server side is the unknown attack surface that is NPM.

It's impossible to keep track of all of these libraries that can be written by anyone. Someone could hijack a github account and push malicious code to NPM.

This reminds me of the Ken Thompson hack: https://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html

> So how do we guard against this type of attack? How do we know this hasn't already happened to some of us? What is the potential fallout from this hack, it seems quite significant.

Verified builds. That means deterministic builds (roughly, from a given git commit the same binaries should result no matter who compiles them. It requires compiler support and sometimes changes to the code) plus trusted build infrastructure.

To verify that you haven't been compromised do a verified build from two independent roots of trust and compare the resulting binaries. Add more roots of trust to reduce the probability that all of them are compromised.
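
The comparison step can be sketched in a few lines of Python. This assumes the builds are already deterministic and that you have the artifact bytes from each independent builder; the builder names and helper functions are placeholders.

```python
import hashlib
from typing import Dict, List


def group_builders_by_digest(builds: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Map each distinct artifact digest to the builders that produced it."""
    groups: Dict[str, List[str]] = {}
    for builder, artifact in builds.items():
        digest = hashlib.sha256(artifact).hexdigest()
        groups.setdefault(digest, []).append(builder)
    return groups


def builds_agree(builds: Dict[str, bytes]) -> bool:
    """True only if at least two builders produced byte-identical artifacts."""
    if len(builds) < 2:
        raise ValueError("need at least two independent builds to compare")
    return len(group_builders_by_digest(builds)) == 1


# Hypothetical artifacts from three administratively separate build farms.
artifact = b"\x7fELF...same bytes..."
results = {"farm-a": artifact, "farm-b": artifact, "farm-c": artifact + b"!"}
print(builds_agree(results))
for digest, builders in group_builders_by_digest(results).items():
    print(digest[:12], builders)
```

A mismatch does not tell you which builder is compromised, only that at least one of them is not producing what the others are, which is the signal you want.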

Establishing a trusted root build environment is tricky because very little software has deterministic builds yet. Once they do it'll be much easier.

Here's my best shot at it:

Get a bunch of fresh openbsd machines. Don't network them together. Add some windows machines if you're planning to use VS.

Pick 3 or more C compilers. Grab the source, verify with pgp on a few machines using a few different clients. For each one, compile it as much as possible with the others. This won't be possible in whole due to some extensions only available in a particular compiler used in its source, but is the best we can do at this point. Build all your compilers with each of these stage-2 compilers. Repeat until you have N-choose-N stage-N compilers. At this point any deterministic builds by a particular compiler (gcc, llvm, VS) should exactly match despite the compilers themselves being compiled in different orders by different compilers. This partially addresses Ken Thompson's paper "reflections on trusting trust" by requiring any persistent compiler backdoors to be mutually compatible across many different compilers otherwise it'll be detected as mismatched output from some compiler build ancestries but not others. Now you have some trusted compiler binaries.

Git repository hashes can be the root of trust for the remaining software. Using a few different git client implementations, verify that all the hashes match on the entire Merkle tree. Build them with trusted compilers of your choice on multiple machines and verify the results match where possible.
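
For anyone wondering what checking those hashes means concretely: git names a blob by the SHA-1 of a short header followed by the file contents, and trees and commits are hashed over the ids of what they contain, which is what makes the whole thing a Merkle tree. A minimal Python sketch for the blob case, assuming a repository that uses the default SHA-1 object format (`git_blob_id` is just an illustrative name):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object id git assigns to a blob with these contents."""
    # Header is the object type, a space, the byte length, then a NUL byte.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b""))  # id of the empty blob
```

Absent content filters such as line-ending conversion, this should agree with what `git hash-object <file>` reports for the same bytes, so an independent implementation can recompute ids without trusting any one client.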

At this point you should have compilers, kernels, and system libraries that are most likely true to the verified source code.

Make a couple build farms and keep them administratively separate. No common passwords, ssh keys, update servers, etc. Make sure builds on both farms match before trusting the binaries.

The good news is that most of this can be done by the open source community; if everyone starts sharing the hashes of their git trees before builds and the hashes of the resulting binaries we could start making a global consensus of what software can currently be built deterministically and out of those which are very likely to be true translations from source code.

EDIT: https://wiki.debian.org/ReproducibleBuilds is Debian's attempt at this.

> starts sharing the hashes of their git trees before builds

Yes, and it'd be equally fine to share the hashes after the build? (Just so I'm not misunderstanding anything)

You're right, I phrased that a bit ambiguously.

To me it sounds like they hacked the editor/code signing tools to insert malicious code on save/commit by devs. Having iron-clad CI toolchains doesn't help you with that. We need to focus on how to defend the devs.

That's the point of a trusted build farm. Devs commit changes to git, and either request a build or the build farm polls for commits and builds the latest commit on trusted hardware+toolchain.

A malicious attack could change the code but it would be detectable because git would preserve the malicious parts in the repo, and further tie a specific malicious binary to a particular commit making it easy to find the malicious code itself.

As long as not all developers are compromised then whoever is doing the code review would see the malicious code when they pull the branch to review it.

> further tie a specific malicious binary to a particular commit

Git uses SHA1 for hashes, right? Aren't there demonstrations that SHA1 hashing is cracked, so that you could, in theory, craft a replacement commit that hashes to the same value?

The developers of git are working on moving git to use SHA2 and have already mitigated some of the concerns around using SHA1: https://git-scm.com/docs/hash-function-transition/

SHA1 hash collisions are hard, especially when the data you can inject needs to look like code to a human and compile correctly. But the concern is valid so it's good that git is improving in this way.

Based on the description, the source code wasn't modified, only the build tooling on the developer machines. As such a CI server wouldn't be impacted.

You have to segment and have monitoring tools monitored by people with a clue.

But that is very expensive to do. The average SaaS or software company does nothing.

> So how do we guard against this type of attack?

Looks like they compromised the editor. If so, then I imagine checking checksums for each component of the toolchain would work. Though if they compromised the filesystem or runtime then that would complicate things. But still, a hash tree or certificate of the OS and toolchain as part of CI would seem to be a good idea in 2021.
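
A hash tree over the toolchain could look roughly like the Python sketch below: hash each component, then hash pairs of hashes upward until a single root remains, and have CI compare that root against a known-good value. The leaf and node encodings here are my own choices for the example, not any standard format, and the component names are invented.

```python
import hashlib
from typing import Dict, List


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def toolchain_root(components: Dict[str, bytes]) -> str:
    """Return a hex Merkle root over named toolchain components."""
    if not components:
        raise ValueError("no components to hash")
    # Leaves bind each name to its contents; sorting makes the order stable.
    level: List[bytes] = [
        _sha256(b"\x00" + name.encode("utf-8") + b"\x00" + content)
        for name, content in sorted(components.items())
    ]
    while len(level) > 1:
        # Hash adjacent pairs; an unpaired last node is carried up unchanged.
        paired = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            paired.append(level[-1])
        level = paired
    return level[0].hex()


# Hypothetical toolchain contents, purely for illustration.
toolchain = {"compiler": b"cc-bytes", "linker": b"ld-bytes", "editor": b"ed-bytes"}
print(toolchain_root(toolchain))
```

As the comment says, this only helps if the thing computing the hashes is itself trustworthy; a compromised filesystem or runtime can lie about the bytes it hands back.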

> So how do we guard against this type of attack?

Don’t give people running windows machines access to your source code/production

Especially if they are domain joined.

Just don’t. Inb4 “but Microsoft has most sophisticated security” crowd, every major hack of the last 15 years always starts with compromised windows box that allows attacker an outpost to move laterally.

Then you should go review major security events.

Attackers have been able to compromise major environments regardless of OS type. Windows, having a larger market share, is clearly going to see more compromises, but environments based on other OSes are just as prone to compromise. There have also been major incidents with standalone applications that are OS-agnostic.

Heartbleed is a great example of an OS-agnostic flaw that was used for major attacks. As noted earlier, while the OS does play a role, it's more than likely just correlation rather than causation due to Microsoft having a dominant position in the market.

Reference: https://en.wikipedia.org/wiki/Heartbleed#Exploitation

Source of a major leak/intrusion as a result of Heartbleed? That was 2014 btw and I remember it clearly - since it was closed pretty quickly I don’t recall anything major coming out of it

Edit: I mean when it was 0day. Obviously if some idiot didn’t patch their servers it shouldn’t count

It's very possible there were several and nobody noticed them, or nobody attributed them correctly.

There are also the Spectre and Meltdown attacks, which exploited Intel CPU pipeline bugs regardless of the OS running on them.

Spectre/meltdown imo is basically same category of security vulnerabilities - big monopoly cutting corners on security for commercial gain

It was not done by a bunch of amateurs, that is for sure. Now everything points to Russia, but those are also the most obvious clues to leave, as who would question them? However... we know the NSA wants to be in every system, and this kind of operational security and evasion screams NSA to me.

Could the NSA just tell the rest of the government to pretend it was Russia?

It's not like any of us would know better.

> Now everything points to Russia

How do we know this is the Russians? To my knowledge it's very common practice to obfuscate origins before launching a campaign like this by washing it through several different countries.

You could leave stuff like comments or references that would suggest it was the Russians, there's just no way of knowing, so I follow the fundamentals of political sabotage: whoever benefits most is the culprit. Who has the most to lose and gain here?

Yeah, no.

The various Russian APTs have tooling they prefer to use and are attributable to them. This is generally fairly stable because these are professionals who spend years learning specific toolchains, programming, and skills, and do not really change it up much, since they don't have to. Even if they're attributed, what is the world going to do? Toss a bomb into Russia?

And before you get started, yes, security professionals are aware that you can obfuscate that, but there are already techniques to defeat this second layer of obfuscation.

If multiple sources are saying this was probably Russia, they probably have a decent bit of proof.

Hmm I hadn't considered that but how do you find out what tools were being used to produce the payload source code? How can you be certain? Could another adversary simply use the same tooling, or could it be shared with an allied nation (enemy of my enemy is a friend) to do its bidding?

These people are incredibly smart. https://link.springer.com/article/10.1186/s42400-020-00048-4 https://www.blackhat.com/html/webcast/07072020-how-attackers...


> They highlight that not only malware samples and their specific properties (such as compiler settings, language settings, certain re-occurring patterns and the like) are useful, but also information available outside the actually attacked infrastructure, including data on the command & control infrastructure.

Yes you can obfuscate certain things, but it's hard to obfuscate EVERYTHING, and if you dig deep enough, you can make a decent effort at finding the owner.

From reading the paper, it seems like it would be difficult for a private hacking group to manage but completely doable for someone like the NSA. They could outfit an entire team to work somewhere else for an extended period of time, making behavior profiles unreliable.

Why would the NSA bother with hacking American companies? The American security establishment is only one warrant or national security letter away from getting all the information they need from any of these companies.

The NSA has plenty of reasons it might want to infiltrate the DHS or other agencies.

Except the CIA and other actors have been known to impersonate the methods of other nation states, so attribution is never the smoking gun you're claiming it to be.

That was certainly my point...every nation state does their best to obfuscate their code and point somewhere else.

This is good in theory, until you yourself get owned by someone else, who now all of a sudden knows exactly what you think is a tell-tale sign of some other actor.

From there on they can modify their payloads to look like they come from another toolchain.

As a citizen, I am shocked and appalled by this backdoor. As a software engineer, I can't help but marvel at the creativity and thoughtfulness put into the exploit.

The average engineer isn't an infosec expert and loves automation, so they found the weakest link in the chain: CI/CD

We're not talking expert-level infosec here. They should have firewalled their CI systems. That's pretty basic stuff.

Yes, but...

Having worked at SolarWinds they're especially susceptible to demands from sales and marketing. "Go faster, ignore tech best practices, etc." It's not unique, but their culture is not a dev-first, or security-first, culture to say the least. Many product managers answer to marketing first, and don't have earnest tech backgrounds that would let them know right from wrong past sales numbers. The culture changed significantly when they went public the first time; it went from a place where devs built good tools... to a place looking to buy products / competitor products so they could charge more for their services. Look at how long it took them to get into cloud tools -- great example of how marketing and sales missed the boat because they were only focused on things they had sold before and not focused on systemic changes to the industry -- because technologists weren't driving.

Anyway, I've worked at a lot of places with better security built into the culture, better tech best practices built into the culture... that's all I'm trying to say. Knowing that attacks like this are out there... and it was just a matter of time before it happened, SolarWinds did next to nothing to avoid it happening to them.

> SolarWinds did next to nothing to avoid it happening to them

And how does one stop this from happening to them? Assume 'standard' security practices are in place, e.g. firewalls.

Clearly the attacker was able to get their payload to be executed by a developer in SolarWinds. This payload clearly didn't trigger antivirus software, and hid itself from the developer and the build systems. How does a typical software dev. team guard against this?

As others have mentioned, building on secured CI/CD infrastructure (+ ideally deterministic builds) would have helped here. But I'm not sure if that's good enough. Presumably if CI/CD was set up, some of the engineers would have had security credentials to configure any build servers anyway. And if the hackers could leverage that, they might have been able to just infect the build servers too.

As annoying as it is, I can't help but think we need to bring parts of the iOS / Android security model to our desktop machines. It's ridiculous that a single rogue npm module can access any file on my computer, including SSH credentials and (probably) secure cookies in my browser's trust store. Our desktop security model is obsolete, and should do more to protect us from threats like this.

When I worked there, about a decade ago, they were aiming for 40-50% year over year growth metrics. Anything that threatened that was dragged out and shot.

If you were a dev who was like, "Hey, we need to test our stuff... do proper code reviews... not outsource quite so much..." you were not treated kindly. QA cycles were definitely rushed to hit launch dates; if Dev was delayed, rather than push the launch the QA window just got shorter. And the QA team was outsourced -- which further reduced their ability to give any push back.

I just remember it was the sort of place where -- again not uncommon -- every password was like "$olarwinds" (making that up, but password reuse was rampant) and passwords were shared in plain text / written on whiteboards, etc. Also a lot of tool / account re-use, and a lot of shared passwords there too -- everyone using the same Google Analytics account sort of thing, and huge walls up around getting more licenses for tools you needed so you'd have to share sign ins with other people. I remember a ticket to "sync passwords" across a bunch of test environments with production... just intentional bad practices left and right; convenience > security at every turn.

At times it felt a bit like a frat house. The sales guys had a big gong they'd ring, the CEO was a cheerleader who would run down the halls yelling at the top of his lungs, "Woo! Woo! Another sale!" type stuff. There was no security training in place... for anyone. No password managers. No security audits as far as I could tell. No 2FA as far as I could tell. Generally speaking any badge worked to open any door. And you'd be met by anger if you ever tried to flag anything like, "Hey, I think the QA team just closed all my tickets without actually testing them..."

Been a while, and like I said, not unique... but they were just sloppy. But back to the question: what can you do? There's not like one magic bullet here. But they need to make security a priority. Do whatever it takes to make it so everyone in the company stops and thinks for 30 seconds about if what they're doing is a good idea from a security standpoint or not.

Here's a good starter pack for you to help your company get their foot in the door for security best practices.


https://www.crowdstrike.com/blog/sunspot-malware-technical-a... is the key link with more technical analysis for those interested, including source code of the implant.

    "If the decryption of the parameters (target file path and replacement source code) is successful and if the MD5 checks pass, SUNSPOT proceeds with the replacement of the source file content. The original source file is copied with a .bk extension (e.g., InventoryManager.bk), to back up the original content. The backdoored source is written to the same filename, but with a .tmp extension (e.g., InventoryManager.tmp), before being moved using MoveFileEx to the original filename (InventoryManager.cs). After these steps, the source file backdoored with SUNBURST will then be compiled as part of the standard process."
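One defensive counterpart to the swap described in that quote is to hash the source tree right before compilation and compare it against a manifest recorded from the reviewed commit. This is a minimal sketch under stated assumptions: the manifest (relative path to SHA-256) is produced and stored somewhere the build host cannot write to, and `find_tampered` is a hypothetical name. It would not help against an implant that also tampers with the check itself or restores the file before the check runs.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 of a (small) source file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_tampered(source_root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose hash differs.

    `manifest` maps relative file paths to the SHA-256 expected from
    the reviewed commit (assumed to come from a trusted location).
    """
    tampered = []
    for rel_path, expected in sorted(manifest.items()):
        candidate = source_root / rel_path
        if not candidate.is_file() or file_sha256(candidate) != expected:
            tampered.append(rel_path)
    return tampered
```

A build step could refuse to compile whenever the returned list is non-empty.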

This, combined with comments from other threads, makes me think that SolarWinds wasn't the real target. They were just means to some specific high-value ends. When low-value companies were using it, the remote control would even remove the backdoor to avoid some accidental discovery.

How, and how much, were the real targets affected?

SolarWinds was absolutely not the real target. The malware wouldn't even execute if the machine it was installed on was joined to a domain containing the string "solarwinds".

I'm surprised this hasn't caused the software industry to completely halt and rewrite/audit all third party libraries and dependencies. The entire software supply chain is highly trust-based, npm especially. Why aren't we seeing the start of a NIH dark age?

NPM is not trusted by major Fortune enterprises. Many of the tech companies I've worked at banned using NPM in prod and instead created their own NPM clone, with source code pinned into a private repo that security audited.

Adding a single NPM library became a total PITA, as it linked to another 100 NPM libraries, which in turn linked to an additional 100+ NPM libraries. So adding a single NPM library to the private repo meant adding 100s to 1000s of other NPM libraries (e.g. left-pad [1]).
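The blow-up is just a transitive closure over the dependency graph. A minimal sketch, assuming the graph has already been extracted into a plain dict; the package names in `toy_graph` are made up for illustration, apart from left-pad as the leaf:

```python
from collections import deque


def transitive_dependencies(package: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every package reachable from `package`, excluding itself."""
    seen: set[str] = set()
    queue = deque(graph.get(package, []))
    while queue:
        dep = queue.popleft()
        if dep in seen or dep == package:
            continue  # already counted, or a cycle back to the root
        seen.add(dep)
        queue.extend(graph.get(dep, []))
    return seen


# Hypothetical graph: one direct request pulls in four packages to audit.
toy_graph = {
    "app-lib": ["a", "b"],
    "a": ["left-pad"],
    "b": ["left-pad", "c"],
    "c": [],
}
print(len(transitive_dependencies("app-lib", toy_graph)))  # 4
```

Every name in the returned set is something the private mirror has to pin and audit, not just the one library that was asked for.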

Personally, I think this was a major reason for node.js not picking up more in the enterprise space. With Python & golang getting more traction.

[1] https://qz.com/646467/how-one-programmer-broke-the-internet-...

Python and Go have similar issues though. People do not audit and do not vendor their dependencies.

At least python and go packages don't execute arbitrary code on developer's workstations at install time. If you pin dependencies to a known safe version, then you're relatively safe.

With NPM, just typing `npm update` can pwn your workstation.

What is the fundamental difference between pip and npm here? `pip install` will run whatever is in the `setup.py` of the package (and recursively do the same with any dependencies). That is essentially arbitrary code execution, no?

This feels more like a numbers issue than a fundamental difference, i.e. Python packages generally have orders of magnitude fewer dependencies than JS ones because the standard library is very extensive. Most dependencies are major libraries rather than small ones, which are much less likely to be compromised.

For Python this is only true for wheel packages. Source packages may execute arbitrary code in setup.py during pip install (compiling C code for example).
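To make the wheel vs. source distinction concrete, here is a rough sketch that classifies a distribution file by its extension. It is a filename heuristic only, the function name is hypothetical, and the example filenames in the usage note are illustrative. A wheel can still contain malicious code that runs on import; the point is only whether the package's build script runs during install.

```python
def runs_build_code_on_install(filename: str) -> bool:
    """Heuristic: True for source distributions, False for wheels.

    Wheels (.whl) are unpacked by the installer, while source archives
    are built first, which is where setup.py gets executed.
    """
    name = filename.lower()
    if name.endswith(".whl"):
        return False
    if name.endswith((".tar.gz", ".tgz", ".tar.bz2", ".zip")):
        return True
    raise ValueError(f"unrecognized distribution file: {filename}")
```

For example, `runs_build_code_on_install("example_pkg-1.0-py3-none-any.whl")` returns `False`, while `runs_build_code_on_install("example_pkg-1.0.tar.gz")` returns `True`. pip also has an `--only-binary :all:` option that tells it not to use source distributions at all.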

I wonder how many enterprises can afford that? I know of one that had a security team that was supposed to approve all dependencies, but it became a joke because it took them forever to approve a package or update, so patches weren't approved in a timely manner. The security workload had the reverse effect of making many builds less secure.

There does seem to be a market opportunity though, for a curated, "blessed" clone of NPM for enterprises to pull from.

> With Python & golang getting more traction.

Does pip not have the same issues?

I feel that the information about the breach didn’t reach a mainstream audience, because of the other things happening (such as the US election and all the chaos generated). In the current climate it’s difficult to pass the message that a massive cyber attack happened and have people take you seriously.

I could see that. Did it have a "cute" name? I think Heartbleed and Shellshock had more airtime but were less damaging IMO.

How many trillions of dollars do you think that would cost? Where's that money going to come from?

I am 99% sure the hackers are among the CUSTOMERS of SolarWinds.

That way they were able to live-test infected SolarWinds distro in their own controlled environment and develop all possible mitigations and techniques - the sheer amount of these evading techniques suggests they were built up over time, and not instantly.

Being a SolarWinds customer and receiving the infected updated versions every time gave them the opportunity to perfect their techniques and hide for so long.

At least that's what I would do if I were a hacker and wanted to persist and be very careful about not getting detected.

At least it would be safe to assume that they had access to several systems which received Orion updates. Given the attacker's dedication to the whole process, I'd say someone else's servers probably ran the tests for them.

... or they hacked a (low-value, low-security, easy) SolarWinds customer first.

Then they hacked SolarWinds. Then they used SolarWinds to hack the real high-value target(s).

Alternatively, it is not unimaginable for a foreign entity to bribe or otherwise compromise a SolarWinds employee.

It is much easier for a highly skilled hacker to get a fake identity and get employed at SW than to compromise an employee, although this is a longer-term play.

> get a fake identity

Wouldn't Google, Apple etc do background checks that discovered fake identities? Or is it not so easy with background checks?

Maybe SolarWinds would be less careful though?

Nation-state spies can get fake identities issued legally by their own government, and these will be confirmed as usual during the background check.

Ok. That's interesting.

I incorrectly assumed they'd try to get a fake US passport/identity.

The scope of this thing is staggering to me.

It must have taken a significant amount of time in prep and dev and then deploy and control.

A serious investment in time and resources, and therefore money.

But why?

In order to burn so many nice tricks they had to be after something quite valuable in one way or another.

What was the mother lode they were after and did they get it? Will we ever know?

Or was this a harvesting of the intelligence and information that were needed for the real gold?

They went wide, which might have been to obscure the real target, or they needed a lot of pieces from different sources

Similar attacks are going on right now, you just won't read about it until the year 202x, if ever. What are you doing about it?

I wonder if something as simple as two-factor authentication for those who have access to the build servers may have helped prevent something like this attack?

At one end of the spectrum I need to make sure people aren't choosing dumb passwords and are applying software updates. On the other end, centralizing control makes a very juicy target for hackers.

From what we've seen until now, this company deserves everything that happened to it. Hope they go under. Ignoring security best practices for the chance of a quick buck.

What best practices were ignored?

From what I'm reading here I'm terrified as I don't think the industry can protect from a targeted attack like this. The only thing we have going for us is defence in depth to make a targeted attack expensive (like this attack was), and that there are sophisticated attackers and very many software shops which means I have a low probability of being the next victim.

Do we know how the SUNSPOT malware got access to the build VMs?

"So how do we guard against this type of attack?"

Don't allow access to your 'secret' source code from the open Internet.

Or actually do that and let everyone verify and build their own binaries. This is pretty much how Linux distros work: the multiple third-party distro packagers packaging upstream code make it pretty hard to sneak in malware unnoticed and almost impossible to affect multiple distros at once.

In comparison, proprietary software companies are a single point of failure where customers can't access the source and have no means to verify what the binaries they get actually contain.

Could and likely already has hit others
