If an attacker has just one day to root around and exfiltrate, they can easily get valuable information. Given 8 months, they have long since taken everything of value and are just waiting for new data to come in. Think how inadequate a system must be to let an attacker sit inside it for 8 months; it is mind-boggling how unqualified these systems are for the problem. And this is not just an indictment of SolarWinds. In case anybody forgets, it was the top-flight security company FireEye that discovered this breach, after realizing they themselves had been breached. A "best of the best" security company took 8 months to realize that they, or any of their customers, had been breached. This is what "top-class" security buys you.
I recall that many companies have switched from a perimeter model of defense, where systems are secured from the outside, to a "defense in depth" model where each system is secured on its own (plus the perimeter).
Perhaps folks should think about tightening the in-depth model and avoiding the consumer model of constant updates from a zillion providers. Or perhaps a single lab could verify updates of the zillion providers rather than leaving them on their own.
Yes, this is the current bleeding edge of cloud security, known as “zero trust.” Amongst other things, it usually involves provisioning mTLS identities in a secured manner to each service, with connections restricted to a whitelisted set of identities.
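The identity-allowlist part of that can be sketched in a few lines. This is a minimal illustration, not a real zero-trust implementation: the service names are hypothetical, the certificate dict mirrors the shape returned by Python's `ssl.getpeercert()`, and real deployments would use SPIFFE/SPIRE-style tooling rather than a hand-rolled check.

```python
# Identities this service is permitted to talk to (hypothetical names).
ALLOWED_PEERS = {"billing.internal", "audit.internal"}

def peer_allowed(cert: dict) -> bool:
    """Check a peer certificate (in ssl.getpeercert() dict form)
    against the allowlist of mTLS identities."""
    sans = cert.get("subjectAltName", ())
    dns_names = {value for kind, value in sans if kind == "DNS"}
    return bool(dns_names & ALLOWED_PEERS)

# A cert presented by the (hypothetical) billing service passes;
# any identity not on the list is refused.
cert = {"subjectAltName": (("DNS", "billing.internal"),)}
print(peer_allowed(cert))  # → True
```

In a real mTLS setup the certificate would already have been cryptographically verified by the TLS handshake; this check only decides whether that verified identity is one we chose to trust.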
I found Evan Gilman and Doug Barth’s “Zero Trust Networks: Building Secure Systems in Untrusted Networks”  a pretty helpful read in understanding what modern/next-gen cloud security looks like.
Some modern implementations of varying depth and scale include SPIRE, Tailscale, and BeyondCorp.
For example, quoting verbatim from the link above:
> [...] Modern OS's, routers, basic apps, etc aren't as secure as software designed in 1960's-1980's. People are defining secure as mitigates some specific things hackers are doing (they'll do something else) instead of properties the systems must maintain in all executions on all inputs. We have tools and development methods to do this but they're just not applied in general. Some still do, like INTEGRITY-178B and Muen Separation Kernel. [...]
There have always been lots of techniques that people claimed as magic bullets for security. They were there in the 1960s-80s, and there still are.
And yet Modern OS's, routers, basic apps are much more secure than the equivalents then.
I mean in the 1970s, there was the Ken Thompson backdoor and no one knew about it until he disclosed it in his ACM Awards acceptance speech.
Nowadays there are provably secure kernels in real-world deployments, and high-assurance kernels in billions more.
Edit: It seems like a serious extension of the zero trust concept would involve something like "only allow source code into the system from sources we trust, and then compile it ourselves". Limit trust to trusted identities, and don't allow binaries in any more than people.
No, "zero trust" means you install it (or anything) but assume that it is malicious, so you take proactive steps to limit its access within your network.
"Zero Trust" means assuming everything (including your compiler) might be compromised, so try to limit the damage any one incident can cause.
But you have to keep in mind - the whole point of zero-trust systems is that even if a malicious third-party blob is present on some nodes, its blast radius is limited by the whitelist of clients that it can communicate with, which makes exfiltrating information particularly difficult unless you have additional exploits that let you move laterally.
And sure, there are things like Linux drivers that are binary blobs in source-code form, but at least that directs one's eyes to such things.
Edit: Also, even in these environments, some people and some software have to run the network itself, like a sysadmin tool. Which makes sysadmin tools something everyone should look at. And brings us to the present hack.
The problem is that network security is a tiny, tiny subset of security in general. You can throw SAML, mTLS, or IPsec at your connections all you like, it would have done exactly nothing to stop the type of attack that hit SolarWinds and their customers.
Let me reiterate this: ZTN would have achieved zero protection.
The problem in general is that there are conflicting interests between what the developers want to do to reduce their work effort and operational application security. Even large vendors are hopelessly bad at producing secure and securable software. The documented and recommended easy path is not the secure one. The products are incredibly hard to secure from the perspective of some poor ops person.
Some random examples:
- Most vendors document their firewall ports, which is great! But not their direction or what roles they apply to.
- Very few vendors provide machine-readable firewall definitions that can be imported into firewall systems automatically.
- Vendors like AWS and Azure that do publish JSON lists of their endpoint addresses and ports use unique, non-standard formats, so very little effort is saved for the end-users.
- Public cloud vendors especially end up "blending" multiple customers into pools of shared IP addresses for all of their PaaS or SaaS services. This is a disaster for security, because it makes it virtually impossible to put a WAF or a WAP in front of certain services.
- mTLS support is a shitshow. Manually uploaded certificates that expire and break everything annually is the norm. CRLs and OCSP are almost never checked. Many products support only a single certificate, and will even crash if talking to products that do support multiple certificates and are in the middle of a rollover. Alerts and visibility of these certificates is poor. Secure certificate storage (i.e.: in a TPM) works better on laptops for securing WiFi than it does for servers processing billions of dollars in transactions.
- The rise of Docker, NuGet, Cargo, and NPM are a security disaster. Incident after incident and warning after warning have been ignored. Why? Because these technologies are soooo very convenient for developers, and they're the ones making the decisions to use these package management platforms. The ops people are then left with the mess. Have you ever tried patching a vulnerable version of .NET Core in a Docker container from a vendor that doesn't even exist any more? Good luck with that.
- Similarly, the plug & play nature of the public cloud with the extensions, plugins, agents, and marketplaces isn't a disaster waiting to happen. It did happen. That's what this SolarWinds thing was all about.
- Pervasive telemetry is largely indistinguishable from data ex-filtration. Not only does it pollute the logs, but vendors are deliberately obfuscating the traffic to work around customers blocking it. Did you know that Microsoft used "eu.vortex-win.data.microsft.com"? Did you spot the deliberate typo? It took me a while to realise why telemetry traffic was still getting through despite firewall blocks on "*.microsoft.com"!
- Back doors in products for "support" are another security disaster waiting to happen. Several SAN array vendors for example include these as standard now.
I could go on and on.
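To illustrate the machine-readable-firewall complaint above: AWS does publish its endpoint ranges as `ip-ranges.json`, but every vendor's schema is different, so ops teams end up writing one-off converters like this sketch. The sample data is abbreviated and the prefixes are illustrative; the real file lives at https://ip-ranges.amazonaws.com/ip-ranges.json.

```python
# Abbreviated sample in the shape of AWS's ip-ranges.json.
sample = {
    "prefixes": [
        {"ip_prefix": "3.5.140.0/22", "region": "ap-northeast-2", "service": "S3"},
        {"ip_prefix": "52.95.110.0/24", "region": "eu-west-1", "service": "EC2"},
    ]
}

def to_rules(doc: dict, service: str) -> list[str]:
    """Turn vendor-published prefixes into plain outbound allow rules.
    Each vendor's schema needs its own converter, which is the problem."""
    return [
        f"allow out to {p['ip_prefix']}  # {service} {p['region']}"
        for p in doc["prefixes"]
        if p["service"] == service
    ]

for rule in to_rules(sample, "S3"):
    print(rule)  # → allow out to 3.5.140.0/22  # S3 ap-northeast-2
```

Note there is still nothing here about direction beyond what we hard-coded, or about which server roles a rule applies to - exactly the gap described above.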
I'm watching some of our larger customers struggle with security, and to be honest it feels hopeless. These orgs have 10K+ endpoints, 2K+ servers, running tens of thousands of distinct pieces of software from at least a few hundred, maybe a few thousand orgs.
For every step forward in security, there are two steps back for someone else's convenience.
The most secure work is the work that doesn’t get done.
The whole thing is a clusterfuck. I keep getting hired to help retain talent, but enterprises like to put all their cards on SolarWinds. When they can blame SolarWinds, their managers are still safe. There is no compartmentalization of access. You don't need all departments, globally, on the same LAN to begin with.
We can hire all the "security experts" in the world, but if the software and architecture are fatally flawed from the get-go, you can check all the checkboxes and scan and produce all the pretty reports you want.
The end result is still papier-mâché.
I'm thinking of properties like "for any api input the reply is independent of $SECRET".
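One small, everyday instance of that property is making comparisons against a secret take the same time regardless of the input, e.g. with Python's `hmac.compare_digest`; a naive `==` short-circuits at the first mismatch and so leaks timing information about `$SECRET`. The token value here is a hypothetical stand-in.

```python
import hmac

SECRET_TOKEN = b"hypothetical-api-token"  # stand-in secret

def token_valid(candidate: bytes) -> bool:
    # hmac.compare_digest runs in time independent of where the first
    # mismatch occurs, so the reply's *timing* doesn't depend on the
    # secret's contents -- a small slice of the property "the reply
    # is independent of $SECRET".
    return hmac.compare_digest(candidate, SECRET_TOKEN)

print(token_valid(b"hypothetical-api-token"))  # → True
print(token_valid(b"wrong-token"))             # → False
```

The full property is much stronger than this (it covers error messages, response sizes, cache effects, and so on), which is why it's so hard to verify for whole systems.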
Formal verification is an extreme cost amplifier, and the cost is super-linear with the size of the code base. I don't think my estimate is unrealistic, depending on exactly what you would expect to have running before vi (I was assuming a complete OS including a kernel, device drivers, a C stdlib or equivalent, and a userspace similar to the GNU tools).
This ignores the secondary problem: if the attacks being deployed are 100x stronger than the defenses, how hard is it to develop an attack that is merely 2x stronger? If we lazily extrapolate linearly, that would be 1/50th the resources to develop an attack that still outmatches the defenders. How many people do you think were on the SolarWinds attack? 10, 100, 1000? Even at 1000, that means you would only need 20 full-time people for a year to develop a credible attack against nearly every Fortune 500 company in the world and most of the US government. That should be a terrifying concept. Obviously this is lazy extrapolation, but it is not so far off as to be non-illustrative of the problem.
Given this immense difference between real problems and available solutions, the only reasonable assumption for companies and people is to assume that they are completely defenseless against such actors and likely even largely defenseless against even poorly-resourced attacks as demonstrated time and time again. It is imperative that they act accordingly and only make systems accessible where the harm of guaranteed breach is less than the benefit of making those systems accessible.
But yeah, security is in a terrible spot, throughout the software industry.
It is not clear to me what the steps would be to improve it. It is not clear to me whether these steps would increase software costs (something that is quantitative) by "1,000% - 10,000%", and if so how society as a whole could possibly afford it, in this society where we have put (insecure) software everywhere.
We then need to determine what we deem adequate for intrusion detection. I contend that a competent intrusion extracts the vast majority of its value within 1 week, which is consistent with the results in the 2019 Verizon Data Breach Investigations Report. Given this, we would logically want to detect an intrusion in less than that time; otherwise the vast majority of the damage has already been done. By this metric, 7 days versus 240 days would require a roughly 34x improvement in detection speed for FireEye to detect the intrusion and materially impact the value gained from it.
In addition, this is not materially different from the industry standard, as seen in the other graphs on that same page, which show that detection normally takes on the order of months - far longer than the standard for exfiltration, which is days to weeks.
I do not contend that all software everywhere must be made secure against all attackers. However, it must be evaluated against the attacks actually expected, with a proper understanding of what it can actually defend against; otherwise it is impossible to make an accurate cost-benefit analysis. It is up to the creator/user/people to evaluate whether the benefits outweigh the risks. That is how you define adequacy.
 https://www.nist.gov/system/files/documents/2019/10/16/1-2-d... Page 10
Like sure, securing yourself against a government is hard. But when a security researcher told SolarWinds that an update server was accessible with the password 'solarwinds123', and they sat on that report for 3 days, my confidence that they were generally competent is not high.
> Security researcher Vinoth Kumar told Reuters that, last year, he alerted the company that anyone could access SolarWinds’ update server by using the password “solarwinds123” 
> When reached for comment by Newsweek, Kumar forwarded his email correspondence with SolarWinds. He first notified the company of the issue on November 19, 2019. SolarWinds' information security team responded a few days later on November 22, 2019. 
> Others - including Kyle Hanslovan, the cofounder of Maryland-based cybersecurity company Huntress - noticed that, days after SolarWinds realized their software had been compromised, the malicious updates were still available for download. 
Did the company with the above security practices get more important stuff right? Like sure... maybe? But probably not.
Air-gapped networks are a pain to work with but are incredibly effective. I still haven't read anything on how the attackers actually reached the build server though, so it's hard to say if it would have worked in this specific case. I wonder if the lack of a clear statement of how they made the initial breach is due to
- bad searching on my part
- they still don't know
- they're still trying to mitigate it
- they're still trying to word it in a PR-approved way
One example: For many years before they were bought by HP, Compaq's mail system had a regular Internet-connected SMTP MTA that fed two back-to-back hardwired uucp MTAs that connected to another SMTP MTA that fed their internal network. This was the early days of the public Internet, and they didn't want to have to trust that their SMTP MTA was secure, so they just made sure nothing but mail could ever get through their mail pipeline.
It's awkward and a bit of a pain to set up and manage, but enterprise companies have the resources to do things like that. And it did actually provide state-of-the-art email security for its time.
(FWIW, I didn't work with Compaq, but compared notes with their network architect on an occasional basis...)
This definitely doesn't look good and there's probably many failures along the way, but jeez.
Now maybe nobody can do that, maybe these systems are the best available, but even if that is true, it does not suddenly make them adequate. Adequacy is an objective evaluation against a standard; if nobody can reach the standard, then nobody is adequate. We would not let somebody use a new untested material in a bridge if it cannot support the bridge just because "nobody knows how to make a bridge that works with the new material". And, by any reasonable metric, an inability to recognize active infiltration for months indicates that, against a threat actor similar to what attacked them, they are about as adequate as a piece of paper against a gun.
I think the people defending the engineers involved have a mistaken idea of what the security team's responsibility is. Their job description is not "follow industry best practices" or "look for signs of a break-in using their tools". Their job is to keep their company's and customers' data secure. At this job, they failed.
I probably would have failed too, so I have some sympathy for everyone involved. There's an open question of how we engineer our systems to make sure this never happens again. But none of that changes the fact on the ground that the security teams involved failed their responsibility to their businesses.
No. Their job is to use the resources they’ve been allocated to manage risk within the organization to the risk level senior management has agreed to accept.
Maybe that shouldn’t be their job, but it is. There isn’t a security team on the planet that gets to dictate security to the rest of the organization or unilaterally make decisions that influence the running of the business to achieve the level of risk mitigation that they themselves would prefer to attain.
That is putting aside the fact that defending against a determined nation-state adversary is a nigh-on impossible task that would require countermeasures like ‘don’t connect to the Internet at all’, ‘don’t hire anyone you haven’t personally known since childhood’, ‘hand-deliver your product to your customers’, and other equally impractical mitigations.
Nobody outside of Solarwinds knows if their security team succeeded or failed in the mission they were given.
Some people here on HN made the same argument when Equifax leaked personal information about millions of Americans. They said it was ultimately management's fault, and not the engineers' fault, for not allocating enough resources to security. And the same argument was used by the engineers behind the Therac-25 radiation therapy machine. In that case, software bugs resulted in a handful of deaths from lethal radiation doses.
Upper management can't be responsible for everything that happens in their business. Engineering isn't their job or their expertise. That's why they hire software engineers and security engineers - to be the local experts. We need to bear responsibility for the decisions we make in our field. And engineers have a duty not just to the companies we work for, but also to society at large. If we leave our personal judgement at the door in the morning, we fail in our duty to society.
To go back to the bridge metaphor, if a bridge falls down, it's not good enough for the civil engineers involved to blame management for not giving them enough time / budget / whatever. They also bear some responsibility for the disaster. This principle has been enshrined in law too, at the Nuremberg trials: "I was just doing my job" wasn't considered a good enough excuse for the guards in WW2 concentration camps. These are big examples, but I think the principle is fractally true.
And the inverse also holds. Praise and blame go together. The biomedical engineers in the labs also deserve praise for the covid19 vaccines they've invented, even if upper management told them to do it. We aren't management's slaves.
A more apt analogy, in my opinion, to the day to day realities of managing production applications and infrastructure is the regulation surrounding the maintenance of certified aircraft. There are minimum competency standards that are enforced by law, it is unlawful in almost all circumstances for a non-certified person to perform any maintenance or repair on a certified aircraft, and, crucially, an aircraft cannot return to service unless a certified mechanic signs off on the repair. Not the CEO of the company that owns the airplane, not some middle manager, only the expert (mechanic and, sometimes, inspector) can sign off on returning the plane to service.
Without that kind of legal cover, management can and will steamroll over anybody who is impeding their initiative of the day.
Do you think planes were falling out of the sky left and right before those air safety laws came into effect? No. The engineers at some companies pushed for sane, safe practices first. Later they were adopted by the industry and later still they were enshrined in law. Before those laws were passed, airlines still had a duty of care to their passengers, ethically and (I think) legally.
Likewise it’s up to us to decide what sane, secure software engineering looks like. Not politicians. Not management. It has to be us. Nobody else is qualified to make those choices. At some point those ideas might be codified in law; but we need to figure out what that looks like first. (And to be clear what you’re arguing for - imagine the reverse. Imagine if inventing security best practices was outsourced to politicians!)
The idea that management should feel free to steamroll over their own employees’ judgement for the sake of the initiative of the day is toxic. And that’s exactly the sort of work culture which creates global security issues like this one. Of course a balance has to be reached, but you don’t do anyone any favours by being management (and the law’s) highly paid keyboard.
That is exactly what was happening. In 1924, prior to the introduction of the first federal aircraft safety regulations in 1926, there was 1 fatality per 13,500 miles for commercial flights. Between 2000 and 2010, the average was 0.2 fatalities per 10 billion passenger miles.
Imagine yourself as an aeronautical engineer around that time. You have a sense of what good safety practices could look like - you’ve been to conferences and talked to your colleagues, and you have some thoughts yourself. But management at your airline doesn’t want to spend the money.
Would you argue for meekly going along with management’s choices, knowing those choices will kill people? I would say, if you did, you would have blood on your hands. We’re people first and employees second.
The stakes are lower and there’s a middle ground here. But you have a voice, and usually more power than you think. The siren song of dumping all responsibility for your actions onto upper management makes you into a victim and a child. It’s bad for society, usually bad for your company in the long term and bad for your psychological health and development. And a disaster for your professional development.
I don’t know if that lands with you, but it’s certainly a lesson I wish I could give to myself over a decade ago.
FWIW, there was an average of one steam boiler explosion EVERY WEEK in America (frequently with loss of life) when the ASME was founded to set standards for safe design and certification. So it can take considerable pain before efforts like this take off. The FAA had the advantage that this kind of certification was an already-established model, plus airlines were eager to have a stamp of safety approval.
It's hard to see how a "security certification" standard could really provide much assurance in today's world - witness the inadequacy of FIPS, SOC, the outright laughable HIPAA, etc. PCI is one of the only certs that really provides any kind of assurance, but it's driven by the banks that insist on it being there to protect themselves. And recent events have shown that we have way too much centralized control of electronic payments processing already...
Unethical data collection leads to regulation, which leads to less innovation in the long term. Fight for ethical behaviour in your company and team and we can, en masse, delay the need for that.
And as for regulation, if it were up to me I’d make EULAs mostly unenforceable. Which would give leave for the people and companies affected by security breaches like this to sue anyone and everyone responsible. Which, by the way, is how the law is designed and how it works in every other facet of life. Sell a faulty ladder that kills someone? Get sued for negligence.
You might be interested to read about the Morandi Bridge collapse in Italy: https://www.engineering.com/story/italys-morandi-bridge-coll...
"When the Morandi Bridge was built, encasing a cable in concrete was innovative.", "the decision to pre-stress it was debatable", the concrete meant they couldn't check for cable rusting underneath, the concrete could have been supplied by the Mafia and under specification, "“We have used materials that are destined to deteriorate quickly, like those of the bridge in Genoa,” Bercich said in a post-collapse analysis".
"Settimo Martinello, director of bridge inspection company 4EMME Service, told CNN that there are “about 15 to 20 bridges collapsing every year” in Italy."
From another source: https://www.theguardian.com/cities/2019/feb/26/what-caused-t...
"In the 1960s little was known about the interaction of materials, or the effects of pollution and climate on corrosion.", "Morandi himself was surprised to see the structure age faster than he had anticipated. In 1979 he issued a report detailing a number of interventions to protect the structure against pollution from nearby factories"
It's not a collapse on its first day, but it undermines your suggestion that bridges are a solved problem, designed once with traditional methods, are well understood inside and out, and therefore never collapse.
Or if that's too old, how about the celebratory Millennium Bridge in London, a footbridge opened in June 2000 and closed two days later because it couldn't withstand the footsteps? https://en.wikipedia.org/wiki/Millennium_Bridge,_London#Open...
Not noticing a spy in your company, university, or army headquarters for 15 months is hardly an indictment of one's counter-espionage - especially if, as it seems, they did very, very little till February, and did it very carefully.
Compare this to the CI systems designed to take unknown code and run it in a safeish way. In a bridge analogy it would be something like "one of the screws turned out to be a remote controlled drilling device which hollowed out parts of steel without visible changes" - of course nobody would notice that for some time.
Security engineering is still evolving.
But it seems reasonable to say that if you engineer a thing that may be subjected to unlimited, unknown stress, you engineered it wrong.
So many large companies giving themselves a single point of failure by installing the SolarWinds binary, without knowing what it contains or phones home, is a terrible security engineering outcome. Sure, maybe they "couldn't have done better under the circumstances", but someone allowed those circumstances to happen too.
A computer is that by design. Anyone using computers is using a system subjected to unknown and unlimited inputs, especially when connected to the internet. (And even when it isn't, it can be exposed via employees.)
Any controls we have at the moment are "this will not happen (under the assumptions we're currently aware of)".
If those companies didn't run this monitoring solution, they would use a different one. Unless the system is perfectly covered with access policies (none are), the monitoring solution is a global access to enterprises. You can mitigate the impact, shard things, etc. but there are still many cross-cutting concerns which will (suddenly?) turn out to be a single point of failure.
It's systems engineering, where security is a necessary feature that's being grafted-on after-the-fact.
That said, I agree it's a little harsh, since there's no evidence that anyone else could/has done better in this incident.
Now, at the end of the day, the IT director signed off on our HIPAA certification. I'm not exactly sure which of the 800+ security controls SolarWinds violates off the top of my head. But for him to sign a document that says "we comply with all of these controls" when it wasn't true makes him LEGALLY culpable. And believe it or not, penalties for negligence under HIPAA are fucking hardcore. I doubt they'll be charged, though.
Somewhat tangentially, I can highly recommend ISO 8601 to make life easier for all us non-Americans.
(For your convenience I've sorted that question by len(word))
In Australia and the UK (at least) the standard spoken form would be '4th of September, 2019'.
This aligns with traditional short-date formats in those countries (dd/mm/yy[yy]).
But that just emphasises why adoption of an international standard is needed.
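For reference, ISO 8601 (big-endian year-month-day, which also sorts lexicographically) is what most standard libraries emit by default; in Python, for example:

```python
from datetime import date, datetime

# ISO 8601: YYYY-MM-DD, unambiguous across locales and
# sortable as a plain string.
print(date(2019, 9, 4).isoformat())              # → 2019-09-04
print(datetime(2019, 9, 4, 17, 30).isoformat())  # → 2019-09-04T17:30:00
```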
Having lived in SF for 15 years: don't leave things in your car. Just don't. If you do, people will break into it. If you don't, people might still break into it - but you're much less of a target.
Same goes for systems.
Don't keep sensitive data. Don't keep your users private data. Delete your logs, delete your VMs, destroy everything you can.
Nitpick: that's a very strange metric you're trying to use. What codec/bitrate are you using to determine that amount of space used?
19 years = 599184000 seconds
200TB = 200 * 1000GB * 1000MB * 8bits = 1600000000 bits
roughly 375kbps for the bit rate
That's some crap video quality even for h.265. I for one would not keep a hard drive full of video that poorly encoded.
I'd have been more impressed with how many times the Library of Congress could be stored in that 200TB
200TB = 200,000,000,000,000 B * 8 bits / B = 1,600,000,000,000,000 bits.
1,600,000,000,000,000 / 599,184,000 = 2,670,298 bits/s = 2.67 Mbps which I think is in the normal range, though maybe a little low. I think I originally computed from 20 MB/min which I just found by searching on the internet randomly.
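The back-of-envelope above checks out (using 365-day years, which is what the 599,184,000-second figure implies):

```python
SECONDS = 19 * 365 * 24 * 3600     # 19 years = 599,184,000 s
BITS = 200 * 10**12 * 8            # 200 TB expressed in bits

rate_bps = BITS / SECONDS          # sustained exfiltration rate
print(f"{rate_bps / 1e6:.2f} Mbps")  # → 2.67 Mbps
```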
Unfortunately, there does not appear to be a consistent number on the amount of data held by the Library of Congress, so how about Wikipedia instead? Wikipedia says Wikipedia is 18.9 GB compressed , so 200 TB / 18.9 GB = 10582 Wikipedias. Wikipedia says a Wikipedia would be ~2821 Encyclopedia Britannica volumes which are ~1000 pages each, so this would in total constitute ~30,000,000 volumes of Encyclopedia Britannica exfiltrated before discovery. Hardly a sizeable amount of data.
Crowdstrike said Sunspot was written to be able to detect when it was installed on a SolarWinds developer system, and to lie in wait until specific Orion source code files were accessed by developers. This allowed the intruders to “replace source code files during the build process, before compilation,” Crowdstrike wrote.
The attackers also included safeguards to prevent the backdoor code lines from appearing in Orion software build logs, and checks to ensure that such tampering wouldn’t cause build errors.
“The design of SUNSPOT suggests [the malware] developers invested a lot of effort to ensure the code was properly inserted and remained undetected, and prioritized operational security to avoid revealing their presence in the build environment to SolarWinds developers,” CrowdStrike wrote.
This must be why the Japanese intelligence agencies prefer paper over computer systems. The digitization of critical national security apparatus is the Achilles heel being exploited so successfully. One example is Japan's intelligence gathering capability in East Asia, especially China, which is second to none. Japan not only has a better linguistic understanding of the Chinese language (Kanji and all), but, interestingly, much of the PRC's massive public surveillance equipment, like CCTV cameras, is made in Japan.
Even if they hire Krebs, I believe that if it's digital, it can be hacked, given a long enough time period, unlimited state-level backing, and the headhunting of a country's geniuses to do its bidding. I wonder how the Biden-Harris administration will respond; it is very clear who the state actor is here. I'm very nervous about the ramifications of this hack.
See _Reflections On Trusting Trust_ 
One big issue with a lot of security and enterprise ops tooling is that it doesn't follow good practice around, well, security. For example, security code analysis software with hard-coded passwords that you hook into your build tooling, or in this case, ops software that instructs you to disable Microsoft's security tools so they don't flag problems with the agent.
In a similar vein, I've had BMC software demand the highest levels of access to Oracle DBs to do simple monitoring, and so on and so forth.
The other observation I heard Bruce Schneier make at a hacker con is more profound, and probably going to take a lot longer for national security actors to accept is this: the norms need to change. There is no longer a clear separation between "our stuff" and "their stuff" the way that there was a few decades ago, when espionage was more on "your telco network" or "my telco network". As we've moved to pervasive connectivity it's no longer possible to say, "oh that backdoor will only be used by our side", or "that hack will only affect their SCADA systems" or whatever.
Only allow restricted CI servers to build and deploy production code.
A whitelist of approved software, plus security monitoring, on CI servers so malicious software is harder to install.
A whitelist of software that developer machines can run (i.e.: no arbitrary third-party code). Restricted Docker containers for all local testing. There's a reason various companies do not give developers admin access to their machines.
edit: This or related concepts (SDLC, audit logs reviewed weekly, security team sign offs on system changes, etc.) are in my experience pretty standard in large enterprise security reviews for vendors that were high risk. The issue is that everyone probably lies and doesn't actually do any of these things even if they say they do. The incentives are set up to certify vendors rather than fail them.
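A crude version of the software-allowlist idea is to gate installation on a set of known-good artifact digests. This is only a sketch: the "artifact" bytes and the allowlist entry here are toy data, and a real system would also need signed, audited updates to the allowlist itself.

```python
import hashlib

# Hypothetical allowlist of approved artifact digests
# (in practice populated from a signed, reviewed source).
approved = {hashlib.sha256(b"build-tool-1.2.3").hexdigest()}

def may_install(artifact: bytes) -> bool:
    """Allow only artifacts whose SHA-256 is on the approved list."""
    return hashlib.sha256(artifact).hexdigest() in approved

print(may_install(b"build-tool-1.2.3"))   # → True
print(may_install(b"tampered-artifact"))  # → False
```

Note this would not have caught SolarWinds on its own - the poisoned build was the vendor's official, signed artifact - but it does block opportunistic substitution of binaries on CI and developer machines.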
You can't ever prevent it, but you can raise the attack cost/complexity and make detection much more likely.
Go for immutable infra and transient CI/CD systems. Provision the build nodes on demand, from trusted images, and wipe them out completely after just a few hours. The attacker will have to re-infect the nodes as they come online, and risk leaving more network traces. Anything they deploy on the systems goes away when the node goes away.
The attack against SolarWinds worked so well because the CI system was persistent and it was upgraded over time. For a reliable and secure environment, the correct amount of host configuration is zero: if you need to modify anything at all after a system has been launched, that's a bad smell. Don't upgrade. Provision a completely new system with the more recent versions.
This kind of architecture requires the attacker to compromise the CI image build and/or storage instead. (Or the upstream, directly.) It's an extra step for adversary, and a simpler point of control to the defender.
Recon. Compromise. Persist. Exfiltrate. -- As a defender you want to make every step as expensive and brittle as possible. You can't prevent a compromise, but you can make it less useful, and you can arrange things so that it must happen often and leave enough traces to increase the probability of getting caught.
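The transient-node policy described above comes down to a simple rule: no build node lives past its TTL. A sketch, where the four-hour lifetime and the `should_recycle` helper are illustrative assumptions rather than any particular product's behavior:

```python
import time

# Assumed policy: build nodes live at most four hours (value is illustrative).
MAX_NODE_LIFETIME_S = 4 * 3600

def should_recycle(launched_at, now=None):
    """Transient CI/CD: any build node older than the TTL gets destroyed
    and re-provisioned from the trusted image, so an attacker must
    re-infect nodes as they come online and risks leaving fresh traces."""
    if now is None:
        now = time.time()
    return now - launched_at >= MAX_NODE_LIFETIME_S
```

The orchestrator would run this check on a schedule and terminate any node for which it returns true, never patching nodes in place.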
Moreover, pulling off the hack was likely costly given the effort involved, and the benefit only appeared once the SolarWinds binary was distributed. You can reduce the payoff of such a hack by not having every information-critical enterprise running the same binary blob.
That assumes that you actually inspect the source, right?
The build process itself could spit out the diffs at the end, for example.
How does a user of open source ever know or measure exactly how well the source code has been scrutinized?
But, with the wild npm dependency community, where a small app can have 5K transitive dependencies, I feel we're going to be even more susceptible to these attacks going forward.
It's impossible to keep track of all of these libraries, which can be written by anyone. Someone could hijack a GitHub account and push malicious code to npm.
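One modest mitigation is to at least refuse lockfile entries that can't be verified. Here's a sketch that scans an npm lockfile (v2-style `packages` map) for dependencies missing an `integrity` hash; the `missing_integrity` helper is an assumption of mine, not part of npm's tooling:

```python
import json

def missing_integrity(lock_json):
    """List entries in an npm lockfile (v2-style 'packages' map) that
    lack an 'integrity' hash and so can't be verified against the
    registry. The root entry (empty key) is skipped."""
    lock = json.loads(lock_json)
    return sorted(
        name for name, meta in lock.get("packages", {}).items()
        if name and "integrity" not in meta
    )
```

This doesn't catch a maliciously published version, of course; it only ensures that what you install matches what was published.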
Verified builds. That means deterministic builds (roughly, from a given git commit the same binaries should result no matter who compiles them. It requires compiler support and sometimes changes to the code) plus trusted build infrastructure.
To verify that you haven't been compromised do a verified build from two independent roots of trust and compare the resulting binaries. Add more roots of trust to reduce the probability that all of them are compromised.
Establishing a trusted root build environment is tricky because very little software has deterministic builds yet. Once they do it'll be much easier.
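The comparison step above is mechanically simple once builds are deterministic: hash the same artifact as produced by each independent root of trust and require the digests to agree. A minimal sketch (the `verify_reproducible` helper is illustrative):

```python
import hashlib

def verify_reproducible(artifacts):
    """Compare the same artifact as built by independent roots of trust.
    `artifacts` maps builder name -> binary bytes; all builds must be
    bit-identical for the artifact to be trusted."""
    digests = {b: hashlib.sha256(data).hexdigest() for b, data in artifacts.items()}
    return len(set(digests.values())) == 1, digests
```

Adding more independent builders to the `artifacts` map is exactly the "add more roots of trust" step: all of them would have to be compromised identically to slip a backdoor past the comparison.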
Here's my best shot at it:
Get a bunch of fresh OpenBSD machines. Don't network them together. Add some Windows machines if you're planning to use VS.
Pick 3 or more C compilers. Grab the source and verify it with PGP on a few machines, using a few different clients. Compile each compiler with the others as far as possible. This won't be entirely possible, since some compilers' source uses extensions only available in one particular compiler, but it's the best we can do at this point. Then build all your compilers with each of these stage-2 compilers, and repeat until you have stage-N compilers built along every possible ancestry. At this point any deterministic build by a particular compiler (GCC, LLVM, VS) should match exactly, despite the compilers themselves having been compiled in different orders by different compilers. This partially addresses Ken Thompson's "Reflections on Trusting Trust" by requiring any persistent compiler backdoor to be mutually compatible across many different compilers; otherwise it will be detected as mismatched output from some compiler build ancestries but not others. Now you have some trusted compiler binaries.
Git repository hashes can be the root of trust for the remaining software. Using a few different git client implementations, verify that all the hashes across the entire Merkle tree match. Build the software with trusted compilers of your choice on multiple machines and verify that the results match where possible.
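As a concrete illustration of why repository hashes form a Merkle tree: git names every object by a hash over a typed header plus the content, so a blob's ID can be recomputed independently of any git client. A minimal sketch that reproduces what `git hash-object` prints for a blob:

```python
import hashlib

def git_blob_hash(content):
    """Recompute a git blob's object ID: SHA-1 over the header
    b'blob <size>\\0' followed by the raw bytes. Tree and commit
    objects hash their children's IDs the same way, which is what
    chains the whole repository into a Merkle tree."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The empty blob has a well-known ID:
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because commit IDs transitively cover every tree and blob beneath them, independent clients agreeing on one commit hash means they agree on the entire source tree.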
At this point you should have compilers, kernels, and system libraries that are most likely true to the verified source code.
Make a couple build farms and keep them administratively separate. No common passwords, ssh keys, update servers, etc. Make sure builds on both farms match before trusting the binaries.
The good news is that most of this can be done by the open source community; if everyone starts sharing the hashes of their git trees before builds and the hashes of the resulting binaries we could start making a global consensus of what software can currently be built deterministically and out of those which are very likely to be true translations from source code.
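Cross-checking such shared hashes is a small consensus problem: collect each builder's reported digest for the same commit and flag the outliers. A sketch under the assumption that a simple majority is the arbiter (the `build_consensus` helper is hypothetical):

```python
from collections import Counter

def build_consensus(reports):
    """`reports` maps builder -> reported binary hash. Returns the
    majority hash plus the builders that disagree with it, flagging
    build environments worth investigating."""
    counts = Counter(reports.values())
    majority = counts.most_common(1)[0][0]
    dissenters = sorted(b for b, h in reports.items() if h != majority)
    return majority, dissenters
```

A dissenting builder doesn't prove compromise, since non-determinism in the build produces the same symptom, but it pinpoints exactly which environment and which artifact to inspect.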
EDIT: https://wiki.debian.org/ReproducibleBuilds is Debian's attempt at this.
Yes, and it'd be equally fine to share the hashes after the build? (Just so I'm not misunderstanding anything.)
A malicious attack could change the code but it would be detectable because git would preserve the malicious parts in the repo, and further tie a specific malicious binary to a particular commit making it easy to find the malicious code itself.
As long as not all developers are compromised then whoever is doing the code review would see the malicious code when they pull the branch to review it.
Git uses SHA-1 for hashes, right? Aren't there demonstrations that SHA-1 is broken, so in theory you could craft a replacement commit that hashes to the same value?
SHA-1 hash collisions are hard to produce, especially when the data you inject needs to look like code to a human and still compile correctly. But the concern is valid, so it's good that git is migrating to SHA-256 (and modern git already uses a hardened, collision-detecting SHA-1).
But that is very expensive to do. The average SaaS or software company does nothing.
Looks like they compromised the editor. If so, then I imagine checking checksums for each component of the toolchain would work. Though if they compromised the filesystem or runtime then that would complicate things. But still, a hash tree or certificate of the OS and toolchain as part of CI would seem to be a good idea in 2021.
Don't give people running Windows machines access to your source code or production.
Attackers have been able to compromise major environments regardless of OS. Windows, having a larger market share, is clearly going to see more compromises, but other OS-based environments are just as prone to compromise. There have also been major incidents with standalone applications that are OS-agnostic.
Heartbleed is a great example of an OS-agnostic flaw that was used for major attacks. As noted earlier, while the OS does play a role, it's more than likely just correlation rather than causation, due to Microsoft's dominant position in the market.
Edit: I mean when it was a 0-day. Obviously if some idiot didn't patch their servers it shouldn't count.
There are also the Spectre and Meltdown attacks, which exploited Intel CPU pipeline bugs regardless of the OS running on them.
It's not like any of us would know better.
How do we know this was the Russians? To my knowledge it's very common practice to obfuscate origins before launching a campaign like this by washing through several different countries.
You could leave stuff like comments or references that would suggest it was the Russians, there's just no way of knowing, so I follow the fundamentals of political sabotage: whoever benefits most is the culprit. Who has the most to lose and gain here?
The various Russian APTs have tooling they prefer to use and are attributable to them. This is generally fairly stable because these are professionals who spend years learning specific toolchains, programming, and skills, and do not really change it up much, since they don't have to. Even if they're attributed, what is the world going to do? Toss a bomb into Russia?
And before you get started, yes, security professionals are aware that you can obfuscate that, but there are already techniques to defeat this second layer of obfuscation.
If multiple sources are saying this was probably Russia, they probably have a decent bit of proof.
> They highlight that not only malware samples and their specific properties (such as compiler settings, language settings, certain re-occurring patterns and the like) are useful, but also information available outside the actually attacked infrastructure, including data on the command & control infrastructure.
Yes, you can obfuscate certain things, but it's hard to obfuscate EVERYTHING, and if you dig deep enough you can make a decent effort at finding the owner.
From there on they can modify their payloads to look like they come from another toolchain.
Having worked at SolarWinds they're especially susceptible to demands from sales and marketing. "Go faster, ignore tech best practices, etc." It's not unique, but their culture is not a dev-first, or security-first, culture to say the least. Many product managers answer to marketing first, and don't have earnest tech backgrounds that would let them know right from wrong past sales numbers. The culture changed significantly when they went public the first time; it went from a place where devs built good tools... to a place looking to buy products / competitor products so they could charge more for their services. Look at how long it took them to get into cloud tools -- great example of how marketing and sales missed the boat because they were only focused on things they had sold before and not focused on systemic changes to the industry -- because technologists weren't driving.
Anyway, I've worked at a lot of places with better security built into the culture, and better tech best practices built into the culture... that's all I'm trying to say. Knowing that attacks like this are out there, and that it was just a matter of time before one happened, SolarWinds did next to nothing to avoid it happening to them.
And how does one stop this from happening to them? Assume 'standard' security practices are in place, e.g. firewalls.
Clearly the attacker was able to get their payload to be executed by a developer in SolarWinds. This payload clearly didn't trigger antivirus software, and hid itself from the developer and the build systems. How does a typical software dev. team guard against this?
As annoying as it is, I can't help but think we need to bring parts of ios / android's security model to our desktop machines. Its ridiculous that a single rogue npm module can access any file on my computer, including SSH credentials and (probably) secure cookies in my browser's trust store. Our desktop security model is obsolete, and should do more to protect us from threats like this.
If you were a dev who was like, "Hey, we need to test our stuff... do proper code reviews... not outsource quite so much..." you were not treated kindly. QA cycles were definitely rushed to hit launch dates; if Dev was delayed, rather than push the launch the QA window just got shorter. And the QA team was outsourced -- which further reduced their ability to give any push back.
I just remember it was the sort of place where -- again, not uncommon -- every password was like "$olarwinds" (making that up, but password reuse was rampant) and passwords were shared in plain text / written on whiteboards, etc. Also a lot of tool / account re-use, and a lot of shared passwords there too -- everyone using the same Google Analytics account, that sort of thing, and huge walls up around getting more licenses for tools you needed, so you'd have to share sign-ins with other people. I remember a ticket to "sync passwords" across a bunch of test environments with production... just intentional bad practices left and right; convenience > security at every turn.
At times it felt a bit like a frat house. The sales guys had a big gong they'd ring, the CEO was a cheerleader who would run down the halls yelling at the top of his lungs, "Woo! Woo! Another sale!" type stuff. There was no security training in place... for anyone. No password managers. No security audits as far as I could tell. No 2FA as far as I could tell. Generally speaking any badge worked to open any door. And you'd be met by anger if you ever tried to flag anything like, "Hey, I think the QA team just closed all my tickets without actually testing them..."
Been a while, and like I said, not unique... but they were just sloppy. But back to the question: what can you do? There's not like one magic bullet here. But they need to make security a priority. Do whatever it takes to make it so everyone in the company stops and thinks for 30 seconds about if what they're doing is a good idea from a security standpoint or not.
Here's a good starter pack for you to help your company get their foot in the door for security best practices.
"If the decryption of the parameters (target file path and replacement source code) is successful and if the MD5 checks pass, SUNSPOT proceeds with the replacement of the source file content. The original source file is copied with a .bk extension (e.g., InventoryManager.bk), to back up the original content. The backdoored source is written to the same filename, but with a .tmp extension (e.g., InventoryManager.tmp), before being moved using MoveFileEx to the original filename (InventoryManager.cs). After these steps, the source file backdoored with SUNBURST will then be compiled as part of the standard process."
How, and how much, were the real targets affected?
Adding a single npm library became a total PITA, as it linked to another 100 npm libraries, which in turn linked to an additional 100+ other npm libraries. So adding a single npm library to the private repo meant adding hundreds to thousands of other npm libraries. (E.g. left-pad.)
Personally, I think this was a major reason node.js didn't pick up more in the enterprise space, with Python and Go getting more traction instead.
With NPM, just typing `npm update` can pwn your workstation.
This feels more like a numbers issue than a fundamental difference, i.e. Python packages generally have orders of magnitude fewer dependencies than JS ones because the standard library is very extensive. Most dependencies are major libraries rather than small ones, which are much less likely to be compromised.
There does seem to be a market opportunity though, for a curated, "blessed" clone of NPM for enterprises to pull from.
Does pip not have the same issues?
Being a SolarWinds customer themselves and receiving the infected updated versions every time gave them the opportunity to perfect their techniques and hide for so long. That way they were able to live-test the infected SolarWinds distro in their own controlled environment and develop all possible mitigations and evasion techniques -- the sheer number of these evasion techniques suggests they were built up over time, not all at once. At least that's what I would do if I were a hacker who wanted to persist and be very careful about not getting detected.
Then they hacked SolarWinds. Then they used SolarWinds to hack the real high-value target(s).
Wouldn't Google, Apple etc do background checks that discovered fake identities? Or is it not so easy with background checks?
Maybe SolarWinds would be less careful though?
I incorrectly assumed they'd try to get a fake US passport/identity.
It must have taken a significant amount of time in prep and dev, and then in deployment and control -- a serious investment in time and resources, and so in money. To burn so many nice tricks they had to be after something quite valuable in one way or another. What was the motherlode they were after, and did they get it? Will we ever know? Or was this just harvesting the intelligence and information they needed for the real gold? They went wide, which might have been to obscure the real target, or because they needed a lot of pieces from different sources.
From what I'm reading here I'm terrified as I don't think the industry can protect from a targeted attack like this. The only thing we have going for us is defence in depth to make a targeted attack expensive (like this attack was), and that there are sophisticated attackers and very many software shops which means I have a low probability of being the next victim.
Don't allow access to your 'secret' source code from the open Internet.
In comparison, proprietary software companies are a single point of failure, where customers can't access the source and have no means to verify what the binaries they get actually contain.