You'd like to think you can just restore the system from backup and it'll just light back up. But how do you test this without cratering your existing system? Like a boat in a basement, many systems are built in situ and can be very rigid.
Modern environments like cloud computing and creation scripts can mitigate this a bit organically, but how many of these systems are just a tower running Windows w/SQL Server and who knows what else? Plus whatever client software is on the client machines.
How do you test that in isolation?
At least read the media to see if it can be read (who doesn't love watching a backup tape fail halfway through the restore).
Simply put, it takes a lot of engineering to make a system that can be reliably restored, much less on a frequent basis. And this is all engineering that doesn't advance the actual project -- getting features to users and empowering the business. Which can make the task even more difficult.
This modern attitude of compartmentalization, extremely localized expertise, and outsourcing everything to the cloud is going to be our downfall. Stop hiring muggles!
(My point is that incompetence doesn't make it morally ok that a criminal thing happens to someone)
2. There's a risk/reward component. Nobody likes to buy insurance. Resource constrained organizations will almost always choose to invest their resources to get MORE resources, not protect against the chance that something bad will happen. A rational organization should only invest in protection when the risk is so great that it's likely to interfere with its primary business (beyond their legal/moral obligation to protect information they're trusted with).
2a. If a $2m ransomware attack hits your organization every 5 years, and it would cost you $1m/year in talent & resources to harden against this, you SHOULD let it happen because it's cheaper. Just patch the vulnerability each time it happens and try to stretch the next ransomware attack to more than 5 years away.
3. Of course there are many irrational organizations that don't protect against ransomware for irrational reasons (e.g. due to internal politics). There's not much to say here except that at some level of management (including the CEO & board), people are not paying attention to what's happening, and they should go hire those hardcore nerds and pay them what they need.
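The 2a trade-off above is just an annualized-cost comparison. A minimal sketch, using only the illustrative numbers from that comment:

```python
# Back-of-the-envelope version of the 2a trade-off.
# All numbers come from the comment above and are purely illustrative.
attack_cost = 2_000_000          # expected loss per ransomware incident
attack_interval_years = 5        # one incident roughly every 5 years
hardening_cost_per_year = 1_000_000

annualized_loss = attack_cost / attack_interval_years  # $400k/year
print(f"Annualized expected loss: ${annualized_loss:,.0f}")
print(f"Annual hardening cost:    ${hardening_cost_per_year:,.0f}")
if hardening_cost_per_year < annualized_loss:
    print("Hardening pays off")
else:
    print("Eating the attacks is cheaper (on these numbers)")
```

On these numbers the expected loss is $400k/year versus $1m/year of hardening, which is the whole argument; it flips as soon as attacks get more frequent or more expensive.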
This is a good point, but it points to something some might not like. Resource-constrained warehouses might skimp on covering the risk of fire, resource-constrained restaurants might skimp on sanitation, resource-constrained power companies (PG&E) might skimp on line maintenance and let whole towns burn to the ground (Paradise, 80+ people; Berry Creek, 30+; etc.), and so forth (up to every company being too "resource constrained" to pay to stop global warming). In cases of this sort, you have companies risking both their capital and the life and limb of average people.
We really do have companies following this resource-constrained logic, and horrible things have happened and are happening. Economists describe this dynamic in terms of "externalities", and letting it run rampant has pretty literally set the world on fire (and drowned it under water, etc.).
Sometimes I think as an industry we hyped too much and over-delivered, giving the impression that it was no big deal.
Spending a lot of money to fix something that is known will happen is easy to justify. If you don't do it, you will end up losing significant amounts of money.
For something that isn't guaranteed to happen, it's harder to justify putting a lot of money into it. What is the value of doing thorough testing of backups if they're never actually accessed? One could argue that it's purely a waste of money.
A counterargument would be just any of the frequent news articles about company X having to pay Y millions or lost such and such money.
We have a social contract where businesses just need to put forth some minimum effort (door locks, alarm) and police and the military do the rest.
We need to enforce this social contract online and against global criminals now. If the criminals are in a state that harbors them, use 10x retaliation to that state to give them an incentive to fix the problem. Dictators don't respect anything but force.
Attitudes about it are much better now than they have ever been, but there are still a lot of folks out there that don't prioritize it.
Guess what all of these have in common? The exact thing that security has; eliminating long-term risk. Don't blame the stock holders because your arguments didn't persuade management...
Those externalities are historically managed by regulation and laws, so the rules are equal for every company: you are not, in theory, competing with companies that don't protect themselves against fire, because protecting against it is mandatory (in many countries).
Maybe we should start to make companies responsible for not somehow mitigating the risk of being attacked. It makes even more sense when customer privacy is concerned. I know GDPR started in that direction though.
But in practice you can be. Companies can get away with not following regulations for a long time. This isn't all too uncommon in the restaurant business.
Regulations need to be cheap and easy to follow. Otherwise they'll be skirted here and there. Yes, businesses will get in trouble for not following them, but if enforcement isn't on top of most of the violations then you'll just create a system where everyone ends up skirting the rules.
If security became a mandatory thing, single actors will of course be slow to invest, but the industry by itself will adapt by easing the adaptation to regulation.
For example, in fire safety, it's now harder to build a non-fireproof building, because nobody will build one since building techniques now fully integrate this issue.
Your local web agency may not put much more effort into backing up its data because of regulation, but you can imagine that its hosting provider would be able to offer a cheap and friendly backup solution thanks to this becoming mandatory.
It’s just my thoughts.
And in any case the threat surface is way different. If I hold up the local Best Buy at gunpoint I'm not walking out with their entire customer roll. But if I hack their POS, there's a pretty good chance that I am.
In places where there is sufficient law enforcement, this is not required but even then some extra-high-risk shops like banks still do. In places where there is not sufficient law enforcement, like failed states, favelas and the internet, almost all shops should hire guards (or their digital equivalent) or risk being robbed. In this particular case, the long term fix is to extend the rule of law onto the internet but until that time having good security is not optional.
No, that's just brain cancer that bank board directors can opt for.
You simply skipped over customer data being lost. Oh, insurance gives you the money, court settlements will be paid, it's cheap! For you.
And by not having that "talent & resources" you stretch your own business beyond sanity. Like that "new telecom" breed that doesn't own any cables or antennas. Just marketing, HR, and finance. And a call center lacking any competence (because with no infra, what could they possibly do? :>). And then they outsource their finances. And they go down or sell themselves when a zephyr blows...
Your thinking style promotes an eggshell type of business. But it's cheap!
1. Yeah, if you don't lock your store you will get robbed and be held liable for the loss; your insurance won't pay out because you didn't sufficiently protect your liabilities.
2. Businesses buy insurance and secure their equipment because they are held liable for this, the cost is included in the cost of the product.
2a. If a ransomware attack hits your organisation, it will destroy your reputation, because customers will notice that you don't take their protection seriously; they will leave for a more expensive but reliable product that takes their business and clients seriously and actively works to protect them.
I get your arguments; they just don't make much sense from the perspective of a business and its liabilities.
In many cases, you cannot really switch.
Ransomware just hit ticket machines on a UK railway. I do not think that this will make people who regularly take that railway seek alternatives.
The question is - how do you know that it is going to hit you every 5 years?
This isn't a completely random event. If you build up a reputation of being a soft target, other hackers will try to dip their beaks, too. And there are a lot of them out there.
Paying even one Danegeld attracts more Vikings to your shore.
Brick and mortar stores don't need hardcore commandos. The recent surge in ransomware is actually a good thing as people slowly start to care about the obvious: that if they build their business on something that has a weak link, breaking this link will compromise their business, so it's their job to make sure it never happens.
This means asking awkward questions of the people who are in charge of your IT infrastructure, whether in-house or outsourced. What happens if this computer room is set on fire? What is our strategy for dealing with ransomware attacks? How long will it take to rebuild the systems after they are compromised? These are valid questions to ask as a business owner, and if you don't know the answers to them, you are to blame when the worst happens.
There is negligence, however. If you leave the door wide open and unsupervised and something gets stolen your insurance company will be much less understanding. That does not say anything about how theft is "morally okay", just that negligence is not okay.
Then take that and allocate a reasonable chunk of that growth (say 15%) to ensuring that can continue, through ongoing investment in IT. Their alternative is to abandon the internet, go back to paper, and dump their IT costs (and boost to business).
Unfortunately though, as long as there are bean counters trying to cut every outgoing and outsource every cost, and play accounting bingo to turn everything into monthly opex, it's unlikely to change.
Most companies need skilled technical people because the sheer aggregation of risk in a few outsourced providers (see Kaseya recently) shows that they won't be top priority when something hits the fan. If they want to be top priority, they need the people on payroll and on site. Not everyone needs a group of rockstar 10x-types, but we do definitely need better fundamental IT knowledge and ability to solve problems. And business needs to make clear this is what's needed in terms of job adverts and compensation - supply tends to rapidly learn to deliver what's valued... If you can convince bean counters to pay anything for it, that is...
Companies for whom data loss doesn't impact the public should be able to screw around however they want.
I'm not sure exactly who the officers of a company are (genuinely) but if we're talking about the decision makers - the board and CEOs and whoever - not the employees, then most of them aren't going to take risks that make them genuinely likely to go to jail. Creating that genuine risk is probably the only way of manipulating their behavior, especially when they haven't got any real skin in the game (e.g. a CEO paid a few million and some irrelevant stock holdings in the company that make their net worth high, but which are safe to lose).
The main issue is that such people are often well enough connected that they can spin a story that it's completely unreasonable to hold them responsible for their decisions. Personal responsibility is something for poor people. Someone will find a way to make the law that says legalese for "If personal data from a company is stolen because they chose to interpret the risks in a way that unreasonably exaggerated the mitigation costs and downplayed the restoration costs, or because they failed to consider the risks to the personal data they ought to have known they were collecting, then the CEO should go to prison for eight to eighteen months" mean "If the CEO personally steals personal data from a company and sells it to foreign agents, then they get twenty days in a luxury resort -- but if they just use it to increase the cost of customers' insurance, then they get saddled with a new high-paying job, poor fellow".
The common message I hear about security is "it's not part of our core business". Safety was made (at least in some countries) to be part of your core business, as an unavoidable obligation. Nobody can use lack of information, capability, skill or awareness as a get-out for poor safety practices - you just have to do better. If we had the same with security, it might get the attention of the board and its members.
I also think doing a disaster recovery exercise every few months is highly valuable. You might think you know how everything works, and that you've covered everything, but remove permission from staging for everyone on your team and have them build it from scratch, and you'll figure out all the things you forgot about that silently work in the background. (Last time we did this, we realized we didn't back up our sealed-secret master keys -- they get auto-rotated out from under us. So we had to provision new passwords and recreate the sealed secrets to recover from a disaster. Now we back those up ;) )
(A corollary: if you've had really good uptime in the last year, your customers probably think that you offer 100% uptime. But you don't, and they're going to be mad when you have an actual outage that the SLO covers. So it might be worth breaking them for 5 minutes a quarter or something so that they know that downtime is a possibility.)
One more point that I want to make... sometimes the cost isn't worth it. If you're a startup, you live or die by making the right product at the right time. If your attention is focused on continuously testing your backups, your time is taken away from that core challenge. A hack where you don't have a backup is likely to kill your company, but so is having a bad product. So, like everything, it's a tradeoff.
I don't want to hire the "hardcore nerds" that gatekeep expertise behind ridiculous rates and tribal knowledge, fwiw. I'd much rather pay for services, incrementally adoptable technology and clear roadmaps.
Seriously? Most of the Archlinux/Linux (where nerds congregate) automation and customization stuff is open source. Lots of people sharing their configs, and you can copy from them.
Also, why is it not okay for nerds to command high rates? Doctors do it. Lawyers do it. Politicians do it. And most of them suck at their job, by the way. So if a "hacker" could deliver on his promise, I think paying him a high white-collar rate is quite fair.
Yep! So tell me again why I should employ what the previous poster referred to as "hardcore nerds"?
> Also, why is it not okay for nerds to command high rates? Doctors do it. Lawyers do it. Politicians do it. And most of them suck at their job, by the way. So if a "hacker" could deliver on his promise, I think paying him a high white-collar rate is quite fair.
I'm responding almost entirely to the idea that "hardcore nerds" are the people to employ, and "muggles" are not. Paying for expertise is great! But really, as with all technology, I want to pay for people to make things easier for those around them. It strikes me as painfully obvious that, if you're crying out to employ "hardcore nerds" because "muggles" can't handle the work, you're also probably not the type to employ at all, either way. This is just my take, however.
At the other end, Fortune 500 companies have unique needs which require significant expertise. At that level, trying to outsource most stuff is perfectly reasonable, but they still need internal talent to avoid being ripped off or suffering major disasters.
Pay people that have a skillset to do a job. But note that, for most orgs, the job _isn't_ "Do a job and be a dick about it", which is what we've been talking about. Being successful in a role means working with others for a common goal. That's literally what companies are (or, I should say, were meant to be).
Remember, many extremely gifted athletes never make it onto a top college team, let alone the NBA etc. Similarly, someone can be unusually intelligent and well educated, but that alone isn't enough. Many people talk a good game, but being the best isn't about having a large ego; it's about actually solving problems and improving things.
I don't care if you think I'm woke or not. What does that matter?
> In other trades these things are virtues. Growing up they were the name of the game in computing too.
Spoken like a person that has never even spent a moment of their life around a tradesperson. Find me a trade that doesn't have some form of legitimized apprenticeship. Find me a trade that doesn't have some form of certification process. Certified solutions to common problems. When necessary, permits, approvals, etc.
Software is nothing like that. "Hardcore nerds" build random unmaintainable workflows and then put other people in tough spots (note: not specific to "hardcore nerds").
I know a lot of folks who work every day in the trades and to a person (don't worry, 'tomc1985, this isn't me being woke, I'm acquainted with women in the trades too) the best folks I know have no stake in gatekeeping. They're excited to bring in new folks to the profession--granted, it helps that they're cheaper to start out, but the folks I've talked to know that eventually they'd like to retire and people are still gonna need drains snaked--and liberal with teaching what they know. My electrician walked me, as a "muggle", through my house's wiring until he was confident that I would be able to understand what my house was doing and explain it to somebody else.
The guy who acts like 'tomc1985 describes is the guy who does not get called back for a second job. In tech, we used to call them BOFHs and make fun of them while they thought we thought they were cool. Now we just don't hire them.
Yep, this is a very deeply ingrained part of trades. Apprenticeship is a legitimate portion of one's career, and as such people further along usually have an appreciation for apprentices and apprenticeship. I've never met a legitimate (meaning, person that was trained appropriately and continued to work in their profession) tradesperson that kept secrets.
> The guy who acts like 'tomc1985 describes is the guy who does not get called back for a second job. In tech, we used to call them BOFHs and make fun of them while they thought we thought they were cool. Now we just don't hire them.
Without getting hyper-political, I fear this is the same problem as the "incel" topic of late. Getting rejected and then doubling down on a persona, rather than introspecting on how or why a portion of that rejection was warranted, I think causes huge huge problems. I think the "hardcore nerd" persona is similar, as "hardcore nerds" usually don't make it very far. If you go through FAANG companies, you won't find a ton of people that fit the bill. Maybe some kinda-weirdos, but the higher up you go the more comfortable people are going to be educating and communicating (except maybe Amazon). Do you think it's valuable, when shit's super important and hard to do, to condescend to those around you? To pad your ego? I don't really think so.
You aren't hiring the right ones!
And frankly, I wish software engineering had a credible certification body and apprenticeship system. We are in dire need of one! At least then I could have a lot more confidence in whoever walks through the door looking for work.
Yes hardcore nerds make messes but they also build and run some of the most beautiful code that ever existed. Not to say that normal folks can't, but you can't paint hardcore nerds as all bad -- even with me as an example, because I have a bone to pick with tech and attitudes like yours and I'm not afraid to express it strongly.
So is your argument that it's not just "hardcore nerds" capable of successfully building the systems in this thread? Because that's been my entire point.
> And frankly I wish software engineering had a credible certification body and apprenticeship system. We are in dire need of one! At least then I could have a lot more confidence when whoever walks through the door looking for work.
> Yes hardcore nerds make messes but they also build and run some of the most beautiful code that ever existed.
The same can be true of those that would never identify as a "hardcore nerd". Almost as though the way one carries themselves is unrelated to their ability.
> Not to say that normal folks can't, but you can't paint hardcore nerds as all bad --
I paint "hardcore nerds" that condescend to people they call "muggles" as all bad, because 80% of getting things done is working with other people, and that behavior explicitly calls out an ability to work productively with others.
> even with me as an example, because I have a bone to pick with tech and attitudes like yours and I'm not afraid to express it strongly.
The persona you carry around is a great way to have all of your points dismissed indiscriminately. Expressing yourself strongly is great! Being an asshole, less so.
My point is that we are not closed to outsiders. Anyone can become a 'hardcore nerd'; the essence of meritocracy is merit, and there just aren't a lot of shortcuts there. If you're willing to put in the time to learn the mastery so you can step with the elite, then welcome. Otherwise, GTFO and stop trying to take our jobs.
'Elitism', 'gatekeeping'... all that, just sounds like sour grapes to me. Those folks can come back and try again after they've leveled up more. Otherwise, they can go back to whatever else it is they do.
> I paint "hardcore nerds" that condescend to people they call "muggles" as all bad, because 80% of getting things done is working with other people, and that behavior explicitly calls out an ability to work productively with others.
And I paint people that look down at nerds as bad. And "getting things done" used to require a hell of a lot less interpersonal action, but nowadays skillsets and business seem to trend towards codependency, not independence. Part of my issues with tech today.
> The persona you carry around is a great way to have all of your points dismissed indiscriminately. Expressing yourself strongly is great! Being an asshole, less so.
I swear to god I am sick and tired of you folks talking down to me about this stuff like you're better. You are not the first and certainly not the last. Allow me to be an elitist asshole, my message has clearly resonated with you (after all you did say you were walking away from this thread like 3 or 4 replies ago) so I'm not entirely sure what point you are getting at! Bad press is still press, as it were.
It's a shame that society has turned so hostile to the specialized operators of the world. Do you think the same thing of the Marines, who straight-up advertise that they won't allow just anybody? What about doctors? Hell if I was a plumber, or an electrician, or an auto repair guy I would want to make damn sure that someone else trying to enter my space and potentially compete with me is at the very least competent. And newcomers are great! Up until they try and reshape the world to be easier for them and worse for the incumbents.
The way I see it, the nerds built this shit and tech is our house. No matter how hard everyone else is trying to muscle in on it because tech is what's hot, we were here first and this is our territory. Not my fault every business idiot and their mother is throwing money at us because what we've built is so much better. Everyone is free to run a business however they see fit, but if you want to do it in or with tech you gotta pay the fee. Or not, if you're willing to build the mastery to work around that (but then guess what, now you're a nerd too!) It's the same way everything else works in this cursed world we're stuck in.
I stopped responding to a different thread in which your responses were less subtle trolling and more obvious, surface level trolling.
> I swear to god I am sick and tired of you folks talking down to me about this stuff like you're better. You are not the first and certainly not the last.
Intentionally "having an attitude" is met pretty poorly, pretty frequently I'd bet. Sounds like a delivery problem.
> The way I see it, the nerds built this shit and tech is our house.
Eh, not really? FAANG dominates tech now. FAANG employs most of the best technical minds outside of prestigious universities, like it or not.
"Look down on nerds"? Are we in high school? Nobody cares, dude.
> nowadays skillsets and business seem to trend towards codependency, not independence
They always did and always have, for thousands of years. It's not new.
> I swear to god I am sick and tired of you folks talking down to me about this stuff like you're better.
Not tearing off on a spittle-flecked, unhinged rant because somebody says that calling people "muggles" is unproductive and career-limiting and just kinda not a good way to operate would be a good start towards being better.
People will give back what you put in.
Edit: it wasn't really feedback. Closer to a complaint.
You employ who can get the job done. I don't really care what they are called.
I answered you on the "gatekeeping" part. If there is something I like about this community (IT) is that there is way less gatekeeping than any other profession.
Because the referred-to "hardcore nerds" are the only ones who actually bother to figure out which piece of config to copy where?
Most of what most people think is hardcore nerdiness is just this: https://xkcd.com/627/
This has proven, over and over, to make things worse. Important levers get removed because Joe Sixpack can't look at more than a label and a few buttons without freaking out. I'm sick of watching wonderful technology decay to uselessness or annoyance because some rent-seeking wantrepreneur thought he could build a business off expanding the audience for everything.
That was the whole point of the previous post. It is already cheap and easy and there is no gatekeeping, ridiculous rates or tribal knowledge. Just hire anyone mildly knowledgeable and do more than nothing. In 2021 there is no excuse anymore.
All your vitriol is just based on your projections of your misunderstanding of the phrase "hardcore nerd".
I guess you never worked with lawyers (the top ones) or surgeons.
Easy to whine online, harder to raise money, hire the "hardcore nerds" and ship something!
Sorry, all those people had to get jobs at AWS, Digital Ocean, Heroku, etc. years ago if they didn't want to be Puppet jockeys for the next mumble years. Frankly I wouldn't be surprised if the shared-hosting "cloud" companies didn't actively push DevOps as a way of reconfiguring their hiring landscape.
Put a cost to this; then perhaps your argument will be believable, or you will see it's not a viable option.
This is what companies like Accenture try to do. They are not cheap, and I'm not sure they have withstood a ransomware attack.
Says a million finance people that prefer armies of compliant muggles over a few self-important wizards.
It's difficult to do, yes. But it's difficult in a "dig a tunnel through that mountain" way, not in a "solve this mysterious math problem" way. It certainly could be done. It would just take time and money (money including hiring people).
People constantly point to the difficulty of backups and the difficulty of hardware-level separation between systems. But these are merely difficult and costly. "Never being hacked" and "always writing secure code" are impossible, so they can't be relied on in place of working backups.
And yes, companies would rather spend money on other things. That's what it comes down to.
I think OP meant to compare a known hard task vs. an unknown hard task. Both are hard, but the first is known to be possible to finish, while the other is not.
A widely accepted method for judging at least subjective risk, when there are insufficient facts for a truly objective measure, is to ask how much you would bet on various outcomes -- such as whether P=NP is solved before New York's 2nd Ave. subway is completed.
So, in a sense, I think that part captures the problem. People object to testing backups using the logic of "with our shitty processes, it will be dangerous". I suppose you have companies with a logic akin to "due to being cheap and arrogant, we have shitty processes" -> "due to shitty processes, our processes are fragile" -> "due to being fragile, we avoid anything that would stress our system" -> "due to our system not being able to take stress, we just pay the ransomware instead of stressing the system with a backup test".
It seems like we've reached the "throw-away enterprise" level. Build it cheap until it breaks, then walk away when the fix costs too much. There's a cost-benefit to this. Reminds me of bandcamp and other declining sites that just vanished one day with information some would consider valuable.
I’ve worked with 9-figure turnover entities broken by this sort of thing and the first recommendation is always fire or manage out the CIO, risk/audit officer and/or CFO.
Everyone cries about having no money. What is lacking is an ability to identify and manage risk that puts the existence of the company as a going concern at risk.
It's also one of the problems the cloud is great at causing. It's certainly possible to architect maintainable (and repeatable) infrastructure in the cloud. But it's just as easy (or easier) to deploy a mess of unorganized VMs that were launched and configured by whoever needed one.
That said, in this way, maybe a periodic "DR" that actually replaces the current operations would be a helpful...well, not test, but..."resilience practice" maybe? It could be a new twist on continuous deployment: continuous recovery.
You can't fix already-broken processes. VMware solved this 20 years ago. It is pretty simple to restore VMs on different systems; you don't need to worry about the ball of mud when you can duplicate it.
Copy AMI to separate AWS account, not in your Org, and keep keys to that account offline.
.... software/IT just hasn't had people willing to do that.
Some organizations do, some don't. Long-term valuation seems to agree with the former; short-term, the latter!
Maybe we should get into the habit of going all Shiva/Brahma on that environment every week or month. Burn it to the ground, and recreate it with an automated process. Sort of like a meta CI test.
No it isn’t easy, but it’s also not an impossible task.
Is it easier to test after the ransomware attack?
To do a solid test, you need to restore the system. It's difficult to restore a running system because, well, it's running.
That typically means a parallel environment. A single box represents a bunch of "magic values" that are stuffed in some config somewhere. Imagine several of those. "We need to restore SQL Server, the AD Server, the Application Server..." Reloading on a new environment is an easy way to find out those magic numbers, typically the hard way. Restoring on your existing hardware, with existing networks, existing IPs, etc. you're laying your software and configs over a known, working environment.
How do you test recovery of licensed software that's bound to the machine you have running in production, for example? "Oh, sorry, duplicate license detected" and it shuts down your production system for a license violation. "Sorry, we detect the wrong number of cores", or whatever other horrors modern licensing systems make you jump through. You DO have another dongle, right? (Do they still use dongles?)
It can be far easier to do an image restore to your already working system than trying to load it up on something else. Since your production box is horked anyway, an image restore should "just work". But testing it, that's another story completely.
If you're not testing your backups, then they're broken with close to 100% certainty.
Before the attack, all machines are active and being used. If you screw one of them up that happens to be running something obscure but vital, the screwup is your fault.
After the attack, all machines are dead so the screwup is blamed on somebody else. If you have the email/memo (you did print it out and file it, did you not?) showing that you informed the CTO/CIO, blame for the screwup will get buried.
Bare metal backups are a good place to start if you have a complicated system.
A lot of environments aren't as complicated, and you COULD just pull your hard drive out, put in a spare, and test how your backup works. At least, test it while airgapped to see if it comes up at all.
And honestly, why not double up your entire server fleet for a temporary build-from-scratch rehearsal? Many shops could quintuple their infrastructure costs and still stay well out of the red. Most modern software businesses don't derive economic value from how efficiently they convert hardware resources, but from how well they use human resources. Optimizing infra costs is not a main priority for any shop I've seen. I imagine not even for an IaaS provider these days.
Months later, we had a power outage (scheduled I think, I don't recall).
Anyway, at some point during the transition, I managed to have the machines hard-mount NFS exports from each other.
As long as one of the machines is up, everything is rosy. But a cold start? Both hung restoring the mounts, which they couldn't, because neither was "up".
Took us about 1/2 hr to suss out what was happening and get single user in to tweak it, but...yea...exciting!
"Smart" people do this kind of innocent stuff all the time.
That process works well till ransomware comes in and destroys every server and client machine at once, and suddenly you've just given the IT department multiple years' worth of work, all to be done at an emergency pace.
This is a non-issue with Git + Ansible, even without getting into tools like Terraform. At my dayjob, I set it up so that there's an Ansible playbook that performs around 200 administrative tasks for each of the servers in a particular environment; all of the configuration lives in Git repositories. Changes are applied by CI, and the whole process is thoroughly documented for local development, if necessary.
Everything from installing packages, creating or removing user accounts, setting up firewall rules, setting up directories and sending the necessary configuration, systemd services for legacy stuff, container clusters, container deployments, monitoring, container registries, analytics, APM and so on are handled this way.
No one has write access on the server itself (unless explicitly given in the playbook, or the admins), and even the servers themselves can be wiped and reinstalled with a newer OS version (mostly thanks to Docker, in which the apps reside), plus all of the changes are auditable, since they coincide with the Git repo history.
It took me maybe 2 weeks to get that up and running, and another 2 to handle all of the containerization and utility software aspects. I'm not even that knowledgeable or well paid; there are very few excuses for running setups that don't let you do things "the right way" in a somewhat easy manner, like Windows Server. That's like picking a hammer when you need to screw in a screw: it'll do the job, but chances are there will be serious drawbacks.
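The reason this style of management holds up under a rebuild-from-scratch scenario is that the tasks are declarative and idempotent: they describe desired state, so re-running the whole playbook against an already-configured (or freshly wiped) server converges it rather than breaking it. A toy Python sketch of that property (this is not Ansible itself; all names here are illustrative):

```python
# Toy illustration of declarative, idempotent configuration application,
# the property that makes "wipe and re-run the playbook" safe.

def ensure(state: dict, key: str, value) -> tuple[dict, bool]:
    """Bring `key` to `value`; report whether anything actually changed."""
    if state.get(key) == value:
        return state, False           # already converged: no-op
    return {**state, key: value}, True

server = {}
server, changed1 = ensure(server, "pkg:nginx", "installed")
server, changed2 = ensure(server, "pkg:nginx", "installed")  # rerun
print(changed1, changed2)  # the first run changes state, the rerun doesn't
```

Real Ansible modules report "changed" vs "ok" in exactly this spirit, which is what makes the playbook double as both documentation and disaster-recovery procedure.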
30 years ago mainframe companies started realizing that their mainframes couldn't restart anymore: after many years of uptime, all the on-the-fly configuration changes wouldn't be reapplied, and so the whole system couldn't restart. (All hardware had redundancies and backup power supplies, so any individual component could be replaced, and most had been over time.) So they started scheduling twice-a-year restarts to test that the whole system could come back up. The mainframe itself is fully able to run for years without a restart, but the configuration wasn't.
The primary problem you are protecting against is that backups are broken or corrupted. A broken backup (a failure to write) won't restore. A corrupted backup probably won't restore either, and even if it did, it would fail the MD5 checks. No one is expecting you to destroy and rebuild prod on a weekly basis.
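The corruption case is cheap to guard against mechanically: record a checksum when the backup is written, then re-verify it on a schedule, long before you ever need the restore. A minimal sketch (MD5 as mentioned above, which is fine for corruption detection, though not tamper-proof against an active attacker):

```python
import hashlib

def md5sum(path: str, chunk: int = 1 << 20) -> str:
    """Stream a file through MD5 so large backups don't need to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def verify(path: str, expected_hex: str) -> bool:
    """True if the backup file still matches the checksum taken at write time."""
    return md5sum(path) == expected_hex
```

This only proves the bytes are intact, not that they restore into a working system; it's a floor, not a substitute for restore tests.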
However, for a company of our size, it's not really possible. I was talking to my lead about this last year; I believe we have about 50TB for our entire system stored in our databases. All he said was "we have one, but hopefully I'm retired before needing to find out how to restore from it."
Honestly I'm not sure many large IT depts have it either, but for $1M+ the attackers can afford good customer service.
Keep offline, airgapped backups.
Lots of ransomware tends to be "delayed" – it can't encrypt everything instantly, so it encrypts a little bit each minute. During those minutes, isn't it conceivable that the now-encrypted files replace your old backups?
I suppose this isn't really a "backup" but rather a "mirroring strategy." But for certain kinds of data -- photos, video, large media -- can you really afford to do any other kind of strategy?
The other question I have is related to that first point: since ransomware can't encrypt everything all at once, how is it possible for your system to continue to function? As you can tell, I'm a ransomware noob, but it's quite interesting to me from an engineering standpoint. Does the system get into a "half encrypted" state where, if you rebooted it, it would fail to boot at all? Or does ransomware targeted at businesses tend to be more of a surgical strike, where it targets and wipes out specific datastores before anyone notices?
(It's the "before anyone notices" part that I'm especially curious about. Isn't there some kind of alarm that could be raised more or less instantly, because something detects unexpected binary blobs being created on your disk?)
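Such detectors do exist; one common heuristic is byte entropy. Encrypted output looks nearly random, close to 8 bits per byte, while most documents sit much lower, so a monitor can flag files whose entropy jumps after a write. A sketch of the core measurement (real products combine this with rename patterns and honeypot files, since compressed media like zips and JPEGs is also high-entropy and would otherwise cause false positives):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte: near 8.0 for encrypted/random data, lower for text."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

print(shannon_entropy(b"hello hello hello"))  # low: repetitive text
print(shannon_entropy(bytes(range(256))))     # 8.0: looks "encrypted"
```

A watcher could sample recently modified files and raise an alarm when too many of them cross a threshold in a short window.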
1. Use a COW (copy-on-write) filesystem like btrfs or ZFS
2. Set up snapshots to be taken periodically (hourly/daily) and sent to a different host or volume.
3. Monitor disk usage: if you get hit by a cryptolocker, your disk usage will approximately double as it rewrites all your files.
4. Manually backup snapshots or the full volume to offline storage every N days/weeks/months.
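Step 3 is easy to automate: record a baseline after each known-good backup, then compare current usage against it. A minimal sketch (the 1.5x threshold is illustrative; tune it for your workload):

```python
import shutil

def usage_alert(path: str, baseline_used: int, threshold: float = 1.5) -> bool:
    """True if used bytes grew past threshold x the recorded baseline.
    A cryptolocker rewriting files roughly doubles usage on a COW
    filesystem, because the snapshots pin the old (unencrypted) blocks."""
    used = shutil.disk_usage(path).used
    return used > baseline_used * threshold
```

Hook a check like this into cron or your monitoring system; the point is that the alert fires while the old snapshots are still intact.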
In case you missed it, I wrote this up a while back: https://photostructure.com/faq/how-do-i-safely-store-files/
TL;DR: Lots of copies keeps stuff safe!
My homelab, admittedly low complexity, uses a NAS device that powers on at a scheduled time, mounts the shares it needs to, runs the 'pull' backup, unmounts, and powers itself off.
The intention being that, in the event of an intrusion, its presence and accessibility are limited to the window of time in which it's performing the backup.
In addition, a rotating set of removable HDDs serves as backups of backups, which also get spread off-site among family members' houses.
I really should go into offering backup solutions to local small business...
I think all of that is lost if the cryptolocker just formats any volumes named Backup.
Copy-on-write can't be enforced by the filesystem alone. If this computer can permanently delete content on the backup system, then so can the locker.
Does btrfs or ZFS have a way to pull snapshots such that they are encrypted on the client side?
Ideally you could hire a third party to pull these backups from you, have them warn you when the process fails or the data doubles in size (the data is being encrypted), and still be able to prove that there's no way they can access the data. And then the private master key(s) go into a safe.
> But for certain kinds of data -- photos, video, large media -- can you really afford to do any other kind of strategy?
Yes. Make the backup system for that big slow-changing data a moderate amount bigger than the primary data store, and then you can have months of retention at low cost.
If too much data changes at once then it should go read-only and send out a barrage of alerts.
For home, I have two USB disks I use for backups and I alternate which I use. Neither is plugged in at the same time. At least one is always "cold".
For larger scale, you can do the same thing with tape. One tape backup goes off-site (perhaps) or at least cold.
The cost isn't that high. A USB spinning disk may cost a third to a fifth of what your SSD does. And you can get hard drives up to 18TB now. But even a portable 2TB USB-powered 2.5in external hard drive is only $60, so this is a cheap and robust strategy.
Why not have one of those drives off site, and rotate every so often? Carry the drive with you when you swap, so that the original and all backups are not in the same place at the same time.
I have three external drives. Originally I planned to keep two offsite, but I don't have an offsite office anymore.
With tape backup, you might keep a tape in cold storage for years.
The other strategy is tape rotation. You need about 30x the tape storage as you have online storage, so you can keep 7 yearly backups, 12 monthly, 6 weekly (all full backups), and 7-14 daily incremental backups.
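The ~30x figure falls straight out of that schedule, which is the classic grandfather-father-son rotation. Counting it out (assuming roughly one tape per full backup, with the dailies being small incrementals):

```python
# Grandfather-father-son rotation from the schedule above.
yearly, monthly, weekly = 7, 12, 6   # retained full backups
daily = 14                           # incrementals (7-14; take the max)

full_tapes = yearly + monthly + weekly   # 25 full copies of online storage
total_slots = full_tapes + daily         # 39 tapes in the rotation
print(full_tapes, total_slots)
# Since incrementals are small, capacity needed is ~25-30x online data.
```

The payoff of the deep retention is recovery from slow-burning damage: an error discovered months later can still be rolled back to a monthly or yearly full.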
So if the web server gets hacked, the hacker can only write to the bucket, but has no way to know what is already there or access anything in it.
Saving diffs/snapshots will solve the issue. As long as the file doesn't change the cost is almost 0.
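That near-zero cost for unchanged data is exactly how deduplicating backup tools (restic, borg, and friends) work: content is stored under its hash, so an unchanged file contributes nothing new to the store. A toy content-addressed store to illustrate the idea:

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical content is stored exactly once."""

    def __init__(self):
        self.blobs = {}        # content hash -> bytes
        self.snapshots = []    # each snapshot: {path: content hash}

    def snapshot(self, files: dict) -> None:
        manifest = {}
        for path, data in files.items():
            key = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(key, data)  # no-op if already stored
            manifest[path] = key
        self.snapshots.append(manifest)

store = DedupStore()
store.snapshot({"a.txt": b"unchanged", "b.txt": b"v1"})
store.snapshot({"a.txt": b"unchanged", "b.txt": b"v2"})
print(len(store.blobs))  # 3 blobs back 4 file versions across 2 snapshots
```

This also means a ransomware rewrite is loud in a dedup store: suddenly every "unchanged" file hashes differently and the store balloons, which is itself a useful alarm.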
I believe that I would detect unexpected binary blobs. Of course this depends on me writing the programs correctly, and a lot of other assumptions, but it might suggest a way to protect backups.
My backup "drive" is anything but passive.
One day, I asked if they ever had performed a recovery off of the tapes, as I questioned if the tapes were even being written to. (NOTE: Backups was not my job at all. )
Why had I brought this up? I would be in the server room and never saw the blinky lights on the tape...well...blink. Everyone literally laughed at me, thought I was a grade A moron.
A year later, servers died... Popped in the tape... Blank. No worries, they had thousands more of these tapes. Sadly, they were all MT. They had to ship hard drives to a recovery shop, and it was rather expensive.
I left shortly after this.
A note for anyone else in a similar situation - a good team doesn't ridicule someone for questions like these. A responsible leader should have cited a time in the past that they did a restore or a spot check, and no one should have laughed. The laughter sounds like masked fear or embarrassment.
This goes for any team. "How do we know this function of our job does what we think it does?" You should have an answer. Now, I've only worked in R&D software and not in IT. But IMO IT teams should work the same way in this regard.
A good team won't ridicule any questions. If you're on a team that ridicules your questions, that's a huge red flag. Get out as soon as possible!
... or assigned the engineer asking questions the task of figuring it out!
Modern tape drives (like LTO) will at least do a read after write so you should never end up with blank tapes after a backup. But still no excuse not to do restore tests.
And make sure you're not storing your backup decryption key in the same backups that are encrypted with that key. Likewise, make sure you're doing restore tests on a "cold" system that doesn't already have that decryption key (or other magic settings) loaded, otherwise you may find out in a disaster that your decryption key is inaccessible.
> as I questioned if the tapes were even being written to.
> I would be in the server room and never saw the blinky lights on the tape...well.. blink.
All the scenarios in this article are great to think about (recovery time, key location, required tools), but it's still all at the backup design phase. The headline of "test your backups" seems misleading: you need to design all these things in before you even try to test them.
It seems like a real problem here is simply that backup strategies were often designed before Bitcoin ransomware became prevalent, and execs have been told "we have backups" without probing deeper into whether they're the right kind of backup.
In other words, there's no such single thing as "having backups", but rather different types of backup+recovery scenarios that either are or aren't covered. (And then yes, test them.)
Also the article doesn't seem to consider the fact that some hackers are now threatening release, not just destruction. Embarrassing emails, source code, and trade secrets. Backups won't help at all.
When I worked for a large insurance firm, we would run drills every 6 months to perform off-site disaster recovery and operational recovery tests to validate our recovery processes. Everything was tested: WAN links, domain controllers, file backups, mainframe recovery, and so much more. We were more or less ready for a nuke to drop.
Obviously this costs money, but if you're an insurance firm, not being able to recover would cost way more than running DR and OR recovery drills every 6-12 months.
Exactly. It's that mentality which drove me to small-scale contract IT work for smaller "mom and pop" organizations. Give them a fair price and do good work and most of them are happy to have your services, treat you with respect, and are often more than happy to trade knowledge and services for equivalent exchange of same. This can lead to much "win /win/everybody wins!"
And if you take a contract that plays out in an unsatisfactory way, it's easy to simply turn down further contracts from the one problematic customer. More time to give your loyal customers, or hunt down a better customer to replace the bad one. ;)
Inexperienced, will buy anything that sounds good and trustworthy, no matter whether snakeoil or real deal, because they don't know better. That is most mom&pop shops.
Burned, will buy/do nothing, because when they were inexperienced they were sold/told crap. Now they trust no-one and also think they can save money.
Experienced, when they had a real disaster in the burned stage, recognized their lack of proper tools and manpower as a reason. Now they try to evaluate suggestions properly through inhouse expertise. Only possible if large enough.
Since switching to contract IT work and coming in much more direct contact with "mom & pop" shops than I did in prior years, I've come to realize that most "mom & pop" shops are far more business savvy than they're often given credit for. They mostly just don't have access to any sort of fair and reasonably priced IT folk who ain't tryin' to scam them outta house and home.
I've found that by offering that fair price and quality work, I can gain a level of loyalty that results in me not even needing to advertise my services to have more than enough work and profit to keep me goin' and happy with my career choice. "Word of mouth" is by far the best advertising you could ever ask for anyhow… Nothin' beats trust for generating "brand loyalty" and return business.
> Experienced, when they had a real disaster in the burned stage, recognized their lack of proper tools and manpower as a reason. Now they try to evaluate suggestions properly through inhouse expertise. Only possible if large enough.
I've come across these folk as well. They also tend to be able to recognize instantly when they're not bein' taken advantage of. This type has always been a good loyal customer type worth putting in a bit of extra effort for, too. Having been "burned" before, they recognize the value of payin' a fair price to an honest hardworkin' tech.
> Burned, will buy/do nothing, because when they were inexperienced they were sold/told crap. Now they trust no-one and also think they can save money.
The saddest example of the three, because they'll continue to suffer because their trust had been abused.
I'm now convinced most people are overworked and most SWE projects are overcommitting. I mean, I'm currently solely responsible for two codebases of nearly 300K LOC total, rebuilding the one into the other. At my previous jobs this would involve a fully staffed team of 4+ engineers, a tester, a product owner, etc., and they could probably use more.
Nah, it gets way outperformed by the "too big to fail bailout-monkey" ETF.
Unfortunately you need political connections to know the composition of that ETF.
Some of my customers have thousands of VMs in their cloud, and they aren't cloned cattle! They're pets. Thousands upon thousands of named pets. Each with their individual, special recovery requirements. This then has a nice thick crust of PaaS and SaaS layered on top in a tangle of interdependencies that no human can unravel.
Some resources were built using ARM templates. Some with PowerShell scripts. Some with Terraform. A handful with Bicep. Most with click-ops. These are kept in any one of a dozen source control systems, and deployed mostly manually by some consultant that has quit his consulting company and can't be reached.
Most cloud vendors "solve" this by providing snapshots of virtual machines as a backup product.
Congratulations big vendors! We can now recover exactly one type of resource out of hundreds of IaaS, PaaS, and SaaS offerings. Well done.
For everything else:
WARNING: THIS ACTION IS IRREVERSIBLE!
Okay, you got things restored! Good job! Except now your DNS Zones have picked a different random pool of name servers and are inaccessible for days. Your PaaS systems are now on different random IP addresses and no one can access them because legacy firewall systems don't like the cloud. All your managed identities have reset their GUIDs and lost their role assignments. The dynamically assigned NIC IP addresses have been scrambled. Your certificates have evaporated.
"But, but, the cloud is redundant! And replicated!" you're just itching to say.
Repeat after me:
A synchronous replica is not a backup.
A synchronous replica is not a backup.
A synchronous replica is not a backup.
Do you know what it takes to obliterate a cloud-only business, permanently and irreparably?
I won't repeat them here, because like Voldemort's name it simply invites trouble to speak them out loud.
The whole post should be printed out and pinned to the Kanban/Scrum/whatever board of every infrastructure/DevOps team, but this sentence in particular. This property of Azure (and I imagine every other cloud provider) was one of the nastier fights we had with the guys who run the on-prem firewalls.
I never quite understood why spending more money is better if it comes from a different bucket. I'm sure there's some explanation that only makes sense if you don't look too closely.
I'm not sure how that really benefits an organization with a time horizon that is longer than the time it takes to depreciate a server. You still get to write off the full price of a server, it just takes a few years longer (and it was probably cheaper!). But then again, I'm not in finance...
A stack is spawned from a database backup and once it passes tests, replaces the previous one.
Not sure how smart this all is but my goal is to learn through application.
In order to not lose data, you can't have any writes between the time the backup was taken and the present, or you need code that reconciles additional state and applies it on top of the backup before switching over.
Normally, backup restoration is done during a maintenance window where the site is disabled so no writes can happen, and then usually a window of writes are lost anyway (i.e. 'last X hours, since the backup was taken')
For your use-case, do you just have very few writes? Do you lose writes? Do you have some other clever strategy to deal with it?
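The "reconcile additional state" option usually amounts to replaying a write-ahead log on top of the restored snapshot, the way database point-in-time recovery works. A toy sketch, assuming the writes since the backup were captured somewhere the attacker couldn't reach:

```python
# Toy point-in-time recovery: restore a snapshot, then replay logged writes.
# The WAL entries are (timestamp, key, value); only those newer than the
# snapshot are applied, in timestamp order.

def restore(snapshot: dict, wal: list, backup_time: float) -> dict:
    state = dict(snapshot)
    for ts, key, value in sorted(wal):
        if ts > backup_time:
            state[key] = value
    return state

snapshot = {"balance": 100}   # taken at t=10
wal = [(5, "balance", 100), (12, "balance", 90), (15, "note", "paid")]
print(restore(snapshot, wal, backup_time=10))
```

The hard part in practice isn't the replay, it's keeping the log itself out of reach of whatever destroyed the primary, which is why databases ship WALs to separate storage continuously.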
A typical bank / credit union may only serve one town. As such, it would be socially acceptable to designate 3am to 4am as a regular maintenance window where services are shut down.
I like this approach, although risky if you mean you routinely replace the production db.
My preferred setup is to automate restores into a pre-prod environment, apply data masking and run tests there. It's not a replacement for full DR exercises, but at least it automates the backup verification process as part of your build system.
The stack replaces the previous one or the backup replaces the previous one? While having a single backup is a good start, you might want to consider keeping several backups so you can restore from, say, a data entry error that you discover two months after it happens.
Database images are immutable and a history of them are kept.
Three copies of your data. Two "local" but on different mediums (disk/tape, disk/object storage), and at least one copy offsite.
Then yes, absolutely perform a recovery and see how long it takes. RTOs need to be really low. Recovering from object storage is going to take at least an order of magnitude more time than on-prem.
Also, storage snapshots/replications are not backups, stop using them as such. Replicating is good for instant failover, but if your environment is hacked they are probably going to be destroyed as well.
Your manager should be able to write a similar email.
I get it doesn't make sense, but that's corporate America for you.
That said, be careful of the gift card route. Depending on the amount, you can find yourself on the wrong side of the IRS that way.
> Whether an item or service is de minimis depends on all the facts and circumstances. In addition, if a benefit is too large to be considered de minimis, the entire value of the benefit is taxable to the employee, not just the excess over a designated de minimis amount. The IRS has ruled previously in a particular case that items with a value exceeding $100 could not be considered de minimis, even under unusual circumstances.
Which about matches with what I've seen at BigCo.
$40 box of tools as a gift? Did not show up on my paycheck.
$150 electronic device as a gift? Showed up on my paycheck.
In the past few years, guidance has shifted toward accounting for employer-provided food with employee income as well.
2-3 year email retention on corp email.
Paper files for sensitive client info (or don't keep it).
We can reinstall Office / Windows / Active Directory etc.
Mandatory 2FA on google suite?
Git codebases on github etc for LOB apps (we can rebuild and redeploy).
We use the lock features in S3 for copies of data that must be kept. Not sure I can even unlock to delete as account owner without waiting for timeouts.
I work at a big company that does this, but with 6 months. While I understand why they would do that, often some knowledge is lost with these emails. And usually I don't know what's knowledge and what's junk before I need it.
On the other hand, it's a good way to make sure that your processes are written somewhere and people don't rely too much on email archiving. Sadly, that's something I didn't realize until it was too late.
Most of the remaining suggestions aren't relevant to ransomware, even if they're otherwise mostly fine recommendations. 2FA won't stop ransomware or data destruction. Redeploying code and reinstalling Active Directory doesn't restore customer or user databases. Paper files are not accessible when they're needed, are easily damaged or misplaced, and cost a lot to store and index (if you're keeping them as backups, then yes, you're making a form of the post's argument, just in a very expensive and inconvenient way). Read-only S3 copies almost certainly fall into the realm of backups... but that's also a relatively expensive way to do it for most organizations larger than a start-up.
Offline and offsite backups are the cheapest, most effective tool for keeping copies of your data in your company's possession through unforeseen circumstances, and they protect against a huge number of potential events beyond just ransomware. It's negligent, IMO, for executive officers of a company not to have invested in a solid, tested, and comprehensive backup solution.
Attacks involving credential stealing almost never involve malware.
Source: The 2020 Microsoft Digital Defense Report https://www.microsoft.com/en-us/security/business/security-i...
The S3 options around this work well with object lock or similar.
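For reference, a hedged sketch of writing a locked object with boto3. The bucket and key names are illustrative, the bucket must have been created with Object Lock enabled, and `boto3` is imported lazily so the sketch stands alone:

```python
from datetime import datetime, timedelta, timezone

def retain_until(days: int) -> datetime:
    """Retention deadline in UTC, as Object Lock requires."""
    return datetime.now(timezone.utc) + timedelta(days=days)

def put_locked(bucket: str, key: str, body: bytes, days: int = 90):
    """Write an object that cannot be deleted or overwritten until the
    retention date passes. COMPLIANCE mode means nobody, including the
    root account, can shorten the retention or delete the version."""
    import boto3  # lazy import: the rest of the sketch runs without AWS
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ObjectLockMode="COMPLIANCE",
        ObjectLockRetainUntilDate=retain_until(days),
    )
```

GOVERNANCE mode is the softer alternative: privileged principals can still bypass it, which matters for the root-compromise scenario discussed below.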
I remember the old-school tape methods (we had a rotation of who took the tape home). This was truly offline.
It won’t stop a root account compromise unless you’ve got a multi-account setup going (as they could edit the bucket policy).
But if you’ve not got any monitoring, they could also just remove the lock on the objects without you noticing and wait for the timeout themselves.
What are the hackers gonna do? Sue you?