I've seen several organisations decide they need to encrypt backups. The first thing I ask is "where is that key saved" and the answer has always been "in a file, written to those backups". It's amazing how many times "encryption" leaves someone one ransomware away from losing everything.
Print the keys and give them to the CFO to store as an asset; they have the systems in place to keep paper around for decades.
I see no reason for this; whoever heads the department responsible for disaster recovery should just stick a backup printout of the keys in a local safe deposit box and get the right names on the account.
Our experiences must differ substantially here. I've worked at a few startups over the past two decades, and I would not trust any of the finance people from any of those companies to practice good operational security or be reliably available as a cog in a critical real-time operational path.
They were consistently the kinds of people who would treat a locked filing cabinet behind a locked office door as the height of security, and in many cases left important private documents, like checks revealing salaries, completely insecure, face down on recipients' unattended desks.
They have often been the staff least reliably in the office, and least accessible at any given moment - especially after hours or on weekends.
In my view it's just exposing the key to more risk of getting left out or forgotten in a poorly secured, often unattended filing cabinet behind trivially picked locks, just to make it available through someone frequently absent on vacation. Good thing the security will likely suck, because you'll probably have to pick those locks yourself when you need the copy and the CFO is skiing down a mountain.
This is an issue for the department in charge of performing backups and restores. They will already have access to the key to do the job, and they need to be accessible in times of crisis. Why wouldn't the COO be the right person if operations is in charge of disaster recovery? It's their ass on the line when they can't do a restore on the weekend. The CFO and the entire finance department have zero expectation of being available after hours or on holidays.
Not as a day to day occurrence. But as a last line of defence, absolutely. No other department has better routines in place to keep papers safe and secure for years through rounds of layoffs, restructurings, buyouts and everything else a company might go through.
It's easy to read, unlike a DVD (you need a DVD drive).
It's also cheap.
Why make things complex when they could be simple?
There are Sony BD-XL 128 GB discs that claim a 50-year shelf life, though I'm not sure whether they're suitable for archives. As long as they are regularly tested, they could be decent enough for cold storage.
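The "regularly tested" part can be as simple as re-hashing everything on the disc against a manifest written at burn time. A minimal sketch in Python; the mount point and manifest format are assumptions, not anything Sony ships:

    # Verify a cold-storage disc against a SHA-256 manifest written
    # at burn time. Paths and manifest layout are made up for
    # illustration: each manifest line is "<hexdigest>  <relative path>".
    import hashlib, sys
    from pathlib import Path

    DISC = Path("/mnt/bdxl")             # assumed mount point of the disc
    MANIFEST = DISC / "manifest.sha256"  # assumed manifest location

    def sha256(path: Path) -> str:
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    failures = 0
    for line in MANIFEST.read_text().splitlines():
        digest, name = line.split(maxsplit=1)
        if sha256(DISC / name) != digest:
            print(f"BITROT: {name}")
            failures += 1
    sys.exit(1 if failures else 0)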
>It's Oracle cloud - you can't really test them without restoring a snapshot (overwriting your current data)
Jesus, if that's true, what a shit product. AWS RDS doesn't even let you restore a backup to an already running database; it always creates a new one.
Our accounting software offers end-to-end encryption, but it's not enabled by default. When the user tries to enable it they see a bold message saying, "Warning: In this mode we cannot reset your password. If you forget your password you will lose access to your data. Are you sure you want to continue?"
LOTS of people don’t have backups.
You could say if that happens it's a badly designed test, and I agree. But it goes to show that testing entire infrastructures partly run by service providers is not easy.
Wouldn't performing a dry run with a checklist of actions have kept this from happening, or at least alerted them to the deficiencies?
This is the world we’ve got today.
This company was screwed as soon as they did business with Oracle. A company whose entire business model is making things as inscrutable as possible so you'll pay them more money in support contracts.
So the question is: why did the company not think of testing their backups, and generally doing a full validation of the system, before committing to it for months and millions of dollars in business? What sort of workflows, processes and culture in the company allowed this to happen? Probably the cheap sort.
This is in fact the very thing that a good agile devops shop will prevent by not relying on Oracle to handle it for them. That's why the comment I was replying to looked kneejerk and not the result of having read the article to me.
The point where you add encryption and fuck up the keys is the point where you can still recover and try again, instead of waiting for days, weeks or months like these guys. You check it immediately: if you didn't get a copy of the key, or it's invalid, that's OK - your old unencrypted backup isn't out of date yet.
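To make that concrete, here's a minimal sketch of such an immediate check: prove that the *escrowed* copy of the key (not the one in memory) can decrypt a sentinel file pulled from the brand-new encrypted backup. It uses Fernet from the Python `cryptography` package purely as a stand-in for whatever the real backup encryption is; all paths and helper names are made up:

    from cryptography.fernet import Fernet, InvalidToken

    def verify_escrowed_key(escrow_path: str, encrypted_sentinel: bytes,
                            expected: bytes) -> bool:
        with open(escrow_path, "rb") as f:
            key = f.read().strip()
        if not key:                       # the "empty wallet" case
            return False
        try:
            return Fernet(key).decrypt(encrypted_sentinel) == expected
        except (InvalidToken, ValueError):
            return False

    # Example wiring (names hypothetical):
    # if not verify_escrowed_key("/offline/escrow/backup.key",
    #                            fetch_from_backup("sentinel.enc"),
    #                            b"restore-me"):
    #     raise SystemExit("Do NOT rotate out the unencrypted backups yet.")

If that check fails on day one, you've lost nothing; if it would have failed for months, you've found out while the old backups still exist.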
I have no idea what key size or speed is practical. But it's hardly impossible to just dictionary-attack using your old RAM as the "list".
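For illustration, here's the brute-force version of that idea: slide a window over a memory dump and try every candidate as an AES-256 key against one known ciphertext/plaintext block pair. Real tools like aeskeyfind are much smarter (they look for AES key-schedule patterns instead of trying every offset); the file names, step size and known block pair here are all assumptions:

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def scan_dump(dump_path, ct_block, pt_block, key_len=32, step=4):
        # Treat every (step-aligned) window of the dump as a key candidate.
        data = open(dump_path, "rb").read()
        for off in range(0, len(data) - key_len, step):
            key = data[off:off + key_len]
            d = Cipher(algorithms.AES(key), modes.ECB()).decryptor()
            if d.update(ct_block) + d.finalize() == pt_block:
                return off, key           # found a key that decrypts the block
        return None

    # hit = scan_dump("old-ram.img", known_ciphertext16, known_plaintext16)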
While it's not always possible, it's a good idea to use at least two completely different backup methods, avoiding single points of failure like the one in this case. For example, use a logical volume manager to make frozen copies of your filesystems, and back those up using something low level.
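A rough sketch of the block-level half of that, assuming a volume group `vg0` with a logical volume `data` (both names invented); your normal file-level backup tool runs separately as the second, independent method:

    import subprocess

    def block_level_backup(dest: str) -> None:
        # Freeze a point-in-time snapshot of the volume.
        subprocess.run(["lvcreate", "--snapshot", "--size", "5G",
                        "--name", "data_snap", "/dev/vg0/data"], check=True)
        try:
            # Stream the snapshot out at the block level.
            with open(dest, "wb") as out:
                subprocess.run(["dd", "if=/dev/vg0/data_snap", "bs=4M"],
                               stdout=out, check=True)
        finally:
            # Drop the snapshot whether or not the copy succeeded.
            subprocess.run(["lvremove", "-f", "/dev/vg0/data_snap"], check=True)

    # block_level_backup("/backup/data_snap.img")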
This is a system administration failure, not an Oracle failure. Basically they didn't test their recovery strategy.
That's the Oracle failure. "Enabling Oracle encryption according to their instructions, failed to do what it should, and we lost access to all our servers and all our backups because of it".
There's also a separate administration failure on the client side: not practising working with encryption before using it in production, and not testing recovery, which would have revealed the empty wallet and the unusable backups before it became a disaster. But the fact that it did become a disaster is a failure of Oracle's system.
From one of the OP's comments, "Oracle support told him, step by step, to encrypt the database, which happened ages ago and is the root of all of this."
That wallet is the responsibility of whoever created it. Just because it's empty now doesn't mean it has always been empty. Beyond that, what if the person who initially encrypted everything with TDE used a local-only wallet? That would explain why the ewallet.p12 file is empty. Oracle recommends that you store your master key in the ewallet.p12 file and not in cwallet.sso, which is the auto-open wallet, valid only on the workstation it was created on.
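If nothing else, that failure mode is cheap to monitor for. A hypothetical check (the wallet path is an assumption; it varies by installation): alert if the wallet file is missing, zero-length, or doesn't even begin like a DER-encoded PKCS#12 blob, which starts with 0x30, the ASN.1 SEQUENCE tag.

    import os, sys

    WALLET = "/opt/oracle/admin/ORCL/wallet/ewallet.p12"  # assumed path

    def wallet_looks_sane(path: str) -> bool:
        if not os.path.exists(path) or os.path.getsize(path) == 0:
            return False
        with open(path, "rb") as f:
            return f.read(1) == b"\x30"   # DER SEQUENCE tag

    if not wallet_looks_sane(WALLET):
        sys.exit("ALERT: TDE wallet missing or empty - backups may be unrecoverable")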
There could have been multiple failures here with multiple parties and not just Oracle.
It's not an on-prem setup where the customer has full control to manage/test backups, it's at least some level of managed service from Oracle.
So at the least, the commands used should have offered or enforced creation of a key backup as part of the encryption process (the way BitLocker does for Windows disk encryption), to reduce the risk of something like this happening.
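What that enforcement could look like, as a hedged sketch: refuse to start encrypting until an escrow copy of the key has been written and verified by reading it back. Everything here is invented glue; the real fix would live in Oracle's own tooling, not a script.

    import os

    def escrow_then_encrypt(key: bytes, escrow_path: str, encrypt) -> None:
        # 1. Write the key to the escrow location and force it to disk.
        with open(escrow_path, "wb") as f:
            f.write(key)
            f.flush()
            os.fsync(f.fileno())
        # 2. Read it back and verify byte-for-byte before proceeding.
        with open(escrow_path, "rb") as f:
            if f.read() != key:
                raise RuntimeError("escrow verification failed; refusing to encrypt")
        # 3. Only now does encryption start. `encrypt` is a stand-in
        #    callback for whatever actually turns encryption on.
        encrypt(key)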
This is way more than enough to turn this over to lawyers and let the court decide who is to blame.
Yes, the company could have checked their recovery strategy, but if Oracle claims the key gets written to disk and it wasn't, there is a good chance of at least some compensation from Oracle.
But this being a cloud service, it would be odd if the encryption key were available without any interaction by the owners.
If this were something they ran locally, then it'd probably be their fault. But this is a hosted solution with a major bug. I hope some Oracle people are searching through to see how many other customers have this empty-wallet bug, and fixing it now.
They could generate a new key + wallet, which could probably then be made to work for the running instance.
All the backups, though, would probably need to be opened with the current (missing) key+wallet in order to re-encrypt them.
Might be doable with the current key+wallet that's in memory, but it also might not be. E.g. they might not be able to change the software on the running system, in which case ouch... places that need historical backups (e.g. for legal purposes) would be in a bad place.