Hacker News new | past | comments | ask | show | jobs | submit login
The day I wiped a production database table (pboyd.io)
25 points by 0x54MUR41 on June 5, 2022 | hide | past | favorite | 9 comments



A few things I like to do to avoid this:

- read only connections / users in general

- append _prod / _stage to the database name to avoid ambiguity (and ensure psql is configured to show the database name at the prompt)

- configure editors to color code sessions where possible (red for prod, amber for stage and green for dev/local)

- if a manual data modification is necessary, use a transaction and have commands prewritten to verify the intended outcome

In terms of underlying infrastructure I also very much appreciate deletion protection in terraform providers for any data stores, forcing a two phase destruction (one plan/apply to disable the protection and a second to actually destroy)


I did something similar about ~18 months ago; brought up the wrong window and I deleted a table at the bottom of the relationship hierarchy (via the SQL Server Management Studio GUI no less, so I had to click through). Half a second later I realized I was on the wrong server and I tried to cancel the operation. My stomach churned. It did take me a moment to snap out of things (this table had like 200 million rows of ephemeral data that was scraped from the web), but I came to the conclusion that the data is either completely gone forever, or I can talk to co-workers to see what can be done.

Someone suggested we restore the database from its last backup (which we do daily with a few weeks retention), then do a "SELECT INTO FROM" to re-fill the table. We lost a couple hours of data and were soon back online.

I am sure there are more nuggets to take away from this story, but I took: always maintain a robust set of backups and let people know about the problem so you can at least try to work out a solution.


For me and many others, I'm sure this brings back the sweaty palms from our own version of this!


Resizing a partition on a production Nagios server back in the day.

I realized that Fdisk wasn’t using the units my muscle memory had memorized, but not until I rebooted and everything was dead.

My manager and I decided we’d record it as an unplanned backup restore test and got to check some compliance boxes.

Yay backups! The lesson stuck and I got to learn it without consequences! Many are not that lucky.


Very much so.

My own story would be ... "The day I wiped a production database .. completely .. on Christmas eve."


Yup clicking drop table in phpMyAdmin ~15year ago because the button was next to optimize, after a party-night.

BTW: The table was from an insurance company and ~all sent in accidents from that day where deleted, around 500 "customers"...sweaty palms is an understatement.


A while ago a small company I was working for was acquired by a huge national company. Through "synergy" I ended up in working in BI doing ETL from the systems we used into their DB2 warehouses. When there were errors, we'd manually go through the error log and manually insert the erroneous rows. During this transition I was given a set of credentials that, apparently were used in the actual, daily production job, without me being aware. Well, I managed to lock that account out with 3 bad password attempts. It was a huge ordeal; everyone of my director and VP were reprimanded. Noone publicly blamed me because it was, obviously, stupid to run production jobs with credentials you give to some guy watching ETL that day, but I was still quite embarrassed.


This could happen to almost everyone and maybe its time to review that backup-restore process to take into account restoring individual tables.


Absolutely. Having done this blind and also with a well documented process I can say the latter is far smoother.

The restore process should be written in an incident response run book and practiced by the on-call people. (Maybe in your quarterly disaster recovery tests?) The terror of data loss fades significantly when you come across a clean, one page restore process and you’re able to get the data back without a lot of effort.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: